Description Usage Arguments Details Value Author(s) References Examples
Survgini is a nonparametric test for the equality of two survival distributions from the point of view of concentration. Two alternative tests are available, the asymptotic Gini test and the permutation test; the latter should be preferred to the asymptotic test in the case of unbalanced and small groups. The function Survgini compares different nonparametric tests (asymptotic Gini test, permutation Gini test, log-rank test, Gray-Tsiatis test and Wilcoxon test) and computes their p-values.
1 2 |
asymptotic |
a logical value indicating whether the asymptotic Gini test should be computed; the default is TRUE. |
permutation |
a logical value indicating whether the permutation Gini test should be computed; the default is TRUE. |
M |
the number of replications of permutation sampling; default is 500. |
linearRank |
a logical value indicating whether the linear rank tests (log-rank test, Gray-Tsiatis test and Wilcoxon test) should be computed; the default is TRUE. |
lastEvent |
represents the longest follow-up time until which we integrate the restricted Gini index. Choosing lastEvent =1 means integrating the restricted Gini statistic until the last event non-censored, while lastEvent =0 means integrating the restricted Gini statistic until the last observation censored or not. |
datasetOrig |
it must be a 3-columns dataset. The first column refers to the time-to-event, the second column is a censor indicator (=0 if the observation is right-censored, =1 if not), the third column refers to the treatment group (=1 for the first group, and =2 for the second group). |
The Gini index is one of the most common statistical indices employed in social sciences for measuring concentration in the distribution of a positive random variable (typically, income). This concentration index may be useful in detecting differences in heterogeneity also between the survival distributions of two groups of individuals to whom different treatments have been administered, thus providing a distinct assessment of such difference compared to what is provided by traditional tests.
In Bonetti, Gigliarano, and Muliere (2009) a nonparametric test has been proposed based on the Gini index for testing the equality of two survival distributions from the point of view of concentration. The authors have derived the asymptotic distribution of the test statistics and performed a simulation analysis, in which they compared the Gini test with other tests for the difference between two survival distributions, such as the log-rank, Wilcoxon, and Gray-Tsiatis tests.
Gigliarano and Bonetti (2011) have compared the asymptotic Gini test with a permutation test, suggesting that the permutation test should be preferred to the asymptotic test in the case of unbalanced and small groups.
The function Survgini compares the Gini test based on asymptotic and on permutation inference with three other tests (log-rank, Wilcoxon, and Gray-Tsiatis) and computes their p-values.
The function Survgini prints a list containing the following components:
pGiniAs |
the p-value of the asymptotic Gini test (if the argument asymptotic=TRUE is chosen) |
pGiniPerm |
the p-value of the permutation Gini test (if the argument permutation=TRUE is chosen) |
pGT |
the p-value of the Gray-Tsiatis test (if the argument linearRank=TRUE is chosen) |
pLR |
the p-value of the Log-Rank test (if the argument linearRank=TRUE is chosen) |
pW |
the p-value of the Wilcoxon test (if the argument linearRank=TRUE is chosen) |
Chiara Gigliarano c.gigliarano@univpm.it, Marco Bonetti marco.bonetti@unibocconi.it. Special thanks to Wai-ki Yip for his helpful assistance in the preparation of the package.
Bonetti, M., Gigliarano, C. and Muliere, P. (2009). The Gini concentration test for survival data. Lifetime Data Analysis, 15, 493-518.
Gigliarano, C. and Bonetti, M. (2011). The Gini test for survival data in presence of small and unbalanced groups. In press, Biomedical Statistics and Clinical epidemiology.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 | # Use the dataset aml on survival in patients with Acute Myelogenous Leukemia.
# The dataset contains the following variables:
# time: survival or censoring time,
# status: censoring status,
# x: maintenance chemotherapy given.
library(survival)
attach(aml)
# We first make data compatible with the SurvGini:
N<-length(aml[,1])
aml2<-matrix(nrow=N, ncol=3)
aml2[,1]<-time
aml2[,2]<-status
for (i in 1:N){
if (aml[i,3]=="Maintained") aml2[i,3]<-1 else aml2[i,3]<-2
}
Survgini(asymptotic=TRUE, permutation=TRUE, M=500, lastEvent=1, datasetOrig=aml2)
## The function is currently defined as
function(asymptotic=TRUE, permutation=TRUE, M=500, linearRank=TRUE, lastEvent=1, datasetOrig){
library(survival)
VarGinicensor<-function(data, Tmax){
data1<-data.frame(data)
info<- survfit(Surv(data1[,1], data1[,2])~1 , type='kaplan-meier', data=data1)
S<-matrix(info$surv)
T<-matrix(info$time)
indices <- sum(1*(T<=Tmax))
if (indices==0){
var<-0
return(var=var)
}
else{
Vt<-matrix(ncol=1,nrow=indices)
Vt[1]<-1*T[1]
if(indices>=2){
for (i in 2:indices){
Vt[i]<-Vt[i-1]+((S[i-1])^2)*(T[i]-T[i-1])
}
}
lastpiecesVt<-((S[indices])^2)*(Tmax-T[indices])
VtMax<-Vt[indices]+lastpiecesVt
Wt<-matrix(ncol=1,nrow=indices)
Wt[1]<-1*T[1]
if(indices>=2){
for (i in 2:indices){
Wt[i]<-Wt[i-1]+S[i-1]*(T[i]-T[i-1])
}
}
lastpiecesWt<-(S[indices])*(Tmax-T[indices])
WtMax<-Wt[indices]+lastpiecesWt
mu2<-matrix(ncol=1,nrow=indices)
mu2<-VtMax-Vt
mu<-matrix(ncol=1,nrow=indices)
mu<-WtMax-Wt
n<-length(data[,1])
event<-matrix( ncol=1, nrow=indices)
atrisk<-matrix(ncol=1, nrow=indices)
for (i in 1:indices){
event[i]<-info$n.event[i]
atrisk[i]<-info$n.risk[i]
}
dsigma<-matrix(0, ncol=1,nrow=indices)
for (i in 1:indices){
if (atrisk[i]>0)
dsigma[i]<-(n*event[i])/(atrisk[i]^2)
}
varistant<-matrix(ncol=1, nrow=indices)
for (i in 1:indices){
varistant[i]<-(4*exp(2*log(mu2[i])-2*log(WtMax))+exp(2*log(mu[i])+
2*log(VtMax)-4*log(WtMax))-4*exp(log(mu[i])+log(mu2[i])+log(VtMax)-
3*log(WtMax)))*(dsigma[i])
}
var<-matrix(ncol=1,nrow=indices)
var[1]<-varistant[1]
if(indices>=2){
for (i in 2:indices){
var[i]<-var[i-1]+varistant[i]
}
}
return(var[indices]/n)
}
}
Gcensor2<-function(data,Tmax=max(data[,1])){
data1 <- data.frame(data)
info <- survfit(Surv(data[,1], data[,2])~1 , type='kaplan-meier', data=data1)
K<-length(info$time)
S<-matrix(info$surv, ncol=1, nrow=K)
num<-matrix(ncol=1,nrow=K)
T<-matrix(info$time,ncol=1, nrow=K)
num[1]<-1*T[1]
if (K>=2){
for (i in 2:K)
num[i]<-num[i-1]+((S[i-1])^2)*(T[i]-T[i-1])
}
den <- matrix(ncol=1,nrow=K)
den[1] <- 1*T[1]
if (K>=2){
for (i in 2:K)
den[i]<-den[i-1]+S[i-1]*(T[i]-T[i-1])
}
G <- 1-(num/den)
indices <- sum(1*(T<Tmax))
lastpiecesnum <- ((S[indices])^2)*(Tmax-T[indices])
lastpiecesden <- (S[indices])*(Tmax-T[indices])
GTmax <- 1-(num[indices]+lastpiecesnum)/(den[indices]+lastpiecesden)
return(list( G=G,info=info,GTmax=GTmax))
}
X <- datasetOrig[,1]
delta <- datasetOrig[,2]
Tx <- datasetOrig[,3]
dataset1Orig <- datasetOrig[Tx==1,]
dataset2Orig <- datasetOrig[Tx==2,]
N<-length(datasetOrig[,1])
N1<-length(dataset1Orig[,1])
N2<-length(dataset2Orig[,1])
if (lastEvent==1) A1<-dataset1Orig[,1]*1*(dataset1Orig[,2]==1) else A1<-dataset1Orig[,1]
if (lastEvent==1) A2<-dataset2Orig[,1]*1*(dataset2Orig[,2]==1) else A2<-dataset2Orig[,1]
Tmax1<-max(A1)+0.001
Tmax2<-max(A2)+0.001
Tmaxnum<-Tmax1*1*(N1>=N2)+Tmax2*1*(N2>N1)
Ginisurv1 <- Gcensor2(dataset1Orig[,-3], Tmax=Tmaxnum)
Ginisurv2 <- Gcensor2(dataset2Orig[,-3], Tmax=Tmaxnum)
Ginis1Orig<- Ginisurv1$GTmax
Ginis2Orig<- Ginisurv2$GTmax
teststatGiniOrig<-(Ginis1Orig - Ginis2Orig)^2
if (asymptotic==TRUE){
VarasintGini1 <- VarGinicensor(dataset1Orig[,-3], Tmax1)
VarasintGini2 <- VarGinicensor(dataset2Orig[,-3], Tmax2)
teststatGiniAs<-(Ginis1Orig - Ginis2Orig)/sqrt(VarasintGini1 + VarasintGini2)
pGiniAs<-1-pchisq((teststatGiniAs^2), df=1)
}
teststatGiniPerm<-rep(NA, M)
if (permutation==TRUE){
for(msims in 1:M){
label<-sample(Tx)
dataset1<-matrix(ncol=2, nrow=N1)
dataset2<-matrix(ncol=2, nrow=N2)
dataset1[,1]<-datasetOrig[label==1,1]
dataset1[,2]<-datasetOrig[label==1,2]
dataset2[,1]<-datasetOrig[label==2,1]
dataset2[,2]<-datasetOrig[label==2,2]
if (lastEvent==1) A1Perm<-dataset1[,1]*1*(dataset1[,2]==1) else A1Perm<-dataset1[,1]
if (lastEvent==1) A2Perm<-dataset2[,1]*1*(dataset2[,2]==1) else A2Perm<-dataset2[,1]
Tmax1Perm<-max(A1Perm)+0.001
Tmax2Perm<-max(A2Perm)+0.001
TmaxnumPerm<-Tmax1Perm*1*(N1>=N2)+Tmax2Perm*1*(N2>N1)
Ginisurv1Perm <- Gcensor2(dataset1, Tmax=TmaxnumPerm)
Ginisurv2Perm <- Gcensor2(dataset2, Tmax=TmaxnumPerm)
Ginis1Perm<- Ginisurv1Perm$GTmax
Ginis2Perm<- Ginisurv2Perm$GTmax
teststatGiniPerm[msims]<-(Ginis1Perm-Ginis2Perm)^2
}
pGiniPerm<- sum( 1*(teststatGiniPerm > teststatGiniOrig) )/M
}
if (linearRank==TRUE){
datatemp <- list(X=X,delta=delta, Tx=Tx)
teststatGT <- survdiff(Surv(X,delta)~Tx,datatemp,rho=-1)$chisq
teststatLR <- survdiff(Surv(X,delta)~Tx,datatemp,rho=0)$chisq
teststatW <- survdiff(Surv(X,delta)~Tx,datatemp,rho=1)$chisq
pLR<-1-pchisq(teststatLR, df=1)
pGT<-1-pchisq(teststatGT, df=1)
pW<-1-pchisq(teststatW, df=1)
}
if (permutation==TRUE & asymptotic==TRUE & linearRank==TRUE)
print(c(pGiniAs=pGiniAs, pGiniPerm=pGiniPerm, pGT=pGT,pLR=pLR,pW=pW), digits=5)
else{
if (permutation==FALSE & asymptotic==TRUE & linearRank==TRUE)
print(c(pGiniAs=pGiniAs, pGT=pGT,pLR=pLR,pW=pW), digits=5)
if (permutation==TRUE & asymptotic==FALSE & linearRank==TRUE)
print(c(pGiniPerm=pGiniPerm, pGT=pGT,pLR=pLR,pW=pW), digits=5)
if (permutation==TRUE & asymptotic==TRUE & linearRank==FALSE)
print(c(pGiniAs=pGiniAs, pGiniPerm=pGiniPerm), digits=5)
if (permutation==FALSE & asymptotic==FALSE & linearRank==TRUE)
print(c(pGT=pGT,pLR=pLR,pW=pW), digits=5)
if (permutation==FALSE & asymptotic==TRUE & linearRank==FALSE)
print(c(pGiniAs=pGiniAs), digits=5)
if (permutation==TRUE & asymptotic==FALSE & linearRank==FALSE)
print(c(pGiniPerm=pGiniPerm ), digits=5)
if (permutation==FALSE & asymptotic==FALSE & linearRank==FALSE)
print("No test chosen", digits=5)
}
}
|
Loading required package: survival
pGiniAs pGiniPerm pGT pLR pW
0.125881 0.216000 0.040970 0.065339 0.095491
function (asymptotic = TRUE, permutation = TRUE, M = 500, linearRank = TRUE,
lastEvent = 1, datasetOrig)
{
library(survival)
VarGinicensor <- function(data, Tmax) {
data1 <- data.frame(data)
info <- survfit(Surv(data1[, 1], data1[, 2]) ~ 1, type = "kaplan-meier",
data = data1)
S <- matrix(info$surv)
T <- matrix(info$time)
indices <- sum(1 * (T <= Tmax))
if (indices == 0) {
var <- 0
return(var = var)
}
else {
Vt <- matrix(ncol = 1, nrow = indices)
Vt[1] <- 1 * T[1]
if (indices >= 2) {
for (i in 2:indices) {
Vt[i] <- Vt[i - 1] + ((S[i - 1])^2) * (T[i] -
T[i - 1])
}
}
lastpiecesVt <- ((S[indices])^2) * (Tmax - T[indices])
VtMax <- Vt[indices] + lastpiecesVt
Wt <- matrix(ncol = 1, nrow = indices)
Wt[1] <- 1 * T[1]
if (indices >= 2) {
for (i in 2:indices) {
Wt[i] <- Wt[i - 1] + S[i - 1] * (T[i] - T[i -
1])
}
}
lastpiecesWt <- (S[indices]) * (Tmax - T[indices])
WtMax <- Wt[indices] + lastpiecesWt
mu2 <- matrix(ncol = 1, nrow = indices)
mu2 <- VtMax - Vt
mu <- matrix(ncol = 1, nrow = indices)
mu <- WtMax - Wt
n <- length(data[, 1])
event <- matrix(ncol = 1, nrow = indices)
atrisk <- matrix(ncol = 1, nrow = indices)
for (i in 1:indices) {
event[i] <- info$n.event[i]
atrisk[i] <- info$n.risk[i]
}
dsigma <- matrix(0, ncol = 1, nrow = indices)
for (i in 1:indices) {
if (atrisk[i] > 0)
dsigma[i] <- (n * event[i])/(atrisk[i]^2)
}
varistant <- matrix(ncol = 1, nrow = indices)
for (i in 1:indices) {
varistant[i] <- (4 * exp(2 * log(mu2[i]) - 2 *
log(WtMax)) + exp(2 * log(mu[i]) + 2 * log(VtMax) -
4 * log(WtMax)) - 4 * exp(log(mu[i]) + log(mu2[i]) +
log(VtMax) - 3 * log(WtMax))) * (dsigma[i])
}
var <- matrix(ncol = 1, nrow = indices)
var[1] <- varistant[1]
if (indices >= 2) {
for (i in 2:indices) {
var[i] <- var[i - 1] + varistant[i]
}
}
return(var[indices]/n)
}
}
Gcensor2 <- function(data, Tmax = max(data[, 1])) {
data1 <- data.frame(data)
info <- survfit(Surv(data[, 1], data[, 2]) ~ 1, type = "kaplan-meier",
data = data1)
K <- length(info$time)
S <- matrix(info$surv, ncol = 1, nrow = K)
num <- matrix(ncol = 1, nrow = K)
T <- matrix(info$time, ncol = 1, nrow = K)
num[1] <- 1 * T[1]
if (K >= 2) {
for (i in 2:K) num[i] <- num[i - 1] + ((S[i - 1])^2) *
(T[i] - T[i - 1])
}
den <- matrix(ncol = 1, nrow = K)
den[1] <- 1 * T[1]
if (K >= 2) {
for (i in 2:K) den[i] <- den[i - 1] + S[i - 1] *
(T[i] - T[i - 1])
}
G <- 1 - (num/den)
indices <- sum(1 * (T < Tmax))
lastpiecesnum <- ((S[indices])^2) * (Tmax - T[indices])
lastpiecesden <- (S[indices]) * (Tmax - T[indices])
GTmax <- 1 - (num[indices] + lastpiecesnum)/(den[indices] +
lastpiecesden)
return(list(G = G, info = info, GTmax = GTmax))
}
X <- datasetOrig[, 1]
delta <- datasetOrig[, 2]
Tx <- datasetOrig[, 3]
dataset1Orig <- datasetOrig[Tx == 1, ]
dataset2Orig <- datasetOrig[Tx == 2, ]
N <- length(datasetOrig[, 1])
N1 <- length(dataset1Orig[, 1])
N2 <- length(dataset2Orig[, 1])
if (lastEvent == 1)
A1 <- dataset1Orig[, 1] * 1 * (dataset1Orig[, 2] == 1)
else A1 <- dataset1Orig[, 1]
if (lastEvent == 1)
A2 <- dataset2Orig[, 1] * 1 * (dataset2Orig[, 2] == 1)
else A2 <- dataset2Orig[, 1]
Tmax1 <- max(A1) + 0.001
Tmax2 <- max(A2) + 0.001
Tmaxnum <- Tmax1 * 1 * (N1 >= N2) + Tmax2 * 1 * (N2 > N1)
Ginisurv1 <- Gcensor2(dataset1Orig[, -3], Tmax = Tmaxnum)
Ginisurv2 <- Gcensor2(dataset2Orig[, -3], Tmax = Tmaxnum)
Ginis1Orig <- Ginisurv1$GTmax
Ginis2Orig <- Ginisurv2$GTmax
teststatGiniOrig <- (Ginis1Orig - Ginis2Orig)^2
if (asymptotic == TRUE) {
VarasintGini1 <- VarGinicensor(dataset1Orig[, -3], Tmax1)
VarasintGini2 <- VarGinicensor(dataset2Orig[, -3], Tmax2)
teststatGiniAs <- (Ginis1Orig - Ginis2Orig)/sqrt(VarasintGini1 +
VarasintGini2)
pGiniAs <- 1 - pchisq((teststatGiniAs^2), df = 1)
}
teststatGiniPerm <- rep(NA, M)
if (permutation == TRUE) {
for (msims in 1:M) {
label <- sample(Tx)
dataset1 <- matrix(ncol = 2, nrow = N1)
dataset2 <- matrix(ncol = 2, nrow = N2)
dataset1[, 1] <- datasetOrig[label == 1, 1]
dataset1[, 2] <- datasetOrig[label == 1, 2]
dataset2[, 1] <- datasetOrig[label == 2, 1]
dataset2[, 2] <- datasetOrig[label == 2, 2]
if (lastEvent == 1)
A1Perm <- dataset1[, 1] * 1 * (dataset1[, 2] ==
1)
else A1Perm <- dataset1[, 1]
if (lastEvent == 1)
A2Perm <- dataset2[, 1] * 1 * (dataset2[, 2] ==
1)
else A2Perm <- dataset2[, 1]
Tmax1Perm <- max(A1Perm) + 0.001
Tmax2Perm <- max(A2Perm) + 0.001
TmaxnumPerm <- Tmax1Perm * 1 * (N1 >= N2) + Tmax2Perm *
1 * (N2 > N1)
Ginisurv1Perm <- Gcensor2(dataset1, Tmax = TmaxnumPerm)
Ginisurv2Perm <- Gcensor2(dataset2, Tmax = TmaxnumPerm)
Ginis1Perm <- Ginisurv1Perm$GTmax
Ginis2Perm <- Ginisurv2Perm$GTmax
teststatGiniPerm[msims] <- (Ginis1Perm - Ginis2Perm)^2
}
pGiniPerm <- sum(1 * (teststatGiniPerm > teststatGiniOrig))/M
}
if (linearRank == TRUE) {
datatemp <- list(X = X, delta = delta, Tx = Tx)
teststatGT <- survdiff(Surv(X, delta) ~ Tx, datatemp,
rho = -1)$chisq
teststatLR <- survdiff(Surv(X, delta) ~ Tx, datatemp,
rho = 0)$chisq
teststatW <- survdiff(Surv(X, delta) ~ Tx, datatemp,
rho = 1)$chisq
pLR <- 1 - pchisq(teststatLR, df = 1)
pGT <- 1 - pchisq(teststatGT, df = 1)
pW <- 1 - pchisq(teststatW, df = 1)
}
if (permutation == TRUE & asymptotic == TRUE & linearRank ==
TRUE)
print(c(pGiniAs = pGiniAs, pGiniPerm = pGiniPerm, pGT = pGT,
pLR = pLR, pW = pW), digits = 5)
else {
if (permutation == FALSE & asymptotic == TRUE & linearRank ==
TRUE)
print(c(pGiniAs = pGiniAs, pGT = pGT, pLR = pLR,
pW = pW), digits = 5)
if (permutation == TRUE & asymptotic == FALSE & linearRank ==
TRUE)
print(c(pGiniPerm = pGiniPerm, pGT = pGT, pLR = pLR,
pW = pW), digits = 5)
if (permutation == TRUE & asymptotic == TRUE & linearRank ==
FALSE)
print(c(pGiniAs = pGiniAs, pGiniPerm = pGiniPerm),
digits = 5)
if (permutation == FALSE & asymptotic == FALSE & linearRank ==
TRUE)
print(c(pGT = pGT, pLR = pLR, pW = pW), digits = 5)
if (permutation == FALSE & asymptotic == TRUE & linearRank ==
FALSE)
print(c(pGiniAs = pGiniAs), digits = 5)
if (permutation == TRUE & asymptotic == FALSE & linearRank ==
FALSE)
print(c(pGiniPerm = pGiniPerm), digits = 5)
if (permutation == FALSE & asymptotic == FALSE & linearRank ==
FALSE)
print("No test chosen", digits = 5)
}
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.