pseudoIC: Fast pseudo values for the survival function or the RMST for...

Description Usage Arguments Details Value See Also Examples

View source: R/pseudoIC.R

Description

Compute pseudo values for the survival function or the Restricted Mean Survival Time (RMST) based on the piecewise-constant hazard (PCH) model. A fast approximation is implemented that provides asymptotic pseudo-values based on the score information and Hessian matrix instead of computing the jackknife method.

Usage

1
pseudoIC(fit, Left, Right, tseq = NULL, tau = NULL)

Arguments

fit

an object from the class survIC. This object is obtained by implementing the maximum likelihood estimator in the PCH model for interval-censored data through the function mleIC.

Left

the time sequence for the left-time endpoint. Must be non-negative.

Right

the time sequence for the right-time endpoint. Each component must be greater than the corresponding component of Left. The value Inf for right-censored data is allowed.

tseq

is used only for the implementation of the pseudo-values of the survival function (in which case tau=NULL). The survival function is evaluated at the times given in tseq and its pseudo-values are computed.

tau

is equals NULL then the program implements the pseudo-values for the survival function. Otherwise, tau must be a positive time point in which case the RMST with right endpoint tau is implemented.

Details

A fast approximation formula is used to compute pseudo-values for interval-censored data based on the pch model. For more information on the pch model, see the help for the function mleIC. The pseudo-values are computed for the survival function if tau is equal to NULL. In that case, the function returns an approximation of

n\hat S(t)-(n-1)\hat S^{(-i)}(t),

where \hat S is the survival estimator from the pch model and \hat S^{(-i)} is the same estimator computed when the ith observation has been removed. Those pseudo-values are computed for all i=1,...,n and for all time points t in tseq. They are computed for the RMST if tau is specified. In that case, the function returns an approximation of

n\int_0^{τ}\hat S(t)dt-(n-1)\int_0^{τ}\hat S^{(-i)}(t)dt.

A model based on Generalised Estimating Equation can be further specified through the geese function in the geepack package (see Examples below).

Regularity conditions are imposed from maximum likelihood theory. Those conditions may not be verified if the cuts are not adequately chosen in the fit object and this will be problematic for the computation of the pseudo-values. In particular, the average of the pseudo-values might not return the value of the initial estimator, indicating that the pseudo-values are incorrect. We therefore recommend to always carefully check if the model in fit implemented from the mleIC function has properly converged. This can easily be verified by using the option verbose=TRUE in the mleIC function. If the mleIC function returned an error or if the estimate obtained from the mleIC function contains some α_k's with the value 0, then the pseudoIC function will return an error. We then recommend to simply change the values of the cuts. See mleIC for more details about those regularity conditions.

Value

if tau=NULL, returns the pseudo-values of the survival function evaluated at tseq. The output is a matrix with number of rows equal to the sample size and number of columns equal to the length of tseq.

if tau contains a single positive value then the function returns the RMST with right endpoint equal to tau. In case tseq and tau are both provided, the RMST is implemented and the value of tseq is ignored.

See Also

mleIC, RmstIC, rsurv, pseudoKM.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
#Simulations without covariates
n=4000
cuts=c(20,40,50,70)
alpha=c(0.05,0.05,0.1,0.2,0.4)/10
TrueTime=rsurv(n,cuts,alpha) #generate true data from the pch model
##Simulation of interval-censored data
Right<-rep(Inf,n)
nb.visit=20
visTime=0;visit=matrix(0,n,nb.visit+1)
visit=cbind(visit,rep(Inf,n))
visit[,2]=visit[,1]+stats::runif(n,0,10)#runif(n,0,5)
schedule=4
for (i in 3:(nb.visit+1))
{
  visit[,i]=visit[,i-1]+stats::runif(n,0,schedule*2)
}
Left<-visit[,(nb.visit+1)]
J=sapply(1:(n),function(i)cut(TrueTime[i],breaks=c(visit[1:(n),][i,]),
labels=1:(nb.visit+1),right=FALSE)) #sum(is.na(J)) check!
Left[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,J[i]])
Right[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,as.numeric(J[i])+1])
#View(data.frame(Left,Right,TrueTime)) #To see the generated data
sum(Right[1:(n)]==Inf)/n #percentage of right-censored data

result=mleIC(Left,Right,cuts=cuts,a=rep(log(0.5),length(cuts)+1),
maxiter=1000,tol=1e-12,verbose=TRUE)
result$lambda #the estimation
alpha #true value of hazard
#Plot the true survival function and compare with pseudo-values
plot(result,surv=TRUE)
tseq=seq(30,90,length.out=10)
pseudo_val=pseudoIC(result,Left,Right,tseq=tseq,tau=NULL)
lines(tseq,apply(pseudo_val,2,mean),type="p",col="red")

#Estimated value of RMST for tau=30
RmstIC(cuts,result$lambda,tau=30)
pseudoval=pseudoIC(result,Left,Right,tau=30) #pseudo-values
mean(pseudoval)
#Estimated value of RMST for tau=45
RmstIC(cuts,result$lambda,tau=45)
pseudoval=pseudoIC(result,Left,Right,tau=45) #pseudo-values
mean(pseudoval) #is close to the estimated RMST
#Estimated value of RMST for tau=50
RmstIC(cuts,result$lambda,tau=50)
pseudoval=pseudoIC(result,Left,Right,tau=50) #pseudo-values
mean(pseudoval) #is close to the estimated RMST

#Simulations with covariates
n=1000
tau=3000
X=runif(n,0,2)
epsi=rnorm(n,0,1)
theta0=6;theta1=4
TrueTime=theta0+theta1*X+epsi #min(TrueTime) must be positive!
##Simulation of interval-censored data
Right<-rep(Inf,n)
nb.visit=5#nb.visit=10
visTime=0;visit=matrix(0,n,nb.visit+1)
visit=cbind(visit,rep(Inf,n))
visit[,2]=visit[,1]+runif(n,0,10)#runif(n,0,5)
schedule=2
for (i in 3:(nb.visit+1))
{
  visit[,i]=visit[,i-1]+runif(n,0,schedule*2)
}
Left<-visit[,(nb.visit+1)]
J=sapply(1:(n),function(i)cut(TrueTime[i],breaks=c(visit[1:(n),][i,]),
labels=1:(nb.visit+1),right=FALSE)) #sum(is.na(J)) check!
Left[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,J[i]])
Right[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,as.numeric(J[i])+1])

cuts=c(6,8,10,12,14)
result=mleIC(Left,Right,cuts=cuts,a=rep(log(0.5),length(cuts)+1),
maxiter=1000,tol=1e-12,verbose=TRUE)
pseudoval=pseudoIC(result,Left,Right,tau=tau)
require(geepack)
data_pseudo<-data.frame(Y=c(pseudoval),X=X,id=rep(1:n))
resultEst=geese(Y~X,id=id,jack=TRUE,family="gaussian", mean.link="identity",
corstr="independence",scale.fix=TRUE,data=data_pseudo)
summary(resultEst)

obouaziz/FastPseudo documentation built on Dec. 22, 2021, 4:12 a.m.