mleIC: Computation of the maximum likelihood estimator in the PCH...

Description Usage Arguments Details Value See Also Examples

View source: R/mleIC.R

Description

Given a sequence of time, a set of cuts and initial log-hazard values, compute the maximum likelihood estimator (MLE) of the hazard function from a piecewise-constant hazard model for interval-censored data. The estimator is based on the EM algorithm.

Usage

1
2
3
4
5
6
7
8
9
mleIC(
  Left,
  Right,
  cuts = NULL,
  a = rep(log(0.5), length(cuts) + 1),
  maxiter = 1000,
  tol = 1e-12,
  verbose = FALSE
)

Arguments

Left

the time sequence for the left-time endpoint. Must be non-negative.

Right

the time sequence for the right-time endpoint. Each component must be greater than the corresponding component of Left. The value Inf for right-censored data is allowed.

cuts

the sequence of cuts in the pch model. Default value equals NULL which corresponds to the exponential model.

a

the initial value of the log-hazard function between each cut. Must be of length equal to length(cuts)+1. The default value is log(0.5) for all segments.

maxiter

the total maximum number of iterations in the EM algorithm. Default is 1000.

tol

the tolerance value in the EM algorithm. The algorithm stops when the relative error between new and previous log-hazard update value is less than tol. Default is 1e-12.

verbose

if TRUE, at each iteration step the current value of the estimator is displayed along with the relative error between new and previous log-hazard update value in the EM algorithm.

Details

Returns the MLE for the piecewise constant hazard model defined in the following way:

λ(t)=∑ 1(c_{k-1}< t≤ c_k) exp(a_k),

where the sum is over k ranging from 1 to K, with K equals to the length of cuts plus one, c_0=0, c_K equals infinity and the c_k's are provided by cuts. The algorithm estimates the a_k's using the EM algorithm. The a_k's are initialized by the value given in a. The EM algorithm is not sensitive to the initialization and we recommend to simply use the default value which is equal to log(0.5) for each a_k.

The algorithm deals with interval-censored data supplied by the Left and Right vectors. It is possible to include right-censored data by specifying Right=Inf and to have exact observations by specifying the same value to Left and Right.

Two regularity conditions arise from maximum likelihood theory. The first one states that the probability that the observed intervals (for which the right endpoint is not infinite) intersect a cut should be positive. If that is not the case, the algorithm will still be stable, returning the hazard value 0 for the corresponding cut. The second condition states that the probability that a left endpoint is greater than the left value of a cut should be positive. If there are no observations verifying this condition then the algorithm will diverge and the hazard value for this cut will increase at each iteration of the EM algorithm. This will make the algorithm return an error. More generally, one should carefully check the convergence of the algorithm by setting verbose to TRUE.

The function mle.ic returns an object from the survIC class. For such class, there are a plot and lines options. When applying plot (or lines) to a survIC class object, one can choose the option surv=FALSE (the default) or surv=TRUE. In the former case, the hazard function will be plotted and in the latter case the survival function will be displayed. One can set xlim to specific values in the plot arguments (see the Examples section for more information).

Value

a the estimated value of the log-hazard (the a_k's)
lambda the estimated value of the hazard (the exp(a_k)'s)
cuts the cuts used in the PCH model as initially supplied in the mleIC function
itermax the number of iterations used in the EM algorithm to reach convergence
test the relative error that has been reached at convergence of the algorithm

See Also

pchcumhaz, rsurv, pseudoIC, RmstIC.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
n=4000
cuts=c(20,40,50,70)
alpha=c(0,0.05,0.1,0.2,0.4)/10
TrueTime=rsurv(n,cuts,alpha) #generate true data from the pch model
##Simulation of interval-censored data
Right<-rep(Inf,n)
nb.visit=20
visTime=0;visit=matrix(0,n,nb.visit+1)
visit=cbind(visit,rep(Inf,n))
visit[,2]=visit[,1]+stats::runif(n,0,10)#runif(n,0,5)
schedule=4
for (i in 3:(nb.visit+1))
{
  visit[,i]=visit[,i-1]+stats::runif(n,0,schedule*2)
}
Left<-visit[,(nb.visit+1)]
J=sapply(1:(n),function(i)cut(TrueTime[i],breaks=c(visit[1:(n),][i,]),
labels=1:(nb.visit+1),right=FALSE)) #sum(is.na(J)) check!
Left[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,J[i]])
Right[1:(n)]=sapply(1:(n),function(i)visit[1:(n),][i,as.numeric(J[i])+1])
#View(data.frame(Left,Right,TrueTime)) #To see the generated data
sum(Right[1:(n)]==Inf)/n #percentage of right-censored data
result=mleIC(Left,Right,cuts=cuts,a=rep(log(0.5),length(cuts)+1),
maxiter=1000,tol=1e-12,verbose=TRUE)
result$lambda #the estimation
alpha #true value

plot(result)
lines(c(0,cuts,100),c(alpha,alpha[5]),type="s",col="red") #true hazard
seqtime=seq(0,200,length.out=1000)
plot(result,xlim=c(0,200),surv=TRUE)
lines(seqtime,exp(-pchcumhaz(seqtime,cuts,alpha)),type="l",col="red") #true survival function
resultbis=mleIC(Left,Right,cuts=NULL,a=rep(log(0.5),1),
maxiter=1000,tol=1e-12,verbose=TRUE)
lines(resultbis,xlim=c(0,200),col="blue",surv=TRUE)

##Example with exact values
Right[1:100]<-TrueTime[1:100]
Left[1:100]<-TrueTime[1:100]
result=mleIC(Left,Right,cuts=cuts,a=rep(log(0.5),length(cuts)+1),
maxiter=1000,tol=1e-12,verbose=TRUE)
alpha #true value

obouaziz/FastPseudo documentation built on Dec. 22, 2021, 4:12 a.m.