TNPMLE: Non-Parametric Maximum-Likelihood Estimation for Cohort...
In NPMLENCC: Non-Parametric Maximum Likelihood Estimate for Cohort Samplings


The function uses a self-consistency iterative algorithm to calculate NPMLEs for cohort samplings with time matching under Cox's regression model. In addition to computing NPMLEs, it can also estimate the asymptotic variance, as described in Wang et al. (2019). Cox's regression model is

λ(t|z)=λ_{0}(t)\exp(z^Tβ).
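
As a minimal sketch of how this model scales the baseline hazard, the R snippet below evaluates λ(t|z) for a single individual. The constant baseline hazard `lambda0`, the coefficient values, and the covariate vector `z` are illustrative assumptions and are not part of the package.

```r
# Illustrative values only (assumptions, not taken from NPMLENCC internals)
beta <- c(1, 0)                                   # regression coefficients
lambda0 <- function(t) rep(0.3, length(t))        # assumed constant baseline hazard
z <- c(0.5, -1)                                   # covariates of one hypothetical individual

# Relative risk exp(z^T beta) multiplies the baseline hazard
relative_risk <- exp(sum(z * beta))
hazard <- function(t) lambda0(t) * relative_risk

hazard(2)
```

Because the second coefficient is 0 in this sketch, only the first covariate changes the hazard.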

TNPMLE(data, iteration1, iteration2, converge)

`data`	The N \times P data matrix, with one row for each of the N individuals. The P columns are, in order: the observed time, which is the time-to-event or censoring time, with no ties among the event times; the status, a binary variable with 1 indicating that the event has occurred and 0 indicating (right) censoring; and the (P-2) covariates, which are observed only for some individuals. Covariates of unobserved individuals must be coded as -9, not as missing values (NA), so observed covariate values must not equal -9.
`iteration1`	The number of iterations for computing NPMLEs.
`iteration2`	The number of iterations for computing the profile likelihoods that are used to estimate the asymptotic variance.
`converge`	The convergence threshold of the algorithm. If the sup-norm of \left(\hat{β}_{(k)}-\hat{β}_{(k-1)}\right) is smaller than this threshold, the estimates are declared to have converged and the computation stops; otherwise, the algorithm runs for `iteration1` iterations.
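
The interplay between `iteration1` and `converge` can be sketched as follows. This is only an illustration of a sup-norm stopping rule: the helper names `sup_norm`, `run_until_converged`, and `toy_update` are hypothetical, and the toy update stands in for one self-consistency step rather than reproducing the package's algorithm. The sketch assumes `iteration1` is at least 1.

```r
# Sup-norm (largest absolute component) of a vector
sup_norm <- function(x) max(abs(x))

# Hypothetical driver: iterate `update` until the sup-norm change
# falls below `converge`, or until `iteration1` iterations have run
run_until_converged <- function(update, beta0, iteration1, converge) {
  beta_old <- beta0
  for (k in seq_len(iteration1)) {
    beta_new <- update(beta_old)
    diff <- sup_norm(beta_new - beta_old)
    beta_old <- beta_new
    if (diff < converge) break
  }
  list(beta = beta_old, iloop = k, diff = diff)
}

# Toy contraction with fixed point (1, 0); NOT the NPMLE update
toy_update <- function(b) 0.5 * b + c(0.5, 0)

res <- run_until_converged(toy_update, c(0, 0), iteration1 = 100, converge = 1e-6)
res0 <- run_until_converged(toy_update, c(0, 0), iteration1 = 100, converge = 0)
```

In this sketch, setting `converge = 0` means the strict inequality is never satisfied, so all `iteration1` iterations are performed.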

Returns a list with components

`num`	The numbers of cases and observed subjects.
`iloop`	The final number of iterations used for computing NPMLEs.
`diff`	The sup-norm distance between the estimates of the relative risk coefficients from the last two iterations.
`likelihood`	The log-likelihood value at the NPMLEs.
`npmle`	The estimated regression coefficients with their corresponding estimated standard errors and p-values.
`Lnpmle`	The estimated cumulative baseline hazard function.
`Pnpmle`	The empirical distribution of the covariates that are missing for unobserved subjects.
`elements`	A list used to plot the cumulative baseline hazard function and the baseline survival function. It is an n \times 3 matrix, where n is the total number of cases; the 3 columns are, in order, the ordered observed times of the cases, the estimated cumulative baseline hazard function, and the estimated baseline survival function.
`Adata`	The original data, rearranged for convenient analysis in three steps: first, the data are divided into observed and unobserved groups, placed at the top and bottom, respectively; second, the observed data from the first step are divided into case and control groups; finally, the case data from the second step are ordered by observed time from low to high.
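
As a rough sketch of how a three-column matrix like the one described for `elements` could be turned into a step function, the snippet below uses a small made-up matrix `elem`. The numbers, the column names, and the relation used to fill the survival column (survival equal to exp of minus the cumulative hazard) are assumptions for illustration; the actual object returned by TNPMLE may be structured differently.

```r
# Made-up stand-in for an n x 3 matrix: ordered case times,
# cumulative baseline hazard, and baseline survival
elem <- cbind(time = c(0.4, 1.1, 2.3),
              cumhaz = c(0.10, 0.25, 0.45))
elem <- cbind(elem, surv = exp(-elem[, "cumhaz"]))  # assumed relation

# Right-continuous step function that is 0 before the first case time
Lambda0 <- stepfun(elem[, "time"], c(0, elem[, "cumhaz"]))

Lambda0(1.5)  # cumulative hazard carried forward from the case time 1.1
```

Calling `plot(Lambda0)` on such an object draws the step curve; the same construction with the third column and a starting value of 1 would give a survival curve.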

Missing values (NA) in `data` are not allowed in this version.
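
The input requirements stated for `data` can be checked before fitting with a small helper such as the one sketched below. The function `check_tnpmle_data` is hypothetical and not part of NPMLENCC. It additionally assumes that a subject's covariates are either all observed or all coded as -9, which matches the example on this page but is not stated as a general rule.

```r
# Hypothetical pre-flight check for the documented layout:
# column 1 = time, column 2 = status, remaining columns = covariates
check_tnpmle_data <- function(data) {
  data <- as.matrix(data)
  stopifnot(
    ncol(data) >= 3,                 # time, status, at least one covariate
    !anyNA(data),                    # NA is not allowed
    all(data[, 2] %in% c(0, 1))      # status must be binary
  )
  cases <- data[, 2] == 1
  stopifnot(!anyDuplicated(data[cases, 1]))   # no ties among event times
  covs <- data[, -(1:2), drop = FALSE]
  n_coded <- rowSums(covs == -9)
  unobserved <- n_coded == ncol(covs)
  # Assumption: no row mixes observed covariates with -9 codes
  stopifnot(!any(n_coded > 0 & !unobserved))
  c(cases = sum(cases), observed = sum(!unobserved))
}

toy <- rbind(c(0.5, 1,  0.2,  1.3),
             c(1.2, 1, -0.4,  0.1),
             c(2.0, 0,  0.7, -1.1),
             c(3.1, 0, -9,   -9))
counts <- check_tnpmle_data(toy)
```

The helper stops with an error on the first violated requirement and otherwise returns the number of cases and the number of subjects with observed covariates.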

Wang JH, Pan CH, Chang IS, and Hsiung CA (2019). Penalized full likelihood approach to variable selection for Cox's regression model under nested case-control sampling. Lifetime Data Analysis. <doi:10.1007/s10985-019-09475-z>.

See TPNPMLE.

set.seed(100)
library(splines)
library(survival)
library(MASS)
library(NPMLENCC)  # package that provides TNPMLE
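# Simulation settings: true coefficients, baseline hazard, and cohort size
beta=c(1,0)
lambda=0.3
cohort=100
covariate=2+length(beta)  # columns: time, status, and the covariates
# Simulate covariates, censoring times (c), and event times (u)
z=matrix(rnorm(cohort*length(beta)),nrow=cohort)
rate=1/(runif(cohort,1,3)*exp(z%*%beta))
c=rexp(cohort,rate)
u=-log(runif(cohort,0,1))/(lambda*exp(z%*%beta))
# Observed time is the minimum of the event and censoring times
time=apply(cbind(u,c),1,min)
status=(u<=c)+0
casenum=sum(status)
odata=cbind(time,status,z)
odata=data.frame(odata)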
# Put the cases (status 1) in the top rows
a=order(status)
data=matrix(0,cohort,covariate)
data=data.frame(data)
for (i in 1:cohort){
  data[i,]=odata[a[cohort-i+1],]
}
# Order the cases by observed time, from low to high
ncc=matrix(0,cohort,covariate)
ncc=data.frame(data)
aa=order(data[1:casenum,1])
for (i in 1:casenum){
  ncc[i,]=data[aa[i],]
}
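# For each case, sample `control` subjects from the later rows
# whose observed time exceeds the case's time
control=1
q=matrix(0,casenum,control)
for (i in 1:casenum){
  k=c(1:cohort)
  k=k[-(1:i)]
  sumsc=sum(ncc[i,1]<ncc[,1][(i+1):cohort])
  if (sumsc==0) {
    q[i,]=c(1)  # no eligible subject: fall back to row 1
  } else {
    q[i,]=sample(k[ncc[i,1]<ncc[,1][(i+1):cohort]],control)
  }
}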
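# Rows with observed covariates (wt): all cases plus the sampled controls;
# the remaining rows (owt) are treated as unobserved
cacon=c(q,1:casenum)
k=c(1:cohort)
owf=k[-cacon]
wt=k[-owf]
owt=k[-wt]
# Stack the observed rows on top and the unobserved rows at the bottom
ncct=matrix(0,cohort,covariate)
ncct=data.frame(ncct)
for (i in 1:length(wt)){
  ncct[i,]=ncc[wt[i],]
}
for (i in 1:length(owt)){
  ncct[length(wt)+i,]=ncc[owt[i],]
}
# Code the covariates of unobserved subjects as -9
d=length(wt)+1
ncct[d:cohort,3:covariate]=-9
# Compute the NPMLEs and the asymptotic variance estimates
TNPMLEtest=TNPMLE(data=ncct,iteration1=100,iteration2=30,converge=0)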