# TPNPMLE: Penalized Non-Parametric Maximum-Likelihood Estimation...

In NPMLENCC: Non-Parametric Maximum Likelihood Estimate for Cohort Samplings

## Description

The function uses a self-consistency iterative algorithm to compute penalized non-parametric maximum-likelihood estimates (PNPMLEs), obtained by adding a penalty function, for cohort samplings with time matching under Cox's regression model. In addition to computing PNPMLEs, it can also estimate their asymptotic variance, as described in Wang et al. (2019). Cox's regression model is

λ(t|z)=λ_{0}(t)\exp(z^Tβ).
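As a small illustration of this model (not part of the package), the sketch below evaluates the hazard under an assumed constant baseline hazard and draws event times by inverse-transform sampling. The helper `cox_hazard` and the values of `beta` and `lambda0` are hypothetical and chosen only for demonstration.

```r
# Illustrative sketch only; cox_hazard is a hypothetical helper, not package API.
# Cox model with a constant baseline hazard: lambda(t | z) = lambda0 * exp(z' beta).
cox_hazard <- function(z, beta, lambda0) {
  as.numeric(lambda0 * exp(z %*% beta))
}

set.seed(1)
beta <- c(1, 0)                    # assumed regression coefficients
lambda0 <- 0.3                     # assumed constant baseline hazard
z <- matrix(rnorm(10), nrow = 5)   # 5 subjects, 2 covariates

# Subject-specific hazards.
h <- cox_hazard(z, beta, lambda0)

# With a constant hazard h the event time is exponential with rate h,
# so inverse-transform sampling gives T = -log(U) / h.
event_time <- -log(runif(5)) / h

# Hazard ratio for a one-unit increase in the first covariate: exp(beta[1]).
hr <- cox_hazard(matrix(c(1, 0), nrow = 1), beta, lambda0) /
  cox_hazard(matrix(c(0, 0), nrow = 1), beta, lambda0)
```

Because the baseline hazard cancels in the ratio, `hr` depends only on the coefficient, which is what makes the model a proportional-hazards model.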

## Usage

```r
TPNPMLE(data, iteration1, iteration2, converge, penalty, penaltytuning,
        fold, cut, seed)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | Same as the corresponding argument of the TNPMLE function. |
| `iteration1` | The number of iterations for computing (P)NPMLEs. |
| `iteration2` | The number of iterations for computing the profile likelihoods that are used to estimate the asymptotic variance. |
| `converge` | Same as the corresponding argument of the TNPMLE function. |
| `penalty` | The choice of penalty: SCAD, HARD or LASSO. |
| `penaltytuning` | The tuning parameter for the penalty function, given as a numeric vector (a sequence of values). |
| `fold` | The fold value for cross-validation. It must be greater than one, and the cohort size should be divisible by it. If the cohort size is not divisible by the fold value, the cohort is automatically partitioned into suitable parts according to the fold value for cross-validation. |
| `cut` | The cut point. When \hat{β}_j is smaller than the cut point, \hat{β}_j is set to zero, i.e. the corresponding covariate is removed from the model to perform variable selection. |
| `seed` | The seed of the random number generator, used to obtain reproducible results. |
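To make the `penalty` and `cut` arguments more concrete, here is a minimal sketch of commonly used forms of the LASSO, hard-thresholding and SCAD penalties, together with a thresholding step. These helpers are hypothetical and are not taken from the package source. The SCAD constant `a = 3.7` is the conventional default, and reading `cut` as a threshold on the absolute value of each coefficient is an assumption.

```r
# Sketch of standard penalty forms; hypothetical helpers, not the package's internals.

# LASSO: lambda * |beta|
lasso_penalty <- function(beta, lambda) {
  lambda * abs(beta)
}

# Hard thresholding: lambda^2 - (|beta| - lambda)^2 * I(|beta| < lambda)
hard_penalty <- function(beta, lambda) {
  b <- abs(beta)
  lambda^2 - (b - lambda)^2 * (b < lambda)
}

# SCAD: linear near zero, quadratic in between, constant beyond a * lambda.
scad_penalty <- function(beta, lambda, a = 3.7) {
  b <- abs(beta)
  ifelse(b <= lambda,
         lambda * b,
         ifelse(b <= a * lambda,
                (2 * a * lambda * b - b^2 - lambda^2) / (2 * (a - 1)),
                lambda^2 * (a + 1) / 2))
}

# Assumed reading of `cut`: coefficients whose absolute value falls below
# the cut point are set to zero (the covariate is dropped).
apply_cut <- function(beta_hat, cut) {
  ifelse(abs(beta_hat) < cut, 0, beta_hat)
}

beta_hat <- c(0.95, 0.0004, -0.3)   # made-up estimates for illustration
apply_cut(beta_hat, cut = 0.001)    # the second coefficient is set to zero
```

Unlike LASSO, the hard-thresholding and SCAD penalties level off for large coefficients, so strong effects are not shrunk further once they exceed the threshold region.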

## Value

Returns a list with components

| Component | Description |
|---|---|
| `num` | The numbers of cases and observed subjects. |
| `iloop` | The final number of iterations for computing PNPMLEs. |
| `diff` | The sup-norm distance between the last two iterations of the estimates of the relative risk coefficients. |
| `cvl` | The cross-validated profile log-likelihood. |
| `tuning` | The selected tuning parameter, i.e. the one at which the cross-validated profile log-likelihood attains its maximum. |
| `likelihood` | The log-likelihood value of the PNPMLEs. |
| `pnpmle` | The estimated regression coefficients with their corresponding estimated standard errors and p-values. |
| `Lpnpmle` | The estimated cumulative baseline hazard function. |
| `Ppnpmle` | The empirical distribution of the covariates that are missing for unobserved subjects. |
| `elements` | Same as the corresponding component returned by the TNPMLE function. |
| `Adata` | Same as the corresponding component returned by the TNPMLE function. |
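The relationship between `fold`, `cvl` and `tuning` can be sketched as follows. This is only an illustration under stated assumptions: `make_folds` and `select_tuning` are hypothetical helpers, the way the package actually partitions the cohort is not documented here, and the `cvl` values are made up.

```r
# Sketch only: hypothetical helpers illustrating fold assignment and the
# selection of a tuning parameter by maximizing a cross-validated criterion.

make_folds <- function(n, fold, seed) {
  stopifnot(fold > 1, fold <= n)
  set.seed(seed)
  # rep_len() yields fold sizes that differ by at most one when n is not
  # divisible by fold; sample() then shuffles the fold labels.
  sample(rep_len(seq_len(fold), n))
}

select_tuning <- function(penaltytuning, cvl) {
  # Pick the candidate with the largest cross-validated log-likelihood.
  penaltytuning[which.max(cvl)]
}

# A cohort of 100 split into 3 folds (100 is not divisible by 3).
folds <- make_folds(n = 100, fold = 3, seed = 100)
table(folds)

# Made-up cross-validated profile log-likelihoods, one per candidate value.
penaltytuning <- c(0.05, 0.10, 0.15, 0.20)
cvl <- c(-120.4, -118.9, -119.6, -121.2)
select_tuning(penaltytuning, cvl)
```

Fixing the seed before shuffling is what makes the fold assignment, and therefore the selected tuning parameter, reproducible across runs.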

## Note

Missing values (NA) in the data are not allowed in this version.

## References

Wang JH, Pan CH, Chang IS*, and Hsiung CA (2019). Penalized full likelihood approach to variable selection for Cox's regression model under nested case-control sampling. Lifetime Data Analysis. <doi:10.1007/s10985-019-09475-z>.

## See Also

See TNPMLE.

## Examples

```r
set.seed(100)
library(splines)
library(survival)
library(MASS)

# Simulation settings
beta=c(1,0)
lambda=0.3
cohort=100
covariate=2+length(beta)

# Covariates, censoring times (c) and event times (u)
z=matrix(rnorm(cohort*length(beta)),nrow=cohort)
rate=1/(runif(cohort,1,3)*exp(z%*%beta))
c=rexp(cohort,rate)
u=-log(runif(cohort,0,1))/(lambda*exp(z%*%beta))
time=apply(cbind(u,c),1,min)
status=(u<=c)+0
casenum=sum(status)
odata=cbind(time,status,z)
odata=data.frame(odata)

# Reorder the cohort so that cases come first
a=order(status)
data=matrix(0,cohort,covariate)
data=data.frame(data)
for (i in 1:cohort){
  data[i,]=odata[a[cohort-i+1],]
}

# Sort the cases by observed time
ncc=matrix(0,cohort,covariate)
ncc=data.frame(data)
aa=order(data[1:casenum,1])
for (i in 1:casenum){
  ncc[i,]=data[aa[i],]
}

# Number of controls per case
control=1
q=matrix(0,casenum,control)
for (i in 1:casenum){
  k=c(1:cohort)
  k=k[-(1:i)]
  sumsc=sum(ncc[i,1]
```