penPHcure: Variable selection in PH cure model with time-varying...


View source: R/penPHcure.R

Description

This function fits a PH cure model with time-varying covariates, computes confidence intervals for the estimated regression coefficients, or performs variable selection through a LASSO/SCAD-penalized model.

Usage

penPHcure(
  formula,
  cureform,
  data,
  X = NULL,
  maxIterNR = 500,
  maxIterEM = 500,
  tol = 1e-06,
  standardize = TRUE,
  ties = c("efron", "breslow"),
  SV = NULL,
  which.X = c("last", "mean"),
  inference = FALSE,
  nboot = 100,
  constraint = TRUE,
  pen.type = c("none", "SCAD", "LASSO"),
  pen.weights = NULL,
  pen.tuneGrid = NULL,
  epsilon = 1e-08,
  pen.thres.zero = 1e-06,
  print.details = TRUE,
  warnings = FALSE
)

Arguments

formula

a formula object, with the response on the left of a ~ operator and the variables to be included in the latency (survival) component on the right. The response must be a survival object returned by the Surv(time,time2,status) function.

cureform

a one-sided formula object of the form ~ x1 + x2 + ... with the covariates to be included in the incidence (cure) component.

data

a data.frame (in a counting process format) in which to interpret the variables named in the formula and cureform arguments.
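As a rough illustration of the counting process format, the sketch below builds a small data.frame by hand in which each row covers one interval (tstart, tstop] for one individual. The column names tstart, tstop and status match those used in the Examples section; the id column, the toy values and the sanity checks are assumptions made for this example and are not requirements stated by the package.

```r
# Hypothetical toy data in counting process format:
# each row is one interval (tstart, tstop] for one individual.
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2),               # assumed individual identifier
  tstart = c(0, 1, 2, 0, 1),
  tstop  = c(1, 2, 3, 1, 2),
  status = c(0, 0, 1, 0, 0),               # 1 = event at tstop, 0 = otherwise
  z.1    = c(0.5, 0.7, 0.9, -0.2, -0.1)    # a time-varying covariate
)

# Sanity checks one would typically expect such data to pass.
intervals_ok  <- all(toy$tstart < toy$tstop)
events_per_id <- tapply(toy$status, toy$id, sum)  # events observed per individual

print(intervals_ok)
print(events_per_id)
```

Here individual 1 experiences the event at the end of its third interval, while individual 2 is censored after two intervals.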

X

a matrix of time-invariant covariates to be included in the incidence (cure) component. If the user provides such a matrix, the arguments cureform and which.X are ignored. By default, X = NULL.

maxIterNR

a positive integer: the maximum number of iterations to attempt for convergence of the Newton-Raphson (NR) algorithm, which is used for the Cox and logistic regression models. By default, maxIterNR = 500.

maxIterEM

a positive integer: the maximum number of iterations to attempt for convergence of the Expectation-Maximization (EM) algorithm. By default maxIterEM = 500.

tol

a positive numeric value used to determine convergence of the NR and EM algorithms. By default, tol = 1e-6.

standardize

a logical value. If TRUE, the values of the covariates are standardized (centered and scaled), such that their mean and variance will be equal to 0 and 1, respectively. By default, standardize = TRUE.
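To make the effect of standardization concrete, the following sketch centers and scales a small covariate matrix with base R's scale(). The matrix X_toy is made up for illustration, and scale() divides by the sample standard deviation (denominator n - 1); whether penPHcure uses the same denominator internally is an assumption not confirmed by this page.

```r
# Hypothetical covariate matrix with two columns on very different scales.
X_toy <- cbind(x.1 = c(1, 2, 3, 4),
               x.2 = c(10, 20, 30, 60))

# Center each column to mean 0 and scale it to (sample) variance 1.
X_std <- scale(X_toy)

col_means <- colMeans(X_std)
col_vars  <- apply(X_std, 2, var)

print(round(col_means, 10))
print(col_vars)
```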

ties

a character string used to specify the method for handling ties: either "efron" or "breslow". By default, ties = "efron".

SV

a list with elements b and beta, numeric vectors of starting values for the regression coefficients in the incidence (cure) and latency (survival) component, respectively. By default SV = NULL.

which.X

a character string specifying how the covariates included in the incidence (cure) component are transformed from time-varying to time-invariant. There are two options: take the last observation ("last") or the mean over the full history of the covariates ("mean"). By default, which.X = "last".
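The two options can be illustrated on a toy covariate history. This is only a sketch: the data frame hist_toy and its id column are hypothetical, and the mean is taken as an unweighted average over rows, which may differ from how the package computes it (for example, if intervals have unequal lengths).

```r
# Hypothetical covariate history: individual 1 has three intervals, individual 2 has two.
hist_toy <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  tstop = c(1, 2, 3, 1, 2),
  x.1   = c(2, 4, 9, 1, 3)
)

# Make sure rows are in chronological order within each individual.
hist_toy <- hist_toy[order(hist_toy$id, hist_toy$tstop), ]

# "last": value recorded in the final interval of each individual.
x_last <- tapply(hist_toy$x.1, hist_toy$id, function(v) v[length(v)])

# "mean": unweighted average over all recorded intervals (an assumption).
x_mean <- tapply(hist_toy$x.1, hist_toy$id, mean)

print(x_last)
print(x_mean)
```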

inference

a logical value. If TRUE and pen.type == "none", confidence intervals for the regression coefficient estimates are computed using the basic/percentile bootstrap method. By default inference = FALSE.

nboot

a positive integer: the number of bootstrap resamples for the construction of the confidence intervals (used only when inference = TRUE). By default, nboot = 100.

constraint

a logical value. If TRUE, the model makes use of the zero-tail constraint, classifying the individuals with censoring times greater than the largest event time as non-susceptible. For more details, see Sy and Taylor (2000). By default, constraint = TRUE.
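The rule described above can be sketched in a few lines of base R. The vectors time_toy and status_toy are hypothetical per-individual follow-up times and event indicators; this only illustrates the classification rule and is not the package's internal code.

```r
# Hypothetical final follow-up times and event indicators (1 = event, 0 = censored).
time_toy   <- c(2, 5, 7, 9, 12)
status_toy <- c(1, 0, 1, 0, 0)

# Largest observed event time.
last_event <- max(time_toy[status_toy == 1])

# Zero-tail constraint: censored individuals followed beyond the last event
# are treated as non-susceptible (cured).
non_susceptible <- status_toy == 0 & time_toy > last_event

print(last_event)
print(non_susceptible)
```

In this toy sample the individual censored at time 5 remains unclassified, because an event is still observed later, at time 7.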

pen.type

a character string used to specify the type of penalty used for variable selection: either "none", "SCAD" or "LASSO". By default, pen.type = "none", in which case a standard model is fitted without performing variable selection.

pen.weights

a list with elements named CURE and SURV, positive numeric vectors of penalty weights for the covariates in the incidence (cure) and latency (survival) component, respectively. By default, all weights are set equal to 1, except for the intercept in the incidence (cure) component (always equal to 0).

pen.tuneGrid

a list with elements named CURE and SURV, each a named list of tuning parameter vectors. If pen.type == "SCAD", each list should contain two numeric vectors of possible tuning parameters: lambda and a. If pen.type == "LASSO", each list should contain only one vector: lambda. By default, lambda = exp(seq(-7,0,length.out = 10)) and a = 3.7.
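The sketch below writes out the default grid explicitly and counts the candidate models it implies. The object names (lambda_default, grid_scad, grid_lasso, combos) are made up for this example, and the count assumes that every value in the CURE list is crossed with every value in the SURV list.

```r
# Default tuning values as stated above.
lambda_default <- exp(seq(-7, 0, length.out = 10))
a_default      <- 3.7

# Grid layout for the SCAD penalty: lambda and a for each component.
grid_scad <- list(CURE = list(lambda = lambda_default, a = a_default),
                  SURV = list(lambda = lambda_default, a = a_default))

# Grid layout for the LASSO penalty: lambda only.
grid_lasso <- list(CURE = list(lambda = lambda_default),
                   SURV = list(lambda = lambda_default))

# Candidate combinations, assuming a full cross of all vectors.
combos <- expand.grid(CURE.lambda = grid_scad$CURE$lambda,
                      CURE.a      = grid_scad$CURE$a,
                      SURV.lambda = grid_scad$SURV$lambda,
                      SURV.a      = grid_scad$SURV$a)

print(range(lambda_default))
print(nrow(combos))
```

Under that assumption the default SCAD grid yields 10 x 1 x 10 x 1 = 100 candidate fits, so widening either lambda vector increases the run time multiplicatively.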

epsilon

a positive numeric value used as a perturbation of the penalty function. By default, epsilon = 1e-08.

pen.thres.zero

a positive numeric value used as a threshold. After fitting the penalized PH cure model, the estimated regression coefficients with an absolute value lower than this threshold are set equal to zero. By default, pen.thres.zero = 1e-06.
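The thresholding step amounts to the following. The coefficient vector beta_hat is hypothetical and only the default threshold value is taken from this page.

```r
# Hypothetical penalized estimates: two are numerically negligible.
beta_hat <- c(z.1 = 0.8, z.2 = -3e-07, z.3 = 5e-07, z.4 = -0.4)
thres    <- 1e-06   # default value of pen.thres.zero

# Coefficients smaller than the threshold in absolute value are set to zero.
beta_hat[abs(beta_hat) < thres] <- 0

# The covariates that remain "selected" are those with non-zero coefficients.
selected <- names(beta_hat)[beta_hat != 0]

print(beta_hat)
print(selected)
```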

print.details

a logical value. If TRUE, tracing information on the progress of the routines is produced. By default print.details = TRUE.

warnings

a logical value. If TRUE, possible warnings from the NR and EM algorithms are produced. By default warnings = FALSE.

Details

When the starting values (SV) are not specified and pen.type == "none":

If pen.type is "SCAD" or "LASSO", both starting-value vectors (b and beta) are filled with zeros.

When performing variable selection (pen.type is "SCAD" or "LASSO"), a penalized PH cure model is fitted for each possible combination of the tuning parameters in pen.tuneGrid. Two models are selected on the basis of the Akaike and Bayesian Information Criteria:

AIC=-ln(\hat{L})+2df,

BIC=-ln(\hat{L})+ln(n)df,

where ln(\hat{L}) is the value of the log-likelihood at the penalized MLEs, df is the value of the degrees of freedom (number of non-zero coefficients) and n is the sample size.
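As a worked illustration, the sketch below evaluates both criteria exactly as written above for three hypothetical candidate fits and picks the minimizer of each. The log-likelihood values, degrees of freedom and sample size are invented for the example and do not come from the package.

```r
# Hypothetical candidate fits: one row per tuning-parameter value.
cand <- data.frame(lambda = c(0.001, 0.01, 0.1),
                   logLik = c(-240, -245, -260),
                   df     = c(6, 4, 2))      # number of non-zero coefficients
n <- 500                                     # assumed sample size

# Criteria as defined in the formulas above.
cand$AIC <- -cand$logLik + 2 * cand$df
cand$BIC <- -cand$logLik + log(n) * cand$df

best_aic <- cand[which.min(cand$AIC), ]
best_bic <- cand[which.min(cand$BIC), ]

print(cand)
print(best_aic$lambda)
print(best_bic$lambda)
```

Because ln(n) exceeds 2 for this sample size, BIC penalizes each extra coefficient more heavily and here prefers the sparser fit, which is why the two criteria can select different models.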

Regarding the possible tuning parameters in pen.tuneGrid, the numeric vectors lambda and a should contain values >= 0 and > 2, respectively.
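For reference, the sketch below implements the textbook forms of the two penalties and enforces the parameter ranges stated above. The function names scad_penalty and lasso_penalty are made up here, and the formula is the standard SCAD penalty of Fan and Li; the perturbed version controlled by epsilon that the package uses internally is not reproduced.

```r
# Standard SCAD penalty (textbook form), evaluated element-wise on theta.
scad_penalty <- function(theta, lambda, a = 3.7) {
  stopifnot(lambda >= 0, a > 2)              # ranges required for the tuning parameters
  t <- abs(theta)
  ifelse(t <= lambda,
         lambda * t,                                          # LASSO-like near zero
         ifelse(t <= a * lambda,
                (2 * a * lambda * t - t^2 - lambda^2) / (2 * (a - 1)),  # quadratic blend
                lambda^2 * (a + 1) / 2))                      # constant for large |theta|
}

# LASSO penalty for comparison.
lasso_penalty <- function(theta, lambda) {
  stopifnot(lambda >= 0)
  lambda * abs(theta)
}

theta <- c(0, 0.5, 2, 10)
scad_vals  <- scad_penalty(theta, lambda = 1)
lasso_vals <- lasso_penalty(theta, lambda = 1)

print(scad_vals)
print(lasso_vals)
```

Unlike the LASSO penalty, which keeps growing with the size of the coefficient, the SCAD penalty levels off once the coefficient exceeds a * lambda in absolute value, so large coefficients are not shrunk further.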

Value

If pen.type == "none", this function returns a PHcure.object. Otherwise, if pen.type is "SCAD" or "LASSO", it returns a penPHcure.object.

References

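Beretta, A. and Heuchenne, C. (2019). Variable selection in proportional hazards cure model with time-varying covariates, application to US bank failures. Journal of Applied Statistics, 46(9), 1529-1549.

Sy, J. P. and Taylor, J. M. G. (2000). Estimation in a Cox proportional hazards cure model. Biometrics, 56(1), 227-236.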

See Also

penPHcure-package, PHcure.object, penPHcure.object

Examples

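# Load the package (survival provides Surv(), used in the formulas below)
library(survival)
library(penPHcure)

# Generate some data (for more details type ?penPHcure.simulate in your console)
data <- penPHcure.simulate()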

### Standard PH cure model

# Fit standard cure model (without inference)
fit <- penPHcure(Surv(time = tstart,time2 = tstop,
                      event = status) ~ z.1 + z.2 + z.3 + z.4,
                 cureform = ~ x.1 + x.2 + x.3 + x.4,data = data)
# The returned PHcure.object has methods summary and predict, 
#  for more details type ?summary.PHcure or ?predict.PHcure in your console.

# Fit standard cure model (with inference)
fit2 <- penPHcure(Surv(time = tstart,time2 = tstop,
                       event = status) ~ z.1 + z.2 + z.3 + z.4,
                  cureform = ~ x.1 + x.2 + x.3 + x.4,data = data,
                  inference = TRUE)
# The returned PHcure.object has methods summary and predict, 
#  for more details type ?summary.PHcure or ?predict.PHcure in your console.


### Tune penalized cure model with SCAD penalties

# First define the grid of possible values for the tuning parameters.
pen.tuneGrid <- list(CURE = list(lambda = exp(seq(-7,-2,length.out = 10)),
                                 a = 3.7),
                     SURV = list(lambda = exp(seq(-7,-2,length.out = 10)),
                                 a = 3.7))
# Tune the penalty parameters.
tuneSCAD <- penPHcure(Surv(time = tstart,time2 = tstop,
                           event = status) ~ z.1 + z.2 + z.3 + z.4,
                      cureform = ~ x.1 + x.2 + x.3 + x.4,
                      data = data,pen.type = "SCAD",
                      pen.tuneGrid = pen.tuneGrid)
# The returned penPHcure.object has methods summary and predict, for more
#  details type ?summary.penPHcure or ?predict.penPHcure in your console.

### Tune penalized cure model with LASSO penalties

# First define the grid of possible values for the tuning parameters.
pen.tuneGrid <- list(CURE = list(lambda = exp(seq(-7,-2,length.out = 10))),
                     SURV = list(lambda = exp(seq(-7,-2,length.out = 10))))
# Tune the penalty parameters.
tuneLASSO <- penPHcure(Surv(time = tstart,time2 = tstop,
                            event = status) ~ z.1 + z.2 + z.3 + z.4,
                       cureform = ~ x.1 + x.2 + x.3 + x.4,
                       data = data,pen.type = "LASSO",
                       pen.tuneGrid = pen.tuneGrid)
# The returned penPHcure.object has methods summary and predict, for more
#  details type ?summary.penPHcure or ?predict.penPHcure in your console.

a-beretta/penPHcure documentation built on Dec. 3, 2019, 5:41 p.m.