cv.coxlasso: Cross-validation for coxlasso


cv.coxlasso    R Documentation

Cross-validation for coxlasso

Description

Performs k-fold cross-validation for coxlasso, produces a plot, and returns a value for the LASSO tuning parameter \xi.

Usage

cv.coxlasso(fix, rnd = NULL, vary.coef = NULL, n.folds = 10, xi = NULL,
            data, adaptive.weights = NULL, print.fold = TRUE, print.xi = FALSE,
            len.xi = 100, lgrid = TRUE, ran.seed = 1909, xi.factor = 1.01, min.fold = 4,
            pass.on.start = TRUE, control = list(print.iter = FALSE))

Arguments

fix

a two-sided linear formula object describing the fixed (time-constant) effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. The response must be a survival object as returned by the Surv function.

rnd

a two-sided linear formula object describing the random-effects part of the model, with the grouping factor on the left of a ~ operator and the random terms, separated by + operators, on the right. Default is NULL, so no random effects are present.

vary.coef

a one-sided linear formula object describing the time-varying effects part of the model, with the time-varying terms, separated by + operators, on the right side of a ~ operator. Default is NULL, so no time-varying effects are incorporated.

n.folds

number of folds. Default is 10.

xi

Optional user-supplied xi sequence. Default is NULL, in which case cv.coxlasso chooses its own sequence.

data

the data frame containing the variables named in the formula arguments fix, rnd and vary.coef.

adaptive.weights

a vector of adaptive weights for the LASSO-penalized fixed effects. If no adaptive weights are specified, an unpenalized model (i.e. \xi=0) is fitted by the coxFL function and the resulting estimates are used as adaptive weights (see value section).

print.fold

Logical; should the folds of the cross-validation be printed? Default is TRUE.

print.xi

Logical; should the current \xi value be printed? Default is FALSE.

len.xi

Length of \xi grid. Default is 100.

lgrid

Logical; shall a logarithmized grid version for the penalty parameter be used? Default is TRUE.

ran.seed

Random seed number to be set. Default is 1909, the year of birth of Borussia Dortmund football club.

xi.factor

A factor by which xi.max is increased once more to make sure that xi is large enough on all sets. Default is 1.01.

min.fold

Only those xi values are taken into account where at least min.fold folds are not NA. Default is 4.

pass.on.start

Logical; shall starting values be passed on throughout estimation? Default is TRUE.

control

a list of control values for the estimation algorithm to replace the default values returned by the function coxlassoControl. Default is list(print.iter = FALSE).
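
How len.xi, lgrid and xi.factor interact can be illustrated with a small base-R sketch. It is not the package's internal code: the helper make_xi_grid, the upper bound xi.max = 50 and the argument xi.min.ratio (the lower end of the logarithmic grid) are assumptions made only for illustration.

```r
# Hypothetical helper, not part of PenCoxFrail: builds a decreasing grid of
# candidate xi values from an assumed upper bound xi.max.
make_xi_grid <- function(xi.max, len.xi = 100, lgrid = TRUE,
                         xi.factor = 1.01, xi.min.ratio = 1e-4) {
  upper <- xi.max * xi.factor   # inflate the upper bound slightly
  if (lgrid) {
    # equally spaced on the log scale, so small xi values are covered densely
    exp(seq(log(upper), log(upper * xi.min.ratio), length.out = len.xi))
  } else {
    # equally spaced on the original scale, ending at xi = 0
    seq(upper, 0, length.out = len.xi)
  }
}

xi.log <- make_xi_grid(xi.max = 50)                 # logarithmic grid
xi.lin <- make_xi_grid(xi.max = 50, lgrid = FALSE)  # linear grid
range(xi.log)
range(xi.lin)
```

A grid built this way could be passed to cv.coxlasso through the xi argument instead of letting the function choose its own sequence.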

Details

The function runs coxlasso over a grid of values \xi for each training data set with one fold omitted.

For each run, the value of the full likelihood is calculated, and for each \xi on the grid the average over the folds is computed. The function chooses the \xi that maximizes this average likelihood as the optimal tuning parameter value.
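
The procedure described above can be sketched in base R on a toy problem. This is only an illustration of the fold loop, the averaging over folds and the role of min.fold, under stated assumptions: the function score_fold is a hypothetical stand-in for fitting coxlasso and evaluating the likelihood (it shrinks a sample mean and scores it by a held-out Gaussian log-likelihood), and the data and the xi grid are made up.

```r
set.seed(1909)                 # same value as the default of ran.seed
n <- 60
y <- rnorm(n, mean = 1)        # toy data; stands in for the survival data
n.folds <- 5
min.fold <- 4
xi.grid <- c(0, 0.1, 0.5, 1, 5)

# assign each observation to one of the folds at random
fold.id <- sample(rep(seq_len(n.folds), length.out = n))

# Hypothetical stand-in for "fit the penalized model on the training folds
# and evaluate the likelihood": a shrunken mean scored on the omitted fold.
score_fold <- function(train, test, xi) {
  mu.hat <- mean(train) / (1 + xi)
  sum(dnorm(test, mean = mu.hat, sd = 1, log = TRUE))
}

# matrix of scores: rows = folds, columns = xi values
loglik <- sapply(xi.grid, function(xi) {
  sapply(seq_len(n.folds), function(k) {
    score_fold(y[fold.id != k], y[fold.id == k], xi)
  })
})

# only keep xi values for which at least min.fold folds are not NA
enough <- colSums(!is.na(loglik)) >= min.fold
mean.loglik <- colMeans(loglik, na.rm = TRUE)
mean.loglik[!enough] <- NA

# the xi with the largest average likelihood is selected
xi.best <- xi.grid[which.max(mean.loglik)]
xi.best
```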

Value

The function returns a list "cv.coxlasso" which includes:

cv.error

a vector of mean CV error (i.e., negative likelihood) values for each \xi on the grid averaged over the folds.

xi.opt

a scalar value of \xi associated with the smallest CV error.

xi.1se

largest value of \xi such that error is within 1 standard error of the minimum.

The plot function plots the values of \xi against the corresponding CV error (i.e., negative likelihood) values.
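
How xi.opt and xi.1se relate to the CV error can be shown with a few made-up numbers. The sketch below is base R and does not reproduce the package's internals; the error matrix err, the grid xi.grid and the use of the standard error at the minimum are assumptions for illustration.

```r
# Made-up CV errors: rows = folds, columns = xi values
xi.grid <- c(0.5, 1, 2, 4, 8)
err <- rbind(c(10.2,  9.8,  9.9, 10.1, 11.5),
             c(10.6, 10.0, 10.1, 10.3, 11.9),
             c(10.1,  9.7,  9.9, 10.0, 11.4),
             c(10.5, 10.1, 10.0, 10.4, 12.0))

cv.error <- colMeans(err)                     # mean error per xi
cv.se <- apply(err, 2, sd) / sqrt(nrow(err))  # standard error per xi

# xi with the smallest mean CV error
i.min <- which.min(cv.error)
xi.opt <- xi.grid[i.min]

# one-standard-error rule: largest xi whose mean error is still within
# one standard error of the minimum
threshold <- cv.error[i.min] + cv.se[i.min]
xi.1se <- max(xi.grid[cv.error <= threshold])

c(xi.opt = xi.opt, xi.1se = xi.1se)  # 1 and 2 for these made-up numbers
```

Choosing xi.1se instead of xi.opt gives a more strongly penalized, and typically sparser, model at a small cost in CV error.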

Author(s)

Andreas Groll groll@statistik.tu-dortmund.de
Maike Hohberg mhohber@uni-goettingen.de

References

To appear soon.

See Also

coxlasso, coxlassoControl, coxFL, Surv, pbc

Examples

## Not run: 
library(PenCoxFrail)
library(survival)  # provides Surv() and the lung data
data(lung)

# remove NAs
lung <- lung[!is.na(lung$inst),]

# transform inst into factor variable
lung$inst <- as.factor(lung$inst)

# just for illustration, create factor with only three ph.ecog classes
lung$ph.ecog[is.na(lung$ph.ecog)] <- 2
lung$ph.ecog[lung$ph.ecog==3] <- 2
lung$ph.ecog <- as.factor(lung$ph.ecog)

fix.form <- as.formula("Surv(time, status) ~ 1 + age + ph.ecog + sex")

# find optimal tuning parameter
cv.coxlasso.obj <- cv.coxlasso(fix = fix.form, data = lung, n.folds = 5)

# estimate coxlasso model with optimal xi
lasso.obj <- coxlasso(fix=fix.form, data=lung, xi=cv.coxlasso.obj$xi.opt,
                control=list(print.iter=TRUE))

coef(lasso.obj)

# see also demo("coxlasso-lung")

## End(Not run)

PenCoxFrail documentation built on Sept. 11, 2024, 7:12 p.m.