# pencoxfrail: Regularization in Cox Frailty Models (package PenCoxFrail)

## Description

A regularization approach for Cox Frailty Models by penalization methods is provided.

## Usage

```
pencoxfrail(fix=formula, rnd=formula, vary.coef=formula, data, xi,
            adaptive.weights = NULL, control = list())
```

## Arguments

- `fix`: a two-sided linear formula object describing the unpenalized fixed (time-constant) effects part of the model, with the response on the left of a `~` operator and the terms, separated by `+` operators, on the right. The response must be a survival object as returned by the `Surv` function.
- `rnd`: a two-sided linear formula object describing the random-effects part of the model, with the grouping factor on the left of a `~` operator and the random terms, separated by `+` operators, on the right.
- `vary.coef`: a one-sided linear formula object describing the time-varying effects part of the model, with the time-varying terms, separated by `+` operators, on the right side of a `~` operator.
- `data`: the data frame containing the variables named in the three preceding `formula` arguments.
- `xi`: the overall penalty parameter that controls the strength of both penalty terms in ξ*J(ζ,α) and, hence, the overall amount of smoothness (up to constant effects) and variable selection for a given proportion ζ. The optimal penalty parameter is a tuning parameter of the procedure that has to be determined, e.g. by K-fold cross validation. (See details or the quick demo for an example.)
- `adaptive.weights`: a two-column matrix of adaptive weights passed to the procedure; the first column contains the weights w_k, the second column the weights v_k from ξ*J(ζ,α). If no adaptive weights are specified, all weights are set to one. The recommended strategy is to first fit an unpenalized model (i.e. ξ=0) and then use the obtained adaptive weights (see value section) when fitting the model for all other combinations of ξ and ζ.
- `control`: a list of control values for the estimation algorithm to replace the default values returned by the function `pencoxfrailControl`. Defaults to an empty list.

## Details

The `pencoxfrail` algorithm is designed to investigate the effect structure in the Cox frailty model, a widely used model that accounts for heterogeneity in survival data. Since survival models have to account for possible variation of the effect strength over time, the selection of the relevant features distinguishes between the following cases: covariates can have time-varying effects, can have time-constant effects, or can be irrelevant. For this purpose, the following specific penalty is applied to the vectors of B-spline coefficients α_k, assuming k=1,...,r different, potentially time-varying effects, each expanded in M B-spline basis functions:

ξ*J(ζ,α) = ξ * { ζ * ∑_k ψ * w_k * ||Δ_M*α_k||_2 + (1-ζ) * ∑_k φ* v_k * ||α_k||_2 }.

This penalty is able to distinguish between these types of effects to obtain a sparse representation that includes the relevant effects in a proper form.
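To make the three cases concrete, the following is a minimal sketch that evaluates ξ*J(ζ,α) in base R for a small, made-up coefficient matrix. The helper `penalty_value` is hypothetical and not part of the package. It assumes that Δ_M is the first-order difference operator (so that `diff()` applies) and, because the constants ψ and φ are not defined in this page, it exposes them as arguments with a placeholder default of 1.

```r
# Hypothetical helper (not part of the package): evaluates xi * J(zeta, alpha).
# alpha is an M x r matrix; column k holds the B-spline coefficients alpha_k.
# psi and phi default to 1 here, which is an assumption for illustration only.
penalty_value <- function(alpha, xi, zeta,
                          w = rep(1, ncol(alpha)), v = rep(1, ncol(alpha)),
                          psi = 1, phi = 1) {
  l2 <- function(x) sqrt(sum(x^2))
  # first term: L2 norm of adjacent coefficient differences (smoothness)
  smooth_term <- sum(psi * w * apply(alpha, 2, function(a) l2(diff(a))))
  # second term: L2 norm of the whole coefficient group (selection)
  select_term <- sum(phi * v * apply(alpha, 2, l2))
  xi * (zeta * smooth_term + (1 - zeta) * select_term)
}

# Made-up coefficients for M = 5 basis functions and r = 3 effects
alpha <- cbind(constant   = rep(0.5, 5),
               varying    = c(0, 0.1, 0.3, 0.6, 1.0),
               irrelevant = rep(0, 5))

pen_smooth <- penalty_value(alpha, xi = 1, zeta = 1)  # only the first term is active
pen_select <- penalty_value(alpha, xi = 1, zeta = 0)  # only the second term is active
```

With ζ=1 only the `varying` column contributes, because a constant effect has no adjacent differences. With ζ=0 both the `constant` and the `varying` column contribute, while the all-zero `irrelevant` column is never penalized.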

The penalty depends on two tuning parameters, ξ and ζ, which have to be determined by a suitable technique, e.g. by (2-dimensional) K-fold cross validation.
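A 2-dimensional K-fold search can be organised as in the sketch below. Everything in it is hypothetical scaffolding rather than package functionality: `cv_grid_search` is an illustrative helper, folds are assigned to rows at random (for clustered data, assigning folds differently may be more appropriate), and the model fit plus out-of-fold loss are delegated to a user-supplied function `fit_and_score(train, test, xi, zeta)`. A toy loss stands in for a real fit so that the sketch runs on its own.

```r
# Hypothetical helper: 2-D K-fold cross validation over a (xi, zeta) grid.
# fit_and_score(train, test, xi, zeta) must fit on `train` and return a
# numeric loss on `test` (smaller is better); it is supplied by the user.
cv_grid_search <- function(data, xi_grid, zeta_grid, K, fit_and_score) {
  fold <- sample(rep(seq_len(K), length.out = nrow(data)))  # random fold labels
  grid <- expand.grid(xi = xi_grid, zeta = zeta_grid)
  grid$loss <- vapply(seq_len(nrow(grid)), function(i) {
    mean(vapply(seq_len(K), function(k) {
      fit_and_score(data[fold != k, , drop = FALSE],
                    data[fold == k, , drop = FALSE],
                    grid$xi[i], grid$zeta[i])
    }, numeric(1)))
  }, numeric(1))
  grid[which.min(grid$loss), ]  # best (xi, zeta) combination
}

# Toy stand-in for a real model fit, minimised at xi = 2, zeta = 0.5
set.seed(1)
toy_data <- data.frame(x = 1:20)
toy_loss <- function(train, test, xi, zeta) (xi - 2)^2 + (zeta - 0.5)^2
best <- cv_grid_search(toy_data, xi_grid = c(0, 1, 2, 5),
                       zeta_grid = c(0.25, 0.5, 0.75), K = 5,
                       fit_and_score = toy_loss)
```

In a real application, `fit_and_score` would fit the model on the training folds for the given ξ and ζ and evaluate a suitable out-of-sample criterion on the held-out fold.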

The first term of the penalty controls the smoothness of the time-varying covariate effects, whereby for values of ξ and ζ large enough, all differences (α_k,l - α_k,l-1), l=2,... ,M, are removed from the model, resulting in constant covariate effects. As the B-splines of each variable with varying coefficients sum up to one, a constant effect is obtained if all spline coefficients are set equal. Hence, the first penalty term does not affect the spline's global level. The second term penalizes all spline coefficients belonging to a single time-varying effect in the way of a group LASSO and, hence, controls the selection of covariates.
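The sum-to-one property used in this argument can be checked numerically. The sketch below uses `bs()` from the `splines` package that ships with R; the cubic degree, the knot placement and the time range are assumptions chosen for illustration and need not match the basis that `pencoxfrail` builds internally.

```r
library(splines)

time <- seq(0, 10, length.out = 101)
M <- 6  # number of B-spline basis functions (illustrative choice)

# Cubic B-spline basis with M columns; intercept = TRUE keeps the full basis
B <- bs(time, df = M, intercept = TRUE)

# The basis functions sum to one at every time point
row_sums <- unname(rowSums(B))

# Equal spline coefficients therefore yield a time-constant effect
alpha_const <- rep(0.7, M)
effect <- as.vector(B %*% alpha_const)
range(effect)
```

Because every row of `B` sums to one, `effect` equals 0.7 at all time points, which is why shrinking all adjacent differences to zero leaves a constant effect at the spline's global level.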

| Field | Value |
|---|---|
| Package | pencoxfrail |
| Type | Package |
| Version | 1.0.1 |
| Date | 2016-05-06 |
| License | GPL-2 |
| LazyLoad | yes |

## Value

Generic functions such as `print`, `predict`, `plot` and `summary` have methods to show the results of the fit.

The `predict` function also uses estimates of the random effects for prediction where possible (i.e. for known subjects of the grouping factor). Either the survival step function or the baseline hazard (not cumulative!) can be calculated by specifying one of two possible methods: `method=c("hazard","survival")`. By default, an individual step function is calculated for each new subject in `new.data` on a pre-specified time grid, also accounting for covariate changes over time. Alternatively, `new.data` can be a single vector holding one specific (time-constant) covariate combination.

Usage:

```
predict(pencoxfrail.obj,new.data,time.grid,method=c("hazard","survival"))
```
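The relation between a (non-cumulative) hazard on a time grid and the corresponding survival step function can be sketched in base R as follows. This illustrates the general identity S(t) = exp(-H(t)) with the cumulative hazard H(t) approximated by a Riemann sum; it is not a description of how `predict` computes its output internally, and the constant hazard value is made up.

```r
time_grid <- seq(0, 500, by = 1)
hazard <- rep(0.002, length(time_grid))  # made-up constant hazard on the grid

# Grid spacing; the first point has no preceding interval
dt <- c(0, diff(time_grid))

# Riemann-sum approximation of the cumulative hazard H(t)
cum_hazard <- cumsum(hazard * dt)

# Survival step function S(t) = exp(-H(t))
survival <- exp(-cum_hazard)

plot(time_grid, survival, type = "s", xlab = "time", ylab = "survival")
```

For this constant hazard, the cumulative hazard at time 500 is 0.002 * 500 = 1, so the survival probability there is exp(-1).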

The `plot` function plots all time-varying effects, including the baseline hazard.

- `call`: a list containing an image of the `pencoxfrail` call that produced the object.
- `baseline`: a vector containing the estimated B-spline coefficients of the baseline hazard. If the covariates corresponding to the time-varying effects are centered (and standardized, see `pencoxfrailControl`), the coefficients are transformed back to the original scale.
- `time.vary`: a vector containing the estimated B-spline coefficients of all time-varying effects. If the covariates corresponding to the time-varying effects are standardized (see `pencoxfrailControl`), the coefficients are transformed back to the original scale.
- `coefficients`: a vector containing the estimated fixed effects.
- `ranef`: a vector containing the estimated random effects.
- `Q`: a scalar or matrix containing the estimates of the random effects standard deviation or variance-covariance parameters, respectively.
- `Delta`: a matrix containing the estimates of fixed and random effects (columns) for each iteration (rows) of the main algorithm (i.e. before the final re-estimation step is performed, see details).
- `Q_long`: a list containing the estimates of the random effects variance-covariance parameters for each iteration of the main algorithm.
- `iter`: number of iterations until the main algorithm has converged.
- `adaptive.weights`: if ξ=0, a two-column matrix of adaptive weights is calculated; the first column contains the weights w_k, the second column the weights v_k from ξ*J(ζ,α). If ξ>0, the adaptive weights that were passed in the function's argument are displayed.
- `knots`: vector of knots used in the B-spline representation.
- `Phi.big`: large B-spline design matrix corresponding to the baseline hazard and all time-varying effects. For the time-varying effects, the B-spline functions (as a function of time) have already been multiplied with their associated covariates.
- `time.grid`: the time grid used when approximating the (Riemann) integral involved in the model's full likelihood.
- `m`: number of metric covariates with time-varying effects.
- `m2`: number of categorical covariates with time-varying effects.

## Author(s)

Andreas Groll

## References

Groll, A., T. Hastie and G. Tutz (2016). Regularization in Cox Frailty Models. Ludwig-Maximilians-University. Technical Report 191.

## See Also

`pencoxfrailControl`, `Surv`, `pbc`

## Examples

```
## Not run: 
library(survival)     # provides Surv() and the lung data
library(PenCoxFrail)  # provides pencoxfrail()

data(lung)

# remove NAs
lung <- lung[!is.na(lung$inst),]

# transform inst into factor variable
lung$inst <- as.factor(lung$inst)

# Random institutional effect
fix.form <- as.formula("Surv(time, status) ~ 1")
vary.coef <- as.formula("~ age")

pen.obj <- pencoxfrail(fix=fix.form, vary.coef=vary.coef, rnd = list(inst=~1),
                       data=lung, xi=10, control=list(print.iter=TRUE))

# show fit
plot(pen.obj)

# predict survival curve of new subject, institution 1 and up to time 500
pred.obj <- predict(pen.obj, newdata=data.frame(inst=1, time=NA, status=NA, age=26),
                    time.grid=seq(0,500,by=1))

# plot predicted hazard function
plot(pred.obj$time.grid, pred.obj$haz, type="l", xlab="time", ylab="hazard")

# plot predicted survival function
plot(pred.obj$time.grid, pred.obj$survival, type="l", xlab="time", ylab="survival")

# see also demo("pencoxfrail-pbc")

## End(Not run)
```