penalized: Penalized regression

In luederm/penalizedcpp: L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model

Description

Fitting generalized linear models with L1 (lasso and fused lasso) and/or L2 (ridge) penalties, or a combination of the two.

Usage

```r
penalized(response, penalized, unpenalized, lambda1 = 0, lambda2 = 0,
  positive = FALSE, data, fusedl = FALSE,
  model = c("cox", "logistic", "linear", "poisson"),
  startbeta, startgamma, steps = 1, epsilon = 1e-10, maxiter,
  standardize = FALSE, trace = TRUE)
```

Arguments

`response` The response variable (vector). This should be a numeric vector for linear regression, a `Surv` object for Cox regression, and a `factor` or a vector of 0/1 values for logistic regression.

`penalized` The penalized covariates. These may be specified either as a matrix or as a (one-sided) `formula` object. See also under `data`.

`unpenalized` Additional unpenalized covariates. Specified as under `penalized`. Note that an unpenalized intercept is included in the model by default (except in the Cox model). This can be suppressed by specifying `unpenalized = ~0`.

`lambda1, lambda2` The tuning parameters for L1 and L2 penalization. Each must be either a single positive number or a vector with length equal to the number of covariates in the `penalized` argument. In the latter case, each covariate is given its own penalty weight.

`positive` If `TRUE`, constrains the estimated regression coefficients of all penalized covariates to be non-negative. If a logical vector with length equal to the number of covariates in `penalized`, constrains only the coefficients of the covariates marked `TRUE` to be non-negative.

`data` A `data.frame` used to evaluate `response` and the terms of `penalized` or `unpenalized` when these have been specified as a `formula` object.

`fusedl` If `TRUE` or a vector, the penalization method used is the fused lasso. The value of `lambda1` is used as the tuning parameter for the L1 penalty on the coefficients, and the value of `lambda2` is used as the tuning parameter for the L1 penalty on the differences of the coefficients. Default value is `FALSE`.

`model` The model to be used. If missing, the model will be guessed from the `response` input.

`startbeta` Starting values for the regression coefficients of the penalized covariates.

`startgamma` Starting values for the regression coefficients of the unpenalized covariates.

`steps` If greater than 1, the algorithm fits the model for a sequence of `steps` values of `lambda1`, starting from the maximal value and decreasing to the value of `lambda1` specified. This is useful for making plots as in `plotpath`. With `steps = "Park"`, the steps are chosen to lie at the approximate values at which the active set changes, following Park and Hastie (2007).

`epsilon` The convergence criterion. As in `glm`. Convergence is judged separately on the likelihood and on the penalty.

`maxiter` The maximum number of iterations allowed. Set by default at 25 when only an L2 penalty is present, infinite otherwise.

`standardize` If `TRUE`, standardizes all penalized covariates to unit central L2-norm before applying penalization.

`trace` If `TRUE`, prints progress information. Note that setting `trace = TRUE` may slow down the algorithm by up to 30 percent (but it often feels quicker).
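The per-covariate forms of `lambda1` and `positive` described above can be illustrated with a small simulated example. This is only a sketch: it assumes the package is installed and attaches under the name `penalized`, and the data, variable names, and penalty values are invented for illustration.

```r
# Sketch only: assumes the package attaches as `penalized`.
library(penalized)

# Simulated data (hypothetical, for illustration only)
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
y <- 1 + 2 * X[, "x1"] - X[, "x2"] + rnorm(n)

fit <- penalized(
  y, penalized = X,
  lambda1 = c(1, 5, 10),             # one L1 weight per column of X
  lambda2 = 0.5,                     # a single L2 weight shared by all columns
  positive = c(TRUE, FALSE, FALSE),  # only x1 is constrained to be non-negative
  model = "linear",
  trace = FALSE
)

# All coefficients, including the unpenalized intercept and any zeros
coefficients(fit, "all")
```

Passing `model = "linear"` explicitly avoids relying on the model being guessed from `response`.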

Details

The `penalized` function fits regression models for a given combination of L1 and L2 penalty parameters.
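As a rough guide to what "a combination of L1 and L2 penalties" means, the criterion being maximized can be written as a penalized log-likelihood. The notation below is an assumption made for illustration: $\ell$ is the log-likelihood (partial log-likelihood in the Cox model), $\beta$ holds the penalized coefficients, $\gamma$ the unpenalized ones, and $c$ is a scaling constant for the L2 term that this page does not specify.

$$
\ell_{\text{pen}}(\beta, \gamma) = \ell(\beta, \gamma) - \sum_{i} \lambda_{1,i}\,|\beta_i| - c \sum_{i} \lambda_{2,i}\,\beta_i^2
$$

With `fusedl = TRUE`, `lambda2` instead weights an L1 penalty on differences of coefficients. Assuming the differences are taken between adjacent coefficients, as in the standard fused lasso, this reads:

$$
\ell_{\text{pen}}(\beta, \gamma) = \ell(\beta, \gamma) - \lambda_1 \sum_{i} |\beta_i| - \lambda_2 \sum_{i \ge 2} |\beta_i - \beta_{i-1}|
$$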

Value

`penalized` returns a `penfit` object when `steps = 1` or a list of such objects if `steps > 1`.
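Because the return type depends on `steps`, code that consumes the result may need to handle both cases. The sketch below uses simulated data and assumes the package attaches as `penalized`; the variable names are hypothetical.

```r
# Sketch only: assumes the package attaches as `penalized`.
library(penalized)

# Simulated data (hypothetical, for illustration only)
set.seed(2)
n <- 40
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- X[, 1] - X[, 2] + rnorm(n)

# steps = 1 (the default): a single penfit object
single <- penalized(y, penalized = X, lambda1 = 1,
                    model = "linear", trace = FALSE)
inherits(single, "penfit")

# steps > 1: a list of penfit objects along the lambda1 path
path <- penalized(y, penalized = X, lambda1 = 1, steps = 5,
                  model = "linear", trace = FALSE)
is.list(path)

# Number of nonzero penalized coefficients in each fit on the path
sapply(path, function(f) sum(coefficients(f, "penalized") != 0))
```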

Note

The `response` argument of the function also accepts formula input as in `lm` and related functions. In that case, the right hand side of the `response` formula is used as the `penalized` argument or, if that is already given, as the `unpenalized` argument. For example, the input `penalized(y~x)` is equivalent to `penalized(y, ~x)` and `penalized(y~x, ~z)` to `penalized(y, ~z, ~x)`.
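The equivalences stated above can be checked directly. This is a sketch on simulated data, assuming the package attaches as `penalized`; the data frame `d` and its columns are invented for illustration.

```r
# Sketch only: assumes the package attaches as `penalized`.
library(penalized)

# Simulated data (hypothetical, for illustration only)
set.seed(3)
d <- data.frame(x = rnorm(30), z = rnorm(30))
d$y <- 1 + d$x + 0.5 * d$z + rnorm(30)

# x penalized, written with and without a response formula
a <- penalized(y ~ x, data = d, lambda1 = 1, trace = FALSE)
b <- penalized(y, ~ x, data = d, lambda1 = 1, trace = FALSE)
all.equal(coefficients(a, "all"), coefficients(b, "all"))

# z penalized and x unpenalized, written both ways
c1 <- penalized(y ~ x, ~ z, data = d, lambda1 = 1, trace = FALSE)
c2 <- penalized(y, ~ z, ~ x, data = d, lambda1 = 1, trace = FALSE)
all.equal(coefficients(c1, "all"), coefficients(c2, "all"))
```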

In case of tied survival times, the function uses Breslow's version of the partial likelihood.
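For reference, Breslow's approximation to the partial likelihood under ties has the following standard textbook form. The notation is an assumption made here, not taken from the package: $t_1, \dots, t_J$ are the distinct event times, $d_j$ is the number of events at $t_j$, $s_j$ is the sum of the covariate vectors of the subjects with an event at $t_j$, and $R_j$ is the risk set at $t_j$.

$$
L(\beta) = \prod_{j=1}^{J} \frac{\exp\!\left(\beta^\top s_j\right)}{\left[\sum_{i \in R_j} \exp\!\left(\beta^\top x_i\right)\right]^{d_j}}
$$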

Author(s)

Jelle Goeman

References

Goeman J.J. (2010). L-1 Penalized Estimation in the Cox Proportional Hazards Model. Biometrical Journal 52 (1) 70-84.

See Also

`penfit` for the `penfit` object returned, `plotpath` for plotting the solution path, and `cvl` for cross-validation and optimizing the tuning parameters.

Examples

```r
# More examples in the package vignette:
#  type vignette("penalized")

data(nki70)

# A single lasso fit predicting survival
pen <- penalized(Surv(time, event), penalized = nki70[, 8:77],
    unpenalized = ~ER+Age+Diam+N+Grade, data = nki70, lambda1 = 10)
show(pen)
coefficients(pen)
coefficients(pen, "penalized")
basehaz(pen)

# A single lasso fit using the clinical risk factors
pen <- penalized(Surv(time, event), penalized = ~ER+Age+Diam+N+Grade,
    data = nki70, lambda1 = 10, standardize = TRUE)

# using steps
pen <- penalized(Surv(time, event), penalized = nki70[, 8:77],
    data = nki70, lambda1 = 1, steps = 20)
plotpath(pen)

# A fused lasso fit predicting survival
pen <- penalized(Surv(time, event), penalized = nki70[, 8:77], data = nki70,
    lambda1 = 1, lambda2 = 2, fusedl = TRUE)
plot(coefficients(pen, "all"), type = "l", xlab = "probes",
    ylab = "coefficient value")
plot(predict(pen, penalized = nki70[, 8:77]))
```