penalized: Penalized regression
In penalized: L1 (Lasso and Fused Lasso) and L2 (Ridge) Penalized Estimation in GLMs and in the Cox Model

Penalized generalized linear models

R Documentation

Penalized regression

Description

Fitting generalized linear models with L1 (lasso and fused lasso) and/or L2 (ridge) penalties, or a combination of the two.

Usage


penalized(response, penalized, unpenalized, lambda1 = 0,
  lambda2 = 0, positive = FALSE, data, fusedl = FALSE,
  model = c("cox", "logistic", "linear", "poisson"),
  startbeta, startgamma, steps = 1, epsilon = 1e-10,
  maxiter, standardize = FALSE, trace = TRUE)

Arguments

`response`	The response variable (vector). This should be a numeric vector for linear regression, a `Surv` object for Cox regression and `factor` or a vector of 0/1 values for logistic regression.
`penalized`	The penalized covariates. These may be specified either as a matrix or as a (one-sided) `formula` object. See also under `data`.
`unpenalized`	Additional unpenalized covariates. Specified as under `penalized`. Note that an unpenalized intercept is included in the model by default (except in the Cox model). This can be suppressed by specifying `unpenalized = ~0`.
`lambda1, lambda2`	The tuning parameters for L1 and L2 penalization. Each must be either a single positive number or a vector with length equal to the number of covariates in the `penalized` argument. In the latter case, each covariate is given its own penalty weight.
`positive`	If `TRUE`, constrains the estimated regression coefficients of all penalized covariates to be non-negative. If a logical vector with the length of the number of covariates in `penalized`, constrains the estimated regression coefficients of a subset of the penalized covariates to be non-negative.
`data`	A `data.frame` used to evaluate `response`, and the terms of `penalized` or `unpenalized` when these have been specified as a `formula` object.
`fusedl`	If `TRUE` or a vector, the penalization method used is fused lasso. The value for `lambda1` is used as the tuning parameter for L1 penalization on the coefficients and the value for `lambda2` is used as the tuning parameter for L1 penalization on the differences of the coefficients. Default value is `FALSE`.
`model`	The model to be used. If missing, the model will be guessed from the `response` input.
`startbeta`	Starting values for the regression coefficients of the penalized covariates.
`startgamma`	Starting values for the regression coefficients of the unpenalized covariates.
`steps`	If greater than 1, the algorithm will fit the model for a range of `steps` `lambda1`-values, starting from the maximal value down to the value of `lambda1` specified. This is useful for making plots as in `plotpath`. With `steps = "Park"`, the steps are chosen to lie at approximately the values at which the active set changes, following Park and Hastie (2007).
`epsilon`	The convergence criterion, as in `glm`. Convergence is judged separately on the likelihood and on the penalty.
`maxiter`	The maximum number of iterations allowed. Set by default at 25 when only an L2 penalty is present, infinite otherwise.
`standardize`	If `TRUE`, standardizes all penalized covariates to unit central L2-norm before applying penalization.
`trace`	If `TRUE`, prints progress information. Note that setting `trace = TRUE` may slow down the algorithm by up to 30 percent (but it often feels quicker).
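
The roles of `lambda1`, `lambda2` and `fusedl` can be illustrated with a small base R sketch that evaluates only the penalty terms. The helper names `lasso_ridge_penalty` and `fused_lasso_penalty` are hypothetical and not part of the package, and the scaling constant applied to the L2 term is an assumption that may differ from the package's internal convention.

```r
# Hypothetical helpers (not part of the penalized package) that evaluate
# the penalty terms described above for a coefficient vector beta.

# L1 plus L2 penalty. lambda1 and lambda2 may be single numbers or vectors
# with one entry per coefficient; R's recycling handles both cases.
# Assumption: the L2 term carries no extra scaling constant.
lasso_ridge_penalty <- function(beta, lambda1 = 0, lambda2 = 0) {
  sum(lambda1 * abs(beta)) + sum(lambda2 * beta^2)
}

# Fused lasso penalty: lambda1 acts on the coefficients themselves,
# lambda2 on the differences between neighbouring coefficients.
fused_lasso_penalty <- function(beta, lambda1 = 0, lambda2 = 0) {
  lambda1 * sum(abs(beta)) + lambda2 * sum(abs(diff(beta)))
}

beta <- c(0.5, -1, 0, 2)

# One common penalty weight for all covariates
p_scalar <- lasso_ridge_penalty(beta, lambda1 = 2, lambda2 = 1)

# A separate L1 weight per covariate
p_vector <- lasso_ridge_penalty(beta, lambda1 = c(1, 1, 1, 3))

# Fused lasso: L1 on the coefficients and L1 on their differences
p_fused <- fused_lasso_penalty(beta, lambda1 = 1, lambda2 = 2)
```

With a vector-valued `lambda1`, the fourth coefficient in this sketch contributes three times as much per unit of absolute value as the others, which is what "its own penalty weight" amounts to.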

Details

The `penalized` function fits regression models for a given combination of L1 and L2 penalty parameters.

Value

`penalized` returns a `penfit` object when `steps = 1` or a list of such objects if `steps > 1`.

Note

The `response` argument of the function also accepts formula input as in `lm` and related functions. In that case, the right-hand side of the response formula is used as the `penalized` argument or, if that is already given, as the `unpenalized` argument. For example, the input `penalized(y~x)` is equivalent to `penalized(y, ~x)`, and `penalized(y~x, ~z)` is equivalent to `penalized(y, ~z, ~x)`.
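
The equivalence between two-sided and one-sided formula input can be pictured with base R alone. The following is only a sketch of how a two-sided formula splits into a response and a one-sided formula; the variable names and data are made up for illustration, and the package's internal handling of formula input may differ.

```r
# Illustration in base R only; not the package's own code.
f <- y ~ x1 + x2

response_part <- f[[2]]   # left-hand side: the symbol y
rhs_part <- f[-2]         # one-sided formula: ~x1 + x2

# A one-sided formula can be expanded into a covariate matrix, the other
# form in which covariates may be supplied. The data frame is made up.
d <- data.frame(y = c(1, 2, 3), x1 = c(0, 1, 0), x2 = c(5, 3, 1))
X <- model.matrix(rhs_part, data = d)
```

Here `model.matrix` also produces an intercept column, which is shown only because it is the base R default.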

In case of tied survival times, the function uses Breslow's version of the partial likelihood.
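
As a rough illustration of what Breslow's version means, the sketch below evaluates the unpenalized Breslow partial log-likelihood in base R. The function `breslow_loglik` is a hypothetical helper written for this note, not a function of the package, and it makes no attempt to match the package's implementation.

```r
# Hypothetical helper (not part of the penalized package): Breslow partial
# log-likelihood for a vector of linear predictors eta.
breslow_loglik <- function(time, status, eta) {
  event_times <- unique(time[status == 1])
  contributions <- vapply(event_times, function(tt) {
    at_risk <- time >= tt                 # risk set at time tt
    events  <- time == tt & status == 1   # subjects with an event at tt
    d <- sum(events)                      # number of tied events
    # Breslow: all d tied events share the same, full risk set
    sum(eta[events]) - d * log(sum(exp(eta[at_risk])))
  }, numeric(1))
  sum(contributions)
}

# Two tied events at time 1, one event at time 2, one censored at time 3
time   <- c(1, 1, 2, 3)
status <- c(1, 1, 1, 0)
ll0 <- breslow_loglik(time, status, eta = rep(0, 4))
```

With all linear predictors equal to zero, the two tied events each contribute `-log(4)` and the event at time 2 contributes `-log(2)`, so `ll0` equals `-5 * log(2)`.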

Author(s)

Jelle Goeman: j.j.goeman@lumc.nl

References

Goeman J.J. (2010). L-1 Penalized Estimation in the Cox Proportional Hazards Model. Biometrical Journal 52 (1) 70-84.

Examples

# More examples in the package vignette:
#  type vignette("penalized")

# Load the package that provides penalized() and the nki70 data
library(penalized)
data(nki70)

# A single lasso fit predicting survival
pen <- penalized(Surv(time, event), penalized = nki70[,8:77],
    unpenalized = ~ER+Age+Diam+N+Grade, data = nki70, lambda1 = 10)
show(pen)
coefficients(pen)
coefficients(pen, "penalized")
basehaz(pen)

# A single lasso fit using the clinical risk factors
pen <- penalized(Surv(time, event), penalized = ~ER+Age+Diam+N+Grade,
    data = nki70, lambda1=10, standardize=TRUE)

# A path of lasso fits using steps
pen <- penalized(Surv(time, event), penalized = nki70[,8:77],
    data = nki70, lambda1 = 1, steps = 20)
plotpath(pen)


# A fused lasso fit predicting survival
pen <- penalized(Surv(time, event), penalized = nki70[,8:77], data = nki70, 
     lambda1 = 1, lambda2 = 2, fusedl = TRUE)
plot(coefficients(pen, "all"), type = "l", xlab = "probes",
    ylab = "coefficient value")
plot(predict(pen, penalized = nki70[,8:77]))

penalized documentation built on April 23, 2022, 5:05 p.m.