grpnet: fit a GLM with group lasso or group elastic-net...

View source: R/solver.R

grpnet R Documentation

fit a GLM with group lasso or group elastic-net regularization

Description

Computes a group elastic-net regularization path for a variety of GLM and other families, including the Cox model. This function extends the abilities of the glmnet package to allow for grouped regularization. The core routines are written in C++ for efficiency, and the function accepts specialized matrix classes.

Usage

grpnet(
  X,
  glm,
  constraints = NULL,
  groups = NULL,
  alpha = 1,
  penalty = NULL,
  offsets = NULL,
  lambda = NULL,
  standardize = TRUE,
  irls_max_iters = as.integer(10000),
  irls_tol = 1e-07,
  max_iters = as.integer(1e+05),
  tol = 1e-07,
  adev_tol = 0.9,
  ddev_tol = 0,
  newton_tol = 1e-12,
  newton_max_iters = 1000,
  n_threads = 1,
  early_exit = TRUE,
  intercept = TRUE,
  screen_rule = c("pivot", "strong"),
  min_ratio = 0.01,
  lmda_path_size = 100,
  max_screen_size = NULL,
  max_active_size = NULL,
  pivot_subset_ratio = 0.1,
  pivot_subset_min = 1,
  pivot_slack_ratio = 1.25,
  check_state = FALSE,
  progress_bar = FALSE,
  warm_start = NULL
)

Arguments

X

Feature matrix. Either a regular R matrix, an adelie custom matrix class, or a concatenation of such.

glm

GLM family/response object. This is an expression that represents the family, the response, and other arguments such as weights, if present. The choices are glm.gaussian(), glm.binomial(), glm.poisson(), glm.multinomial(), glm.cox(), and glm.multigaussian(). This is a required argument, and there is no default. In the simple example below, we use glm.gaussian(y).

constraints

Constraints on the parameters. Currently these are ignored.

groups

This is an ordered vector of integers that represents the groupings, with each entry indicating where a group begins. The entries refer to column numbers in the feature matrix. If there are p features, the default is 1:p (no groups). (Note that in the output of grpnet this vector might be shifted to start from 0, since internally adelie uses zero-based indexing.)
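As a rough illustration of this encoding (not code from the package), the base R sketch below converts a vector of group starting columns into group sizes and a per-column membership vector. The values of p and groups are made up for the example.

```r
# Hypothetical setup: 10 features split into three contiguous groups.
p <- 10L
groups <- c(1L, 4L, 6L)   # groups start at columns 1, 4 and 6

# Each group runs from its start up to the column before the next start.
group_sizes <- diff(c(groups, p + 1L))

# Map every column of the feature matrix to the index of its group.
membership <- rep(seq_along(groups), times = group_sizes)

print(group_sizes)
print(membership)
```

With the default groups = 1:p, the same computation gives a size of one for every group.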

alpha

The elastic-net mixing parameter, with 0\le\alpha\le 1. The penalty is defined as

(1-\alpha)/2\sum_j||\beta_j||_2^2+\alpha\sum_j||\beta_j||_2,

where the sum is over groups. alpha=1 is the pure group lasso penalty, and alpha=0 the pure ridge penalty.
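To make the formula concrete, here is a minimal base R sketch that evaluates the penalty for a given coefficient vector. The function name group_enet_penalty is hypothetical and not part of adelie; the sketch also ignores lambda and the per-group penalty factors.

```r
# Hypothetical helper: evaluates
#   (1 - alpha)/2 * sum_j ||beta_j||_2^2 + alpha * sum_j ||beta_j||_2
# where `groups` holds the (1-based) starting index of each group.
group_enet_penalty <- function(beta, groups, alpha = 1) {
  sizes <- diff(c(groups, length(beta) + 1L))
  membership <- rep(seq_along(groups), times = sizes)
  # Euclidean norm of each group's block of coefficients.
  norms <- sqrt(as.numeric(tapply(beta^2, membership, sum)))
  (1 - alpha) / 2 * sum(norms^2) + alpha * sum(norms)
}

beta <- c(3, 4, 0, 0)        # two groups of two coefficients each
groups <- c(1L, 3L)

lasso_part <- group_enet_penalty(beta, groups, alpha = 1)   # group lasso only
ridge_part <- group_enet_penalty(beta, groups, alpha = 0)   # ridge only
mixed      <- group_enet_penalty(beta, groups, alpha = 0.5)
```

Here the first group has norm 5 and the second has norm 0, so the group lasso term is 5 and the ridge term is 25/2.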

penalty

Separate penalty factors can be applied to each group of coefficients. Each factor is a number that multiplies lambda to allow differential shrinkage across groups. A factor can be 0 for some groups, which implies no shrinkage, so that group is always included in the model. The default is the square root of the group size for each group.
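The following base R sketch shows how such factors could be built by hand. The group sizes and the lambda value are invented for illustration, and effective_lambda is only a name used here, not an adelie object.

```r
# Hypothetical group sizes for three groups.
group_sizes <- c(3L, 2L, 5L)

# Default described above: square root of each group's size.
penalty <- sqrt(group_sizes)

# Leave the first group unpenalized so it always stays in the model.
penalty[1] <- 0

# Each factor multiplies lambda, giving a per-group amount of shrinkage.
lambda <- 0.5
effective_lambda <- lambda * penalty
print(effective_lambda)
```

A vector like penalty would then be supplied through the penalty argument, with one entry per group.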

offsets

Offsets, default is NULL. If present, this is a fixed vector or matrix corresponding to the shape of the natural parameter, and is added to the fit.

lambda

A user supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on lmda_path_size and min_ratio.

standardize

If TRUE (the default), the columns of X are standardized before the fit is computed. This is good practice if the features are a mixed bag, because it has an impact on the penalty. The regularization path is computed using the standardized features, and the standardization information is saved on the object for making future predictions.
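The sketch below illustrates the general idea in base R: features on very different scales are centered and scaled, and the saved centers and scales are reused on new data. It uses scale(), which divides by the sample standard deviation; the exact centering and scaling convention used inside grpnet is an assumption here and may differ.

```r
set.seed(1)
# Two features on very different scales.
X <- cbind(rnorm(50, sd = 1), rnorm(50, mean = 10, sd = 100))

# Center and scale each column; scale() keeps the information as attributes.
Xs <- scale(X)
centers <- attr(Xs, "scaled:center")
scales  <- attr(Xs, "scaled:scale")

# New observations must be transformed with the *saved* centers and scales.
X_new <- matrix(c(0, 10), nrow = 1)
X_new_s <- sweep(sweep(X_new, 2, centers, "-"), 2, scales, "/")
print(X_new_s)
```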

irls_max_iters

Maximum number of IRLS iterations, default is 1e4.

irls_tol

IRLS convergence tolerance, default is 1e-7.

max_iters

Maximum total number of coordinate descent iterations, default is 1e5.

tol

Coordinate descent convergence tolerance, default 1e-7.

adev_tol

Fraction deviance explained tolerance, default 0.9. This can be seen as a limit on overfitting the training data.

ddev_tol

Difference in fraction deviance explained tolerance, default 0. If a step in the path changes the deviance by this amount or less, the algorithm truncates the path.
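One way to picture how adev_tol and ddev_tol could interact is the base R sketch below. It is an illustration under assumed stopping logic, not the package's actual implementation; path_stop_index and the deviance values are hypothetical.

```r
# Hypothetical helper: given the fraction of deviance explained at each
# lambda along the path, return the index at which the path would stop.
path_stop_index <- function(dev_explained, adev_tol = 0.9, ddev_tol = 0) {
  for (k in seq_along(dev_explained)) {
    # Stop once enough deviance is explained (guards against overfitting).
    if (dev_explained[k] >= adev_tol) return(k)
    # Stop when a step changes the deviance explained by ddev_tol or less.
    if (k > 1 && abs(dev_explained[k] - dev_explained[k - 1]) <= ddev_tol) {
      return(k)
    }
  }
  length(dev_explained)
}

dev_explained <- c(0, 0.3, 0.6, 0.85, 0.92, 0.95)  # made-up path
stop_default <- path_stop_index(dev_explained)
stop_ddev    <- path_stop_index(dev_explained, adev_tol = 0.99, ddev_tol = 0.26)
```

With the defaults, this toy path stops at the first index where the deviance explained reaches 0.9; with a large ddev_tol it stops earlier, at the first step whose improvement is small enough.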

newton_tol

Convergence tolerance for the BCD update, default 1e-12. This parameter controls the iterations in each block-coordinate step to establish the block solution.

newton_max_iters

Maximum number of iterations for the BCD update, default 1000.

n_threads

Number of threads, default 1.

early_exit

If TRUE (the default), the function is allowed to exit early, before the full lambda path has been computed.

intercept

Default TRUE to include an unpenalized intercept.

screen_rule

Screening rule, either "pivot" (the default) or "strong". The "pivot" rule is an empirical improvement over "strong".

min_ratio

Ratio between smallest and largest value of lambda. Default is 1e-2.

lmda_path_size

Number of values for lambda, if generated automatically. Default is 100.
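As a sketch of how min_ratio and lmda_path_size shape an automatically generated path, the base R code below builds a decreasing sequence from a largest value down to min_ratio times that value. The log-spaced grid and the value of lambda_max are assumptions for illustration; the package determines the largest lambda from the data.

```r
# Hypothetical largest lambda; in practice this comes from the data.
lambda_max <- 2.5
min_ratio <- 0.01
lmda_path_size <- 100

# Assumed log-spaced grid from lambda_max down to lambda_max * min_ratio.
lambda_seq <- lambda_max * min_ratio^seq(0, 1, length.out = lmda_path_size)

print(range(lambda_seq))
print(length(lambda_seq))
```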

max_screen_size

Maximum number of screen groups. Default is NULL.

max_active_size

Maximum number of active groups. Default is NULL.

pivot_subset_ratio

Subset ratio of pivot rule. Default is 0.1. Users are not expected to fiddle with this.

pivot_subset_min

Minimum subset of pivot rule. Default is 1. Users are not expected to fiddle with this.

pivot_slack_ratio

Slack ratio of pivot rule. Default is 1.25. Users are not expected to fiddle with this. See the references for details.

check_state

Check state. Internal parameter, with default FALSE.

progress_bar

Progress bar. Default is FALSE.

warm_start

Warm start (default is NULL). Internal parameter.

Value

A list of class "grpnet". This has a main component called state which represents the fitted path, and a few extra useful components such as the call, the family name, and group_sizes. Users typically use methods like predict(), print(), and plot() to examine the object.

Author(s)

James Yang, Trevor Hastie, and Balasubramanian Narasimhan
Maintainer: Trevor Hastie <hastie@stanford.edu>

References

Yang, James and Hastie, Trevor. (2024) A Fast and Scalable Pathwise-Solver for Group Lasso and Elastic Net Penalized Regression via Block-Coordinate Descent. arXiv, doi:10.48550/arXiv.2405.08631.
Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22, doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, doi:10.18637/jss.v039.i05.
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J. and Tibshirani, Ryan. (2012) Strong Rules for Discarding Predictors in Lasso-type Problems, JRSSB, Vol. 74(2), 245-266, https://arxiv.org/abs/1011.2234.

See Also

cv.grpnet, predict.grpnet, plot.grpnet, print.grpnet.

Examples

set.seed(0)
n <- 100
p <- 200
X <- matrix(rnorm(n * p), n, p)
y <- X[,1] * rnorm(1) + rnorm(n)
fit <- grpnet(X, glm.gaussian(y))
print(fit)
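A grouped variant of the example above might look like the following sketch. It only uses arguments listed in the Usage section (groups and alpha); the choice of 40 groups of 5 consecutive columns and alpha = 0.5 is arbitrary, and the fit is guarded so the snippet still runs when adelie is not installed.

```r
set.seed(0)
n <- 100
p <- 200
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] * rnorm(1) + rnorm(n)

# Arbitrary grouping: every 5 consecutive columns form one group.
groups <- seq(1, p, by = 5)
group_sizes <- diff(c(groups, p + 1))

# Fit a group elastic-net path only if the adelie package is available.
if (requireNamespace("adelie", quietly = TRUE)) {
  fit_grouped <- adelie::grpnet(X, adelie::glm.gaussian(y),
                                groups = groups, alpha = 0.5)
  print(fit_grouped)
}
```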


adelie documentation built on Sept. 11, 2024, 6:36 p.m.