grpnet: fit a GLM with group lasso or group elastic-net...
In adelie: Group Lasso and Elastic Net Solver for Generalized Linear Models

grpnet

R Documentation

fit a GLM with group lasso or group elastic-net regularization

Description

Computes a group elastic-net regularization path for a variety of GLM and other families, including the Cox model. This function extends the abilities of the glmnet package to allow for grouped regularization. The code is very efficient (core routines are written in C++), and allows for specialized matrix classes.

Usage

grpnet(
  X,
  glm,
  constraints = NULL,
  groups = NULL,
  alpha = 1,
  penalty = NULL,
  offsets = NULL,
  lambda = NULL,
  standardize = TRUE,
  irls_max_iters = as.integer(10000),
  irls_tol = 1e-07,
  max_iters = as.integer(1e+05),
  tol = 1e-07,
  adev_tol = 0.9,
  ddev_tol = 0,
  newton_tol = 1e-12,
  newton_max_iters = 1000,
  n_threads = 1,
  early_exit = TRUE,
  intercept = TRUE,
  screen_rule = c("pivot", "strong"),
  min_ratio = 0.01,
  lmda_path_size = 100,
  max_screen_size = NULL,
  max_active_size = NULL,
  pivot_subset_ratio = 0.1,
  pivot_subset_min = 1,
  pivot_slack_ratio = 1.25,
  check_state = FALSE,
  progress_bar = FALSE,
  warm_start = NULL
)

Arguments

`X`	Feature matrix. Either a regular R matrix, or else an `adelie` custom matrix class, or a concatenation of such.
`glm`	GLM family/response object. This is an expression that represents the family, the response and other arguments such as weights, if present. The choices are `glm.gaussian()`, `glm.binomial()`, `glm.poisson()`, `glm.multinomial()`, `glm.cox()`, and `glm.multigaussian()`. This is a required argument, and there is no default. In the simple example below, we use `glm.gaussian(y)`.
`constraints`	Group-wise constraints on the parameters, supplied as a list with an element for each group. Default is `NULL`, which means no constraints. List elements can be `NULL` as well. Currently only 'box constraints' are supported, which means upper and lower limits. The function `constraint.box()` must be used to set the constraints for each group that has constraints. Details are given in the documentation for `constraint.box`.
`groups`	This is an ordered vector of integers that represents the groupings, with each entry indicating where a group begins. The entries refer to column numbers in the feature matrix, and hence the members of a group have to be contiguous. If there are `p` features, the default is `1:p` (no groups; i.e. `p` groups each of size 1). So the length of `groups` is the number of groups. (Note that in the `state` output of `grpnet` this vector might be shifted to start from 0, since internally `adelie` uses zero-based indexing.)
`alpha`	The elasticnet mixing parameter, with `0\le\alpha\le 1`. The penalty is defined as `(1-\alpha)/2\sum_j\|\beta_j\|_2^2+\alpha\sum_j\|\beta_j\|_2,` where the sum is over groups. `alpha=1` is the pure group lasso penalty, and `alpha=0` the pure ridge penalty.
`penalty`	Separate penalty factors can be applied to each group of coefficients. This is a number that multiplies `lambda` to allow differential shrinkage for groups. Can be 0 for some groups, which implies no shrinkage, and that group is always included in the model. Default is square-root of group sizes for each group.
`offsets`	Offsets, default is `NULL`. If present, this is a fixed vector or matrix corresponding to the shape of the natural parameter, and is added to the fit.
`lambda`	A user supplied `lambda` sequence. Typical usage is to have the program compute its own `lambda` sequence based on `lmda_path_size` and `min_ratio`. This is returned with the fit.
`standardize`	If `TRUE` (the default), the columns of `X` are standardized before the fit is computed. This is good practice if the features are on different scales, because it has an impact on the penalty. The regularization path is computed using the standardized features, and the standardization information is saved on the object for making future predictions. The different matrix classes have their own methods for standardization. For example, for a sparse matrix the standardization information will be computed, but not actually applied (e.g., centering would destroy the sparsity). Rather, the methods for matrix multiply will be aware of the standardization information and incorporate it.
`irls_max_iters`	Maximum number of IRLS iterations, default is `1e4`.
`irls_tol`	IRLS convergence tolerance, default is `1e-7`.
`max_iters`	Maximum total number of coordinate descent iterations, default is `1e5`.
`tol`	Coordinate descent convergence tolerance, default `1e-7`.
`adev_tol`	Fraction deviance explained tolerance, default `0.9`. This can be seen as a limit on overfitting the training data.
`ddev_tol`	Difference in fraction deviance explained tolerance, default `0`. If a step in the path changes the deviance by this amount or less, the algorithm truncates the path.
`newton_tol`	Convergence tolerance for the BCD update, default `1e-12`. This parameter controls the iterations in each block-coordinate step to establish the block solution.
`newton_max_iters`	Maximum number of iterations for the BCD update, default `1000`.
`n_threads`	Number of threads, default `1`.
`early_exit`	`TRUE` if the function should be allowed to exit early.
`intercept`	Default `TRUE` to include an unpenalized intercept.
`screen_rule`	Screen rule, with default `"pivot"` (an empirical improvement over `"strong"`, the other option).
`min_ratio`	Ratio between smallest and largest value of lambda. Default is 1e-2.
`lmda_path_size`	Number of values for `lambda`, if generated automatically. Default is 100.
`max_screen_size`	Maximum number of screen groups. Default is `NULL`.
`max_active_size`	Maximum number of active groups. Default is `NULL`.
`pivot_subset_ratio`	Subset ratio of pivot rule. Default is `0.1`. Users are not expected to change this.
`pivot_subset_min`	Minimum subset of pivot rule. Default is `1`. Users are not expected to change this.
`pivot_slack_ratio`	Slack ratio of pivot rule. Default is `1.25`. Users are not expected to change this. See reference for details.
`check_state`	Check state. Internal parameter, with default `FALSE`.
`progress_bar`	Progress bar. Default is `FALSE`.
`warm_start`	Warm start (default is `NULL`). Internal parameter.
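
The `groups`, `penalty`, and `alpha` entries above can be illustrated with a small base-R sketch. It does not call `adelie`; the helper names `group_sizes_from_starts()` and `group_enet_penalty()` are hypothetical and exist only for this illustration. The sketch turns a vector of group start positions into group sizes, computes the documented default penalty factors (square root of the group sizes), and evaluates the penalty formula from the `alpha` entry exactly as written there, without per-group penalty factors.

```r
# Hypothetical helpers for illustration only; these are not adelie functions.

# Sizes of contiguous groups, given 1-based start columns and p features.
group_sizes_from_starts <- function(groups, p) {
  diff(c(groups, p + 1))
}

# Penalty from the `alpha` entry:
# (1 - alpha)/2 * sum_j ||beta_j||_2^2 + alpha * sum_j ||beta_j||_2
group_enet_penalty <- function(beta, groups, alpha = 1) {
  sizes <- group_sizes_from_starts(groups, length(beta))
  membership <- rep(seq_along(groups), times = sizes)  # group id per column
  norms <- sqrt(tapply(beta^2, membership, sum))       # one L2 norm per group
  (1 - alpha) / 2 * sum(norms^2) + alpha * sum(norms)
}

p <- 8
groups <- c(1, 4, 6)                         # groups: columns 1-3, 4-5, 6-8
sizes <- group_sizes_from_starts(groups, p)  # 3 2 3
default_penalty <- sqrt(sizes)               # documented default for `penalty`

beta <- c(3, 4, 0, 0, 0, 1, 2, 2)            # group norms are 5, 0, 3
lasso_value <- group_enet_penalty(beta, groups, alpha = 1)  # 5 + 0 + 3 = 8
ridge_value <- group_enet_penalty(beta, groups, alpha = 0)  # (25 + 0 + 9) / 2 = 17
```

Because `length(groups)` equals the number of groups, a `penalty` vector passed to `grpnet` would need one entry per element of `groups`.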

Value

A list of class "grpnet". This has a main component called state which represents the fitted path, and a few extra useful components such as the call, the family name, groups and group_sizes. Users are encouraged to use methods like predict(), coef(), print(), plot() etc to examine the object.

Author(s)

James Yang, Trevor Hastie, and Balasubramanian Narasimhan
Maintainer: Trevor Hastie hastie@stanford.edu

References

Yang, James and Hastie, Trevor. (2024) A Fast and Scalable Pathwise-Solver for Group Lasso and Elastic Net Penalized Regression via Block-Coordinate Descent. arXiv, doi:10.48550/arXiv.2405.08631.
Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent, Journal of Statistical Software, Vol. 33(1), 1-22, doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol. 39(5), 1-13, doi:10.18637/jss.v039.i05.
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J. and Tibshirani, Ryan. (2012) Strong Rules for Discarding Predictors in Lasso-type Problems, JRSSB, Vol. 74(2), 245-266, https://arxiv.org/abs/1011.2234.

Examples

library(adelie)
set.seed(0)
n <- 100
p <- 200
X <- matrix(rnorm(n * p), n, p)
y <- X[,1] * rnorm(1) + rnorm(n)
## Here we create 61 groups randomly: column 1 plus 60 sampled start positions.
## Groups need to be contiguous, and the `groups` variable
## indicates the beginning position of each group.
groups <- c(1, sample(2:199, 60, replace = FALSE))
groups <- sort(groups)
print(groups)
fit <- grpnet(X, glm.gaussian(y), groups = groups)
print(fit)
plot(fit)
coef(fit)
cvfit  <- cv.grpnet(X, glm.gaussian(y), groups = groups)
print(cvfit)
plot(cvfit)
predict(cvfit, newx = X[1:5, ], lambda = "lambda.min")

adelie documentation built on April 3, 2025, 8:58 p.m.