cv.grpnet (R Documentation)

Implements k-fold cross-validation for `grpnet` to find the regularization parameters that minimize the prediction error (deviance, mean squared error, mean absolute error, or misclassification rate).

```
cv.grpnet(x, ...)

## Default S3 method:
cv.grpnet(x,
          y,
          group,
          weights = NULL,
          offset = NULL,
          alpha = c(0.01, 0.25, 0.5, 0.75, 1),
          gamma = c(3, 4, 5),
          type.measure = NULL,
          nfolds = 10,
          foldid = NULL,
          same.lambda = FALSE,
          parallel = FALSE,
          cluster = NULL,
          verbose = interactive(),
          adaptive = FALSE,
          power = 1,
          ...)

## S3 method for class 'formula'
cv.grpnet(formula,
          data,
          use.rk = TRUE,
          weights = NULL,
          offset = NULL,
          alpha = c(0.01, 0.25, 0.5, 0.75, 1),
          gamma = c(3, 4, 5),
          type.measure = NULL,
          nfolds = 10,
          foldid = NULL,
          same.lambda = FALSE,
          parallel = FALSE,
          cluster = NULL,
          verbose = interactive(),
          adaptive = FALSE,
          power = 1,
          ...)
```

`x`: Model (design) matrix of dimension n × p (observations by coefficients).

`y`: Response vector of length n.

`group`: Group label vector (factor, character, or integer) of length p indicating the group membership of each column of `x`.

`formula`: Model formula: a symbolic description of the model to be fitted. Uses the same syntax as `lm` and `glm`.

`data`: Optional data frame containing the variables referenced in `formula`.

`use.rk`: If `TRUE` (default), the `rk.model.matrix` function is used to build the model matrix; otherwise, `model.matrix` is used.

`weights`: Optional vector of length n containing non-negative observation weights.

`offset`: Optional vector of length n containing an offset for the linear predictor.

`alpha`: Scalar or vector specifying the elastic net tuning parameter α (with 0 ≤ α ≤ 1), which balances the ridge (α = 0) and lasso (α = 1) components of the penalty. When a vector is given, the candidates are compared via cross-validation.

`gamma`: Scalar or vector specifying the penalty hyperparameter γ. When a vector is given, the candidates are compared via cross-validation.

`type.measure`: Loss function for cross-validation. Options include: `"deviance"`, `"mse"` (mean squared error), `"mae"` (mean absolute error), and `"class"` (misclassification rate).

`nfolds`: Number of folds for cross-validation.

`foldid`: Optional vector of length n giving the fold identifier (an integer between 1 and `nfolds`) of each observation.

`same.lambda`: Logical specifying if the same `lambda` sequence should be used for the fit to each fold's data. If `FALSE` (default), each fold's fit computes its own `lambda` sequence.

`parallel`: Logical specifying if sequential computing (default) or parallel computing should be used. If `TRUE`, the fits for the folds are computed in parallel.

`cluster`: Optional cluster to use for parallel computing. If `parallel = TRUE` and `cluster = NULL`, the cluster is defined internally.

`verbose`: Logical indicating if the fitting progress should be printed. Defaults to `TRUE` in interactive sessions and `FALSE` otherwise.

`adaptive`: Logical indicating if the adaptive group elastic net should be used (see Note).

`power`: If `adaptive = TRUE`, the power used when forming the `penalty.factor` from the initial coefficient estimates.

`...`: Optional additional arguments for `grpnet`, e.g., `family`.

This function calls the `grpnet` function `nfolds + 1` times: once on the full dataset to obtain the `lambda` sequence, and once holding out each fold's data to evaluate the prediction error. The syntax of (the default S3 method for) this function closely mimics that of the `cv.glmnet` function in the **glmnet** package (Friedman, Hastie, & Tibshirani, 2010).
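The `foldid` argument makes it possible to reuse the same fold assignments across calls. A minimal sketch of how balanced fold identifiers can be built by hand (`cv.grpnet` generates these internally when `foldid = NULL`; the variable names here are illustrative):

```r
# Assign each of n observations to one of nfolds folds, as evenly as possible.
n <- 100
nfolds <- 10
set.seed(1)
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Each fold receives n / nfolds observations (up to rounding).
table(foldid)
```

Passing this `foldid` to several `cv.grpnet` calls ensures their cross-validation errors are computed on identical data splits, which makes the resulting `cvm` values directly comparable.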

Let `\mathbf{D}_u = \{\mathbf{y}_u, \mathbf{X}_u\}` denote the `u`-th fold's data, let `\mathbf{D}_{[u]} = \{\mathbf{y}_{[u]}, \mathbf{X}_{[u]}\}` denote the full dataset excluding the `u`-th fold's data, and let `\boldsymbol\beta_{\lambda [u]}` denote the coefficient estimates obtained from fitting the model to `\mathbf{D}_{[u]}` using the regularization parameter `\lambda`.

The cross-validation error for the `u`-th fold is defined as

`E_u(\lambda) = C(\boldsymbol\beta_{\lambda [u]} , \mathbf{D}_u)`

where `C(\cdot , \cdot)` denotes the cross-validation loss function specified by `type.measure`. For example, the `"mse"` loss function is defined as

`C(\boldsymbol\beta_{\lambda [u]} , \mathbf{D}_u) = \| \mathbf{y}_u - \mathbf{X}_u \boldsymbol\beta_{\lambda [u]} \|^2`

where `\| \cdot \|` denotes the L2 norm.

The mean cross-validation error `cvm` is defined as

`\bar{E}(\lambda) = \frac{1}{v} \sum_{u = 1}^v E_u(\lambda)`

where `v` is the total number of folds. The standard error `cvsd` is defined as

`S(\lambda) = \sqrt{ \frac{1}{v (v - 1)} \sum_{u=1}^v (E_u(\lambda) - \bar{E}(\lambda))^2 }`

which is the classic definition of the standard error of the mean.
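These two formulas can be checked with a few lines of R. The sketch below uses a made-up matrix of per-fold errors (rows = folds, columns = `lambda` values); it is an illustration of the definitions above, not the package's internal code:

```r
# Toy per-fold cross-validation errors: 5 folds by 4 lambda values.
set.seed(1)
E <- matrix(rexp(5 * 4), nrow = 5, ncol = 4)
v <- nrow(E)

cvm  <- colMeans(E)                # mean CV error, one value per lambda
cvsd <- apply(E, 2, sd) / sqrt(v)  # sd/sqrt(v) equals S(lambda) above,
                                   # since sd() already divides by (v - 1)
cvup <- cvm + cvsd                 # upper curve
cvlo <- cvm - cvsd                 # lower curve
```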

`lambda`: regularization parameter sequence for the full data

`cvm`: mean cross-validation error for each `lambda`

`cvsd`: estimated standard error of `cvm`

`cvup`: upper curve: `cvm + cvsd`

`cvlo`: lower curve: `cvm - cvsd`

`nzero`: number of non-zero groups for each `lambda`

`grpnet.fit`: fitted grpnet object for the full data

`lambda.min`: value of `lambda` that minimizes `cvm`

`lambda.1se`: largest `lambda` whose `cvm` is within one standard error of the minimum

`index`: two-element vector giving the indices of `lambda.min` and `lambda.1se` in the `lambda` sequence

`type.measure`: loss function for cross-validation (used for plot label)

`call`: matched call

`time`: runtime in seconds to perform k-fold CV tuning

`tune`: data frame containing the tuning results, i.e., min(cvm) for each combination of `alpha` and/or `gamma`

When `adaptive = TRUE`, the adaptive group elastic net is used: (1) an initial fit with `alpha = 0` estimates the `penalty.factor`; (2) a second fit using the estimated `penalty.factor` is returned.

`lambda.1se` is defined as follows:

```
minid <- which.min(cvm)
min1se <- cvm[minid] + cvsd[minid]
se1id <- which(cvm <= min1se)[1]
lambda.1se <- lambda[se1id]
```
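A toy worked example of this one-standard-error rule, using made-up `cvm` and `cvsd` values over a decreasing `lambda` sequence:

```r
lambda <- c(1.0, 0.5, 0.25, 0.1)    # decreasing regularization sequence
cvm    <- c(4.0, 2.7, 2.5, 2.6)     # made-up mean CV errors
cvsd   <- c(0.3, 0.3, 0.3, 0.3)     # made-up standard errors

minid  <- which.min(cvm)            # 3: lambda.min = 0.25
min1se <- cvm[minid] + cvsd[minid]  # 2.5 + 0.3 = 2.8
se1id  <- which(cvm <= min1se)[1]   # 2: first (largest) lambda within 1 SE
lambda.1se <- lambda[se1id]         # 0.5
```

Because the sequence is decreasing, taking the first index satisfying `cvm <= min1se` selects the most heavily regularized (sparsest) model whose error is statistically indistinguishable from the minimum.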

Nathaniel E. Helwig <helwig@umn.edu>

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. *Journal of Statistical Software, 33*(1), 1-22. doi:10.18637/jss.v033.i01

Helwig, N. E. (2024). Versatile descent algorithms for group regularization and variable selection in generalized linear models. *Journal of Computational and Graphical Statistics*. doi:10.1080/10618600.2024.2362232

`plot.cv.grpnet` for plotting the cross-validation error curve

`predict.cv.grpnet` for predicting from `cv.grpnet` objects

`grpnet` for fitting group elastic net regularization paths

```
######***###### family = "gaussian" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto)
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "binomial" ######***######
# load data
data(auto)
# redefine origin (Domestic vs Foreign)
auto$origin <- ifelse(auto$origin == "American", "Domestic", "Foreign")
# 10-fold cv (formula method, response = origin with 2 levels)
set.seed(1)
mod <- cv.grpnet(origin ~ ., data = auto, family = "binomial")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "multinomial" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = origin with 3 levels)
set.seed(1)
mod <- cv.grpnet(origin ~ ., data = auto, family = "multinomial")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "poisson" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "poisson")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "negative.binomial" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "negative.binomial")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "Gamma" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "Gamma")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
######***###### family = "inverse.gaussian" ######***######
# load data
data(auto)
# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "inverse.gaussian")
# print min and 1se solution info
mod
# plot cv error curve
plot(mod)
```
