predict.cv.grpnet: Predict Method for cv.grpnet Fits

View source: R/predict.cv.grpnet.R

predict.cv.grpnetR Documentation

Predict Method for cv.grpnet Fits

Description

Obtain predictions from a cross-validated group elastic net regularized GLM (cv.grpnet) object.

Usage

## S3 method for class 'cv.grpnet'
predict(object, 
        newx,
        newdata,
        s = c("lambda.1se", "lambda.min"),
        type = c("link", "response", "class", "terms", 
                 "importance", "coefficients", "nonzero", "groups", 
                 "ncoefs", "ngroups", "norm", "znorm"),
        ...)

Arguments

object

Object of class "cv.grpnet"

newx

Matrix of new x scores for prediction (default S3 method). Must have p columns arranged in the same order as the x matrix used to fit the model.

newdata

Data frame of new data scores for prediction (S3 "formula" method). Must contain all variables in the formula used to fit the model.

s

Lambda value(s) at which predictions should be obtained. Can input a character ("lambda.min" or "lambda.1se") or a numeric vector. Default of "lambda.min" uses the lambda value that minimizes the mean cross-validated error.

type

Type of prediction to return. "link" gives predictions on the link scale (\eta). "response" gives predictions on the mean scale (\mu). "class" gives predicted class labels (for "binomial" and "multinomial" families). "terms" gives the predictions for each term (group) in the model (\eta_k). "importance" gives the variable importance index for each term (group) in the model. "coefficients" returns the coefficients used for predictions. "nonzero" returns a list giving the indices of non-zero coefficients for each s. "groups" returns a list giving the labels of non-zero groups for each s. "ncoefs" returns the number of non-zero coefficients for each s. "ngroups" returns the number of non-zero groups for each s. "norm" returns the L2 norm of each group's (raw) coefficients for each s. "znorm" returns the L2 norm of each group's standardized coefficients for each s.

...

Additional arguments (ignored)

Details

Predictions are calculated from the grpnet object fit to the full sample of data, which is stored as object$grpnet.fit

See predict.grpnet for further details on the calculation of the different types of predictions.

Value

Depends on three factors...
1. the exponential family distribution
2. the length of the input s
3. the type of prediction requested

See predict.grpnet for details

Note

Syntax is inspired by the predict.cv.glmnet function in the glmnet package (Friedman, Hastie, & Tibshirani, 2010).

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33(1), 1-22. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v033.i01")}

Helwig, N. E. (2024). Versatile descent algorithms for group regularization and variable selection in generalized linear models. Journal of Computational and Graphical Statistics. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/10618600.2024.2362232")}

See Also

cv.grpnet for k-fold cross-validation of lambda

predict.grpnet for predicting from grpnet objects

Examples


######***######   family = "gaussian"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = mpg)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto)

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto)

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, s = "lambda.min")

# compare mean absolute error for two solutions
mean(abs(auto$mpg - fit.1se))
mean(abs(auto$mpg - fit.min))



######***######   family = "binomial"   ######***######

# load data
data(auto)

# redefine origin (Domestic vs Foreign)
auto$origin <- ifelse(auto$origin == "American", "Domestic", "Foreign")

# 10-fold cv (default method, response = origin with 2 levels)
set.seed(1)
mod <- cv.grpnet(origin ~ ., data = auto, family = "binomial")

# get predicted classes at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "class")

# get predicted classes at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "class", s = "lambda.min")

# compare misclassification rate for two solutions
1 - mean(auto$origin == fit.1se)
1 - mean(auto$origin == fit.min)



######***######   family = "multinomial"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = origin with 3 levels)
set.seed(1)
mod <- cv.grpnet(origin ~ ., data = auto, family = "multinomial")

# get predicted classes at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "class")

# get predicted classes at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "class", s = "lambda.min")

# compare misclassification rate for two solutions
1 - mean(auto$origin == fit.1se)
1 - mean(auto$origin == fit.min)



######***######   family = "poisson"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "poisson")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response", s = "lambda.min")

# compare mean absolute error for two solutions
mean(abs(auto$horsepower - fit.1se))
mean(abs(auto$horsepower - fit.min))



######***######   family = "negative.binomial"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = horsepower)
set.seed(1)
mod <- cv.grpnet(horsepower ~ ., data = auto, family = "negative.binomial")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response", s = "lambda.min")

# compare mean absolute error for two solutions
mean(abs(auto$horsepower - fit.1se))
mean(abs(auto$horsepower - fit.min))



######***######   family = "Gamma"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = origin)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "Gamma")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response", s = "lambda.min")

# compare mean absolute error for two solutions
mean(abs(auto$mpg - fit.1se))
mean(abs(auto$mpg - fit.min))



######***######   family = "inverse.gaussian"   ######***######

# load data
data(auto)

# 10-fold cv (formula method, response = origin)
set.seed(1)
mod <- cv.grpnet(mpg ~ ., data = auto, family = "inverse.gaussian")

# get fitted values at "lambda.1se"
fit.1se <- predict(mod, newdata = auto, type = "response")

# get fitted values at "lambda.min"
fit.min <- predict(mod, newdata = auto, type = "response", s = "lambda.min")

# compare mean absolute error for two solutions
mean(abs(auto$mpg - fit.1se))
mean(abs(auto$mpg - fit.min))


grpnet documentation built on Oct. 12, 2024, 1:07 a.m.