lasso_cv: wrapper function for 'cv.glmnet'


lasso_cv    R Documentation

Wrapper function for cv.glmnet

Description

Fit a lasso regression with cross-validation and return the selected covariates. Can handle very large sparse data matrices. Intended for binary response only (the option family = "binomial" is forced). Depends on the cv.glmnet function from the glmnet package.
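The description above can be illustrated with a minimal sketch that calls cv.glmnet directly with family = "binomial". This is not the package's source code: the simulated data, the object names (x, y, cv_fit, beta) and the choice of s = "lambda.min" are all assumptions made for illustration.

```r
library(glmnet)

set.seed(1)
# Simulated binary covariates and a binary response (illustrative data only)
x <- matrix(rbinom(200 * 10, 1, 0.3), nrow = 200, ncol = 10)
colnames(x) <- paste0("v", seq_len(ncol(x)))
y <- rbinom(200, 1, plogis(-1 + 1.5 * x[, 1]))

# Cross-validated lasso logistic regression
cv_fit <- cv.glmnet(x, y, family = "binomial", nfolds = 5)

# Penalized coefficients at the cross-validated penalty, intercept dropped.
# Using lambda.min here is an assumption, not a documented choice of lasso_cv.
beta <- as.matrix(coef(cv_fit, s = "lambda.min"))[-1, 1]
beta
```

The resulting vector has one entry per column of x, which matches the documented length of the beta element (nvars).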

Usage

lasso_cv(x, y, nfolds = 5, foldid = NULL, betaPos = TRUE, ...)

Arguments

x

Input matrix, of dimension nobs x nvars. Each row is an observation vector. Can be in sparse matrix format (inheriting from class "sparseMatrix" in the Matrix package).

y

Binary response variable, numeric.

nfolds

Number of folds. Default is 5. Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds = 3.

foldid

An optional vector of values between 1 and nfolds identifying what fold each observation is in. If supplied, nfolds can be missing.

betaPos

Should the covariates selected by the procedure be positively associated with the outcome? Default is TRUE.

...

Additional arguments passed to cv.glmnet from the glmnet package, other than nfolds, foldid, and family.
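As a hedged illustration of the x and foldid arguments described above, the sketch below builds a sparse input matrix and a fold assignment vector. The object names (dense, x_sparse, foldid) and the balanced random assignment are assumptions chosen for the example; any vector of values between 1 and nfolds with one entry per observation would fit the description.

```r
library(Matrix)

set.seed(15)
n <- 100
nfolds <- 3

# Dense 0/1 matrix converted to a sparse representation
dense <- matrix(rbinom(n * 20, 1, 0.2), nrow = n, ncol = 20)
colnames(dense) <- paste0("drugs", seq_len(ncol(dense)))
x_sparse <- Matrix(dense, sparse = TRUE)

# Check that the object inherits from class "sparseMatrix"
inherits(x_sparse, "sparseMatrix")

# Balanced random fold assignment with values in 1..nfolds
foldid <- sample(rep(seq_len(nfolds), length.out = n))
table(foldid)
```

These objects could then be supplied as lasso_cv(x = x_sparse, y = ae, foldid = foldid), where ae stands for a binary numeric response of length n.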

Value

An object with S3 class "log.lasso".

beta

Numeric vector of regression coefficients in the lasso. In the lasso_cv function, the regression coefficients are PENALIZED. Length equal to nvars.

selected_variables

Character vector, names of the variable(s) selected with the lasso-cv approach. If betaPos = TRUE, this set is the covariates with a positive regression coefficient in beta. Otherwise, this set is the covariates with a nonzero regression coefficient in beta. Covariates are ordered according to the absolute value of their regression coefficients.
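The selection rule described for selected_variables can be sketched in base R as follows. The helper select_vars and the coefficient values are hypothetical and are not part of the package; ordering by decreasing absolute value is also an assumption, since the direction of the ordering is not stated above.

```r
# Hypothetical named coefficient vector standing in for the beta element
beta <- c(drugs1 = 0.42, drugs2 = 0, drugs3 = -0.15, drugs4 = 0.07, drugs5 = 0)

# Illustrative helper, not a function exported by the package
select_vars <- function(beta, betaPos = TRUE) {
  # Keep positive coefficients if betaPos is TRUE, nonzero ones otherwise
  keep <- if (betaPos) beta > 0 else beta != 0
  kept <- beta[keep]
  # Assumed ordering: largest absolute coefficient first
  names(kept)[order(abs(kept), decreasing = TRUE)]
}

select_vars(beta, betaPos = TRUE)   # "drugs1" "drugs4"
select_vars(beta, betaPos = FALSE)  # "drugs1" "drugs3" "drugs4"
```

With betaPos = TRUE the negatively associated covariate drugs3 is excluded, while betaPos = FALSE keeps every covariate with a nonzero coefficient.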

Author(s)

Emeline Courtois
Maintainer: Emeline Courtois emeline.courtois@inserm.fr

Examples


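set.seed(15)
# Simulate 100 observations of 20 binary covariates
drugs <- matrix(rbinom(100 * 20, 1, 0.2), nrow = 100, ncol = 20)
colnames(drugs) <- paste0("drugs", 1:ncol(drugs))
# Simulate a binary response
ae <- rbinom(100, 1, 0.3)
# Lasso logistic regression with 3-fold cross-validation
lcv <- lasso_cv(x = drugs, y = ae, nfolds = 3)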



adapt4pv documentation built on May 31, 2023, 6:08 p.m.