random_binom_blasso: Bootstrap Validation for Binomial Random Lasso Regression

View source: R/random_binom_blasso.R


Bootstrap Validation for Binomial Random Lasso Regression

Description

This function fits one glmnet::cv.glmnet(family = "binomial") model per loop, each time using a random subset of the variables in x. When bootstrap = TRUE, bootstrap resampling of the observations is also performed at each loop.
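
As a rough sketch of what each loop does (one_loop is a hypothetical helper written for illustration; the actual implementation in R/random_binom_blasso.R may differ):

# One loop: sample a random subset of columns, optionally bootstrap the
# observations, then fit a cross-validated binomial lasso model.
one_loop <- function(x, y, random_vars, bootstrap = TRUE, alpha = 1, nfolds = 10) {
  vars <- sample(ncol(x), random_vars)                  # random candidate variables
  rows <- if (bootstrap) sample(nrow(x), replace = TRUE) else seq_len(nrow(x))
  glmnet::cv.glmnet(x[rows, vars, drop = FALSE], y[rows],
                    family = "binomial", alpha = alpha, nfolds = nfolds)
}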

Usage

random_binom_blasso(
  x,
  y,
  loops = 2,
  random_vars = floor(sqrt(ncol(x))),
  bootstrap = TRUE,
  smote = FALSE,
  perc_over = 2,
  perc_under = 2,
  alpha = 1,
  nfolds = 10,
  seed = 987654321,
  ncores = 2
)
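
A minimal call on simulated data might look as follows (illustrative only; it assumes the package is installed and loaded as lassoloops, and the simulated data stand in for real predictors and a binary outcome):

library(lassoloops)

set.seed(123)
x <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)  # 100 observations, 20 variables
y <- factor(rbinom(100, size = 1, prob = 0.5))       # binary response with two levels

res <- random_binom_blasso(x, y, loops = 10, bootstrap = TRUE, ncores = 2)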

Arguments

x

Input matrix, as in glmnet, with observations in rows and variables in columns.

y

Response variable. Should be either a binary numeric vector or a factor with two levels.

loops

Number of loops (a glmnet::cv.glmnet model will be performed in each loop).

random_vars

Number of variables randomly sampled as candidates at each loop. Default is floor(sqrt(p)), where p is the number of variables in x.

bootstrap

Logical indicating whether bootstrap resampling is performed at each loop.

smote

Logical. If set to TRUE, the Synthetic Minority Over-sampling Technique (SMOTE) is used to handle class imbalance instead of plain random over-sampling. See the performanceEstimation::smote function.

perc_over

Only used when smote = TRUE. A number that controls how many extra cases from the minority class are generated (over-sampling).

perc_under

Only used when smote = TRUE. A number that controls how many extra cases from the majority classes are selected for each case generated from the minority class (under-sampling).
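
These two arguments are presumably passed on to the perc.over and perc.under arguments of performanceEstimation::smote. A minimal sketch of their effect, using a purely illustrative toy data set:

library(performanceEstimation)

# Small, deliberately imbalanced toy data set.
df <- data.frame(
  x1    = rnorm(110),
  x2    = rnorm(110),
  class = factor(c(rep("yes", 10), rep("no", 100)))
)

# perc.over controls how many synthetic minority cases are generated;
# perc.under controls how many majority cases are kept per generated case.
balanced <- smote(class ~ ., data = df, perc.over = 2, perc.under = 2)
table(balanced$class)  # inspect the new class balance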

alpha

The elasticnet mixing parameter, with 0 ≤ alpha ≤ 1. alpha = 1 is the lasso penalty, and alpha = 0 the ridge penalty.

nfolds

Number of folds (default is 10). Although nfolds can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is nfolds = 3.

seed

Seed passed to set.seed() to make the results reproducible.

ncores

Number of cores for parallel computation. Each loop runs on one core via the foreach package.
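
As a rough illustration of this parallel scheme (the doParallel backend and the toy data below are assumptions made for the sketch, not the package's actual setup):

library(foreach)
library(doParallel)

set.seed(1)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- factor(rbinom(100, 1, 0.5))

cl <- parallel::makeCluster(2)        # one worker per requested core
doParallel::registerDoParallel(cl)

# Each iteration fits one cv.glmnet model on a random subset of variables.
models <- foreach(i = 1:4, .packages = "glmnet") %dopar% {
  vars <- sample(ncol(x), floor(sqrt(ncol(x))))
  glmnet::cv.glmnet(x[, vars, drop = FALSE], y, family = "binomial")
}

parallel::stopCluster(cl)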

Value

A LassoLoop object with the results.

Author(s)

Pol Castellano-Escuder

References

Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/.

