cVIP: L1 Lasso Selection From Dense Predictor Space.

View source: R/cVIP.R

cVIP R Documentation

L1 Lasso Selection From Dense Predictor Space.

Description

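Estimates (conditional) variable inclusion probabilities by repeatedly fitting an L1-penalized (LASSO) glmnet model to bootstrap samples of the records and random subsets of the feature columns.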

Usage

cVIP(
  df,
  target_column,
  feature_columns,
  column_proportion,
  record_proportion = 0.05,
  n_iterations,
  l1_lambda,
  glmnet_family
)

Arguments

df

A data.table containing the target column and the feature columns

target_column

Character string identifying the target column

feature_columns

Character vector identifying the feature columns to investigate

column_proportion

Numeric on the interval (0,1). The proportion of feature columns randomly sampled, without replacement, in each iteration.

record_proportion

Numeric on the interval (0,1]; the default shown in the Usage signature is 0.05. The proportion of records randomly selected, with replacement, for each bootstrap sample.

n_iterations

The number of bootstrap replications

l1_lambda

The LASSO L1 penalty parameter

glmnet_family

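Character string naming the response family passed to glmnet (for example "gaussian" or "binomial"); see the glmnet documentation for the supported values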
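The two sampling arguments can be pictured with a short base R sketch. It is only an illustration under stated assumptions: the variable names are hypothetical, and rounding the sample sizes down with floor() is an assumption, not something this page confirms about cVIP's internals.

```r
set.seed(1)

# Toy dimensions; every name here is illustrative, not taken from the package.
n_records         <- 100
feature_columns   <- paste0("x", 1:20)
column_proportion <- 0.25
record_proportion <- 0.5

# Columns: a random subset drawn WITHOUT replacement.
# (Rounding down with floor() is an assumption.)
n_cols <- max(1L, floor(column_proportion * length(feature_columns)))
sampled_columns <- sample(feature_columns, size = n_cols, replace = FALSE)

# Records: a bootstrap sample of row indices drawn WITH replacement.
n_rows <- max(1L, floor(record_proportion * n_records))
sampled_rows <- sample.int(n_records, size = n_rows, replace = TRUE)

length(sampled_columns)         # 5 columns
anyDuplicated(sampled_columns)  # 0, because columns cannot repeat
length(sampled_rows)            # 50 row indices, repeats allowed
```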

Details

cVIP is a parallelized bootstrap LASSO built on the glmnet package by Friedman, Hastie, and Tibshirani. glmnet does not appear to be compatible with data.table inputs, so data.table objects are down-cast to numeric matrices with base::data.matrix when needed.
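The idea can be sketched sequentially with glmnet alone. This is a minimal illustration, not the package's implementation: it runs in a plain loop rather than in parallel, it uses a data.frame instead of a data.table, and it assumes that "conditional" means the selection frequency of a column among the iterations in which that column was sampled. All variable names are hypothetical.

```r
library(glmnet)

set.seed(42)

# Simulated data: only x1 and x2 truly drive the response.
n <- 200
p <- 10
X <- matrix(rnorm(n * p), nrow = n, dimnames = list(NULL, paste0("x", 1:p)))
df <- data.frame(X, y = 2 * X[, "x1"] - 3 * X[, "x2"] + rnorm(n))

feature_columns   <- paste0("x", 1:p)
n_iterations      <- 50
column_proportion <- 0.5
record_proportion <- 0.5
l1_lambda         <- 0.1

offered  <- setNames(numeric(p), feature_columns)  # times a column was sampled
selected <- setNames(numeric(p), feature_columns)  # times its coefficient was nonzero

for (i in seq_len(n_iterations)) {
  cols <- sample(feature_columns, floor(column_proportion * p))
  rows <- sample.int(n, floor(record_proportion * n), replace = TRUE)

  # glmnet expects a numeric matrix, so down-cast the sampled frame.
  x_mat <- data.matrix(df[rows, cols])
  fit <- glmnet(x_mat, df$y[rows], family = "gaussian",
                alpha = 1, lambda = l1_lambda)

  beta <- as.matrix(coef(fit))[-1, 1]  # drop the intercept row
  offered[cols] <- offered[cols] + 1
  nonzero <- names(beta)[beta != 0]
  selected[nonzero] <- selected[nonzero] + 1
}

# Selection frequency, conditional on the column having been offered
# to the model (an assumed reading of "conditional").
inclusion <- ifelse(offered > 0, selected / offered, NA_real_)
round(inclusion, 2)
```

Setting alpha = 1 requests the pure L1 penalty, and passing a single lambda value keeps each fit to one model rather than a full regularization path.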

Value

A data.table containing the (conditional) variable inclusion probabilities
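A call might look like the following sketch. The argument names come from the Usage section; the simulated data and the argument values are illustrative assumptions, and the call is guarded so that it only runs when both fastFeatures and data.table are installed. The layout of the returned table is not documented here, so the result is simply printed.

```r
# Simulated data in base R; object and column names are illustrative.
set.seed(7)
sim <- data.frame(matrix(rnorm(200 * 5), ncol = 5,
                         dimnames = list(NULL, paste0("x", 1:5))))
sim$y <- sim$x1 - sim$x2 + rnorm(200)

if (requireNamespace("fastFeatures", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {
  vip <- fastFeatures::cVIP(
    df                = data.table::as.data.table(sim),
    target_column     = "y",
    feature_columns   = paste0("x", 1:5),
    column_proportion = 0.6,   # assumed value for illustration
    record_proportion = 0.5,   # assumed value for illustration
    n_iterations      = 100,   # assumed value for illustration
    l1_lambda         = 0.05,  # assumed value for illustration
    glmnet_family     = "gaussian"
  )
  print(vip)
}
```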

Author(s)

James Patrick Horine

References

  • Bach, Francis. (2008) "Bolasso: model consistent Lasso estimation through the bootstrap". URL https://arxiv.org/abs/0804.1302

  • Bunea, Florentina et al. “Penalized least squares regression methods and applications to neuroimaging” NeuroImage vol. 55,4 (2010): 1519-27.

  • Abram, Samantha V et al. “Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data” Frontiers in neuroscience vol. 10 344. 28 Jul. 2016, doi:10.3389/fnins.2016.00344

  • Santosa, Fadil; Symes, William W. (1986). "Linear inversion of band-limited reflection seismograms". SIAM Journal on Scientific and Statistical Computing. SIAM. 7 (4): 1307–1330. doi:10.1137/0907087

  • Tibshirani, Robert (1996). "Regression Shrinkage and Selection via the lasso". Journal of the Royal Statistical Society. Series B (methodological). Wiley. 58 (1): 267–88. JSTOR 2346178

  • Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/.

  • Knockoff Filtering as presented in Data Science and Predictive Analytics (UMich HS650). http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/17_RegularizedLinModel_KnockoffFilter.html#10_knockoff_filtering

  • Knockoff Filtering and Bootstrapped LASSO as presented in UMich HS650 Notes. http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/17_RegularizedLinModel_KnockoffFilter.R

  • Dinov, ID. (2018) Data Science and Predictive Analytics: Biomedical and Health Applications using R, Springer (ISBN 978-3-319-72346-4)


jameshorine/fastFeatures documentation built on Feb. 3, 2023, 1:30 p.m.