cVIP | R Documentation |
L1 Lasso Selection From Dense Predictor Space.
cVIP( df, target_column, feature_columns, column_proportion, record_proportion = 0.05, n_iterations, l1_lambda, glmnet_family )
df |
|
target_column |
Character string identifying the target column |
feature_columns |
Character vector identifying the feature columns to investigate |
column_proportion |
Numeric on the interval (0,1). This is the proportion of columns randomly sampled without replacement. |
record_proportion |
Numeric on the interval (0,1], default is 0.05. This is the proportion of records randomly selected for bootstrap sampling, with replacement. |
n_iterations |
The number of bootstrap replications |
l1_lambda |
The LASSO L1 penalty parameter |
glmnet_family |
Character string, see glmnet for more information |
cVIP is a parallelized Bootstrap LASSO built on the glmnet package from Hastie et al. The underlying glmnet framework does not appear to be compatible with data.table; as such, data.table-type inputs will be down-cast to base::data.matrix-types when needed.
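The down-cast described above can be illustrated with a minimal sketch. The toy data and the column names (`y`, `x1`, `x2`) are hypothetical, and this is not necessarily how cVIP performs the conversion internally; it only shows what `base::data.matrix` does to a data.table-like input.

```r
# Hypothetical toy input; column names are illustrative only.
df <- data.frame(
  y  = c(1.2, 0.7, 3.1),
  x1 = c(0.5, 1.5, 2.5),
  x2 = factor(c("a", "b", "a"))
)

# Use a data.table when the package is installed; a data.frame behaves the same here.
if (requireNamespace("data.table", quietly = TRUE)) {
  df <- data.table::as.data.table(df)
}

feature_columns <- c("x1", "x2")

# data.matrix() returns a numeric matrix; factor columns become their integer codes.
x <- data.matrix(as.data.frame(df)[, feature_columns, drop = FALSE])
y <- df[["y"]]
```

Note that factor levels are replaced by integer codes rather than dummy variables, so categorical features may need explicit encoding before they are passed in.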
data.table-type object of the (Conditional) Variable Inclusion Probability
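As a rough illustration of what such an inclusion probability could look like, the sketch below runs a sequential bootstrap LASSO directly with `glmnet`. It is not the cVIP implementation: the simulated data, the tally variables `sampled` and `selected`, and the reading of "conditional" as "selected given that the column was drawn in that iteration" are all assumptions made for this example, and the parallelization is omitted.

```r
library(glmnet)

set.seed(1)

# Simulated data (hypothetical): only x1 and x2 carry signal.
n <- 200
p <- 10
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", seq_len(p))))
y <- 2 * x[, "x1"] - 1.5 * x[, "x2"] + rnorm(n)

# Settings named after the documented arguments; the values are arbitrary.
column_proportion <- 0.5
record_proportion <- 0.8
n_iterations      <- 50
l1_lambda         <- 0.1

sampled  <- setNames(integer(p), colnames(x))  # times each column was drawn
selected <- setNames(integer(p), colnames(x))  # times it got a nonzero coefficient

for (i in seq_len(n_iterations)) {
  # Columns are sampled without replacement (glmnet needs at least two).
  cols <- sample(colnames(x), size = max(2, floor(column_proportion * p)))
  # Records are sampled with replacement (the bootstrap step).
  rows <- sample.int(n, size = ceiling(record_proportion * n), replace = TRUE)

  fit <- glmnet(x[rows, cols, drop = FALSE], y[rows],
                family = "gaussian", alpha = 1, lambda = l1_lambda)

  beta <- as.matrix(coef(fit))[-1, 1]  # drop the intercept row
  nz   <- names(beta)[beta != 0]

  sampled[cols] <- sampled[cols] + 1L
  selected[nz]  <- selected[nz] + 1L
}

# Assumed definition: share of iterations with a nonzero coefficient,
# among the iterations in which the column was sampled.
vip <- data.frame(
  feature               = colnames(x),
  times_sampled         = as.integer(sampled),
  inclusion_probability = ifelse(sampled > 0, selected / sampled, NA_real_)
)
vip[order(-vip$inclusion_probability), ]
```

Features that are consistently retained across resamples end up with probabilities near 1, while noise features drift toward 0 as `l1_lambda` grows.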
James Patrick Horine
Bach, Francis. (2008) "Bolasso: model consistent Lasso estimation through the bootstrap". URL https://arxiv.org/abs/0804.1302
Bunea, Florentina et al. “Penalized least squares regression methods and applications to neuroimaging” NeuroImage vol. 55,4 (2010): 1519-27.
Abram, Samantha V et al. “Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data” Frontiers in neuroscience vol. 10 344. 28 Jul. 2016, doi:10.3389/fnins.2016.00344
Santosa, Fadil; Symes, William W. (1986). "Linear inversion of band-limited reflection seismograms". SIAM Journal on Scientific and Statistical Computing. SIAM. 7 (4): 1307–1330. doi:10.1137/0907087
Tibshirani, Robert (1996). "Regression Shrinkage and Selection via the lasso". Journal of the Royal Statistical Society. Series B (methodological). Wiley. 58 (1): 267–88. JSTOR 2346178
Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. URL http://www.jstatsoft.org/v33/i01/.
Knockoff Filtering as presented in Data Science and Predictive Analytics (UMich HS650). http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/17_RegularizedLinModel_KnockoffFilter.html#10_knockoff_filtering
Knockoff Filtering and Bootstrapped LASSO as presented in UMich HS650 Notes. http://www.socr.umich.edu/people/dinov/courses/DSPA_notes/17_RegularizedLinModel_KnockoffFilter.R
Dinov, ID. (2018) Data Science and Predictive Analytics: Biomedical and Health Applications using R, Springer (ISBN 978-3-319-72346-4)