bolasso: Bolasso: Bootstrapped Lasso

View source: R/bolassofunction.R

Description

Perform a bootstrapped Lasso on random subsamples of the input data.

Usage

bolasso(data,Y,mu,m,probaseuil,penalty.factor,random)

Arguments

data

Input matrix of dimension n * p; each of the n rows is an observation vector of p variables. The intercept should be included in the first column as (1,...,1). If not, it is added.

Y

Response variable of length n.

mu

Positive regularization sequence to be used for the Lasso.

m

Number of bootstrap iterations of the Lasso. Default is m=100.

probaseuil

A frequency threshold for selecting the most stable variables over the m bootstrap iterations of the Lasso. Default is 1.

penalty.factor

Separate penalty factors can be applied to each coefficient. This is a number that multiplies lambda to allow differential shrinkage. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is 1 for all variables except the intercept.

random

Optional parameter: a matrix of size n*m. If random is provided, the m bootstrap samples are constructed from its m columns.
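As a rough sketch of what a random matrix could look like: the description above does not spell out what each column must contain, so the example below assumes that column j holds the n row indices, drawn with replacement, that make up bootstrap sample j. The names n, m and random_idx are illustrative only.

```r
set.seed(42)   # make the illustration reproducible
n <- 100       # number of observations (rows of data)
m <- 100       # number of bootstrap iterations

# Assumption: column j lists the rows used in bootstrap sample j,
# sampled with replacement from 1..n
random_idx <- replicate(m, sample.int(n, size = n, replace = TRUE))

dim(random_idx)   # n rows, m columns
```

Under that assumption, the matrix would be passed as bolasso(data, Y, mu, m = m, random = random_idx), so that repeated calls reuse the same bootstrap samples.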

Details

The Lasso from the glmnet package is performed with the regularization parameter mu over m bootstrap samples. An appearance frequency is obtained which shows the predictive power of each variable. It is calculated as the number of times a variable has been selected by the Lasso over the m bootstrap iterations.
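The procedure described above can be sketched directly with glmnet. This is a minimal illustration, not the code of bolasso itself: it skips the scaling and intercept handling that bolasso performs, and it stores the frequency as a proportion of the m iterations, which is an assumption suggested by the default threshold of 1. The names counts, frequency and selected are illustrative.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 20; m <- 50
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, 5), rep(0, 15))          # only the first 5 variables are active
y <- as.numeric(x %*% beta + rnorm(n))
mu <- seq(1.5, 0.1, by = -0.1)            # decreasing regularization sequence

# counts[i, k]: number of bootstrap fits in which variable i is nonzero at mu[k]
counts <- matrix(0, nrow = p, ncol = length(mu))
for (b in seq_len(m)) {
  idx <- sample.int(n, n, replace = TRUE) # one bootstrap sample of the rows
  fit <- glmnet(x[idx, ], y[idx], lambda = mu)
  # drop the intercept row, then flag nonzero coefficients
  nonzero <- as.matrix(coef(fit, s = mu))[-1, , drop = FALSE] != 0
  counts <- counts + nonzero
}

# Assumption: frequency expressed as a proportion of the m iterations
frequency <- counts / m

# Variables selected in every bootstrap sample, for each value of mu
selected <- lapply(seq_along(mu), function(k) which(frequency[, k] >= 1))
```

Requiring a frequency of 1 keeps only variables that survive every resample, which is the strictest setting; lowering the threshold trades stability for sensitivity.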

Value

A 'bolasso' object is returned for which the method plot is available.

data

A list containing:

  • X - The scaled matrix used in the algorithm, the first column being (1,...,1).

  • Y - the input response vector

  • means.X - Vector of means of the input data matrix.

  • sigma.X - Vector of variances of the input data matrix.

ind

Set of selected variables for the regularization mu and the threshold probaseuil.

frequency

Appearance frequency of each variable; number of times each variable is selected over the m bootstrap iterations.
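To illustrate how ind relates to frequency and probaseuil, the toy example below thresholds a hand-written frequency matrix. The layout (variables in rows, values of mu in columns, frequencies as proportions) and the list structure of the result are assumptions made for the illustration; the actual layout of object$ind and object$frequency may differ.

```r
# Hypothetical selection proportions: 4 variables (rows) x 3 values of mu (columns)
frequency <- matrix(c(1.00, 1.00, 1.00,
                      0.20, 0.85, 1.00,
                      0.00, 0.10, 0.95,
                      0.00, 0.00, 0.30),
                    nrow = 4, byrow = TRUE,
                    dimnames = list(paste0("V", 1:4), paste0("mu", 1:3)))

probaseuil <- 1   # keep only variables selected in every bootstrap sample

# For each value of mu, indices of the variables whose frequency reaches the threshold
ind <- lapply(seq_len(ncol(frequency)), function(k) {
  unname(which(frequency[, k] >= probaseuil))
})
ind
```

With the threshold at 1, only V1 is kept for the first two values of mu, and V1 and V2 for the third. Lowering probaseuil to 0.8 in this toy matrix would let V2 enter at the second value of mu and V3 at the third.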

References

Model-consistent sparse estimation through the bootstrap; F. Bach 2009

See Also

plot.bolasso, dyadiqueordre

Examples

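## Not run: 
library(mht)

# Simulate 100 observations of 20 variables; only the first 5 are active
x <- matrix(rnorm(100 * 20), 100, 20)
beta <- c(rep(1, 5), rep(0, 15))
y <- x %*% beta + rnorm(100)

# Bootstrapped Lasso over a decreasing sequence of regularization values
mod <- bolasso(x, y, mu = seq(1.5, 0.1, -0.1))
mod

## End(Not run)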

Example output

Loading required package: glmnet
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-12

Loaded mht 3.1.2 
Thanks for using me.
Don't hesitate to contact my maintainer if you have a request or you encountered problems/bugs. 
intercept has been added
bolasso(data = x, Y = y, mu = seq(1.5, 0.1, -0.1))
See object$ind for the selected variables depending on 'mu' and 'probaseuil' 
See object$frequency for the frequency of selection depending on 'mu' 

mht documentation built on May 2, 2019, 11:49 a.m.