Bolasso: Bootstrapped Lasso

Description

Performs a bootstrapped Lasso on random subsamples of the input data.

Usage

bolasso(data,Y,mu,m,probaseuil,penalty.factor,random)

Arguments

data

Input matrix of dimension n * p; each of the n rows is an observation vector of p variables. The intercept should be included in the first column as (1,...,1). If not, it is added.

Y

Response variable of length n.

mu

Positive regularization sequence to be used for the Lasso.

m

Number of bootstrap iterations of the Lasso. Default is m=100.

probaseuil

A frequency threshold for selecting the most stable variables over the m bootstrap iterations of the Lasso. Default is 1.

penalty.factor

Separate penalty factors can be applied to each coefficient. Each factor is a number that multiplies the regularization parameter (lambda in glmnet) to allow differential shrinkage. It can be 0 for some variables, which implies no shrinkage, so those variables are always included in the model. Default is 1 for all variables except the intercept.

random

Optional parameter, matrix of size n*m. If random is provided, the m bootstrap samples are constructed from its m columns.
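The effect of a zero penalty factor, as described for penalty.factor above, can be illustrated directly with glmnet. This is a minimal sketch on simulated data, not the internal code of bolasso: the data, the value lambda = 5 and the choice of variable 20 as the unpenalized one are assumptions made for the illustration.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, 5), rep(0, 15))
y <- as.vector(x %*% beta + rnorm(n))

# Penalty factor 1 for every variable except variable 20, which gets 0
# (assumption for the illustration: variable 20 is a pure noise variable).
pf <- rep(1, p)
pf[20] <- 0

# lambda = 5 is deliberately large, so every penalized coefficient is shrunk to zero.
fit <- glmnet(x, y, lambda = 5, penalty.factor = pf)
coefs <- as.matrix(coef(fit))[-1, 1]  # drop the intercept row

# Only the unpenalized variable stays in the model.
selected <- which(coefs != 0)
print(selected)
```

Even though variable 20 has no true effect, it is kept because it is not shrunk, while the five informative variables are removed by the heavy penalty.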

Details

The Lasso from the glmnet package is performed with the regularization parameter mu over m bootstrap samples. An appearance frequency is obtained, which reflects the predictive power of each variable. It is calculated as the number of times a variable has been selected by the Lasso over the m bootstrap iterations.
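The procedure described above can be sketched with glmnet alone. This is a simplified illustration under stated assumptions, not the implementation used by bolasso: it uses a single regularization value instead of a sequence, lets glmnet fit its own intercept, reports the frequency as a proportion of the m fits, and assumes that mu is on the same scale as glmnet's lambda. The names count, idx and fit are hypothetical.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 20; m <- 50
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, 5), rep(0, 15))
y <- as.vector(x %*% beta + rnorm(n))

mu <- 0.2          # a single regularization value (assumption for the sketch)
probaseuil <- 1    # keep only variables selected in every bootstrap fit

# count[j] = number of bootstrap fits in which variable j has a nonzero coefficient
count <- numeric(p)
for (b in seq_len(m)) {
  idx <- sample.int(n, n, replace = TRUE)   # bootstrap resample of the rows
  fit <- glmnet(x[idx, ], y[idx], lambda = mu)
  coefs <- as.matrix(coef(fit))[-1, 1]      # drop the intercept row
  count <- count + (coefs != 0)
}

frequency <- count / m                      # appearance frequency as a proportion
ind <- which(frequency >= probaseuil)       # variables that pass the threshold
print(round(frequency, 2))
print(ind)
```

With the threshold at 1, a variable is retained only if the Lasso selects it in all m resamples, which is what filters out variables that enter the model only occasionally.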

Value

A 'bolasso' object is returned for which the method plot is available.

data

A list containing:

  • X - The scaled matrix used in the algorithm, the first column being (1,...,1).

  • Y - The input response vector.

  • means.X - Vector of means of the input data matrix.

  • sigma.X - Vector of variances of the input data matrix.

ind

Set of selected variables for the regularization mu and the threshold probaseuil.

frequency

Appearance frequency of each variable; number of times each variable is selected over the m bootstrap iterations.

References

Model-consistent sparse estimation through the bootstrap; F. Bach 2009

See Also

plot.bolasso, dyadiqueordre

Examples

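## Not run: 
library(mht)  # assumption: bolasso() is provided by the mht package

set.seed(1)  # make the simulated data reproducible
# 100 observations of 20 variables; only the first 5 have a nonzero effect
x <- matrix(rnorm(100 * 20), 100, 20)
beta <- c(rep(1, 5), rep(0, 15))
y <- x %*% beta + rnorm(100)

# Bolasso over a decreasing regularization sequence
mod <- bolasso(x, y, mu = seq(1.5, 0.1, -0.1))
mod

## End(Not run)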