VARselect: VARselect


Description

Estimation in the regression model Y = X β + σ N(0,1).
Variable selection by choosing the best predictor among predictors produced
by different methods, such as lasso, elastic net, adaptive lasso, pls and randomForest.

Usage

VARselect(Y, X, dmax = NULL, normalize = TRUE, method = c("lasso", 
    "ridge", "pls", "en", "ALridge", "ALpls", "rF", "exhaustive"), 
    pen.crit = NULL, lasso.dmax = NULL, ridge.dmax = NULL, pls.dmax = NULL, 
    en.dmax = NULL, ALridge.dmax = NULL, ALpls.dmax = NULL, rF.dmax = NULL, 
    exhaustive.maxdim = 5e+05, exhaustive.dmax = NULL, en.lambda = c(0.01, 
        0.1, 0.5, 1, 2, 5), ridge.lambda = c(0.01, 0.1, 0.5, 
        1, 2, 5), rF.lmtry = 2, pls.ncomp = 5, ALridge.lambda = c(0.01, 
        0.1, 0.5, 1, 2, 5), ALpls.ncomp = 5, max.steps = NULL, 
    K = 1.1, verbose = TRUE, long.output = FALSE)

Arguments

Y

vector with n components : response variable.

X

matrix with n rows and p columns : covariates.

dmax

integer : maximum number of variables in the lasso estimator. dmax ≤ D where
D = min(3*p/4, n-5) if p ≥ n,
D = min(p, n-5) if p < n.
Default : dmax = D.

normalize

logical : if TRUE the columns of X are scaled

method

character vector whose components are a subset of
“lasso”, “ridge”, “pls”, “en”, “ALridge”, “ALpls”, “rF”, “exhaustive”.

pen.crit

vector with dmax+1 components : for d=0, ..., dmax, pen.crit[d+1] gives the value of the penalty for the dimension d. Default : pen.crit = NULL. In that case, the penalty is calculated by the function penalty.

lasso.dmax

integer lower than or equal to dmax, default = dmax.

ridge.dmax

integer lower than or equal to dmax, default = dmax.

pls.dmax

integer lower than or equal to dmax, default = dmax.

en.dmax

integer lower than or equal to dmax, default = dmax.

ALridge.dmax

integer lower than or equal to dmax, default = dmax.

ALpls.dmax

integer lower than or equal to dmax, default = dmax.

rF.dmax

integer lower than or equal to dmax, default = dmax.

exhaustive.maxdim

integer : maximum number of subsets of covariates considered in the exhaustive method. See details.

exhaustive.dmax

integer lower than or equal to dmax, default = dmax.

en.lambda

vector : tuning parameter of the elastic net (its ridge-type quadratic penalty). It is the input parameter lambda of the function enet.

ridge.lambda

vector : tuning parameter of the ridge. It is the input parameter lambda of the function lm.ridge.

rF.lmtry

vector : tuning parameter mtry of the function randomForest, mtry = p/rF.lmtry.

pls.ncomp

integer : tuning parameter of the pls. It is the input parameter ncomp of the function plsr. See details.

ALridge.lambda

similar to ridge.lambda in the adaptive lasso procedure.

ALpls.ncomp

similar to pls.ncomp in the adaptive lasso procedure. See details.

max.steps

integer. Maximum number of steps in the lasso procedure. Corresponds to the input max.steps of the function enet.
Default : max.steps = 2*min(p,n)

K

scalar : value of the parameter K in the LINselect criterion.

verbose

logical : if TRUE a trace of the current process is displayed in real time.

long.output

logical : if FALSE only the component summary will be returned. See Value.
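The defaults stated above for dmax and max.steps can be reproduced with a few lines of base R. This is only a sketch: default_dmax and default_max_steps are hypothetical helper names, not functions of the package, and rounding 3*p/4 down to an integer is an assumption, since the text above does not say how a non-integer bound is handled.

```r
# Hypothetical helper (not part of LINselect): default upper bound D for dmax.
default_dmax <- function(n, p) {
  if (p >= n) {
    # Assumption: a non-integer bound such as 3*p/4 is rounded down.
    floor(min(3 * p / 4, n - 5))
  } else {
    min(p, n - 5)
  }
}

# Hypothetical helper: documented default for max.steps, 2*min(p,n).
default_max_steps <- function(n, p) 2 * min(p, n)

default_dmax(n = 100, p = 100)       # min(75, 95) = 75
default_dmax(n = 100, p = 20)        # min(20, 95) = 20
default_max_steps(n = 100, p = 100)  # 2 * 100 = 200
```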

Details

When method is pls or ALpls, the LINselect procedure is carried out considering the number of components in the pls method as the tuning parameter.
This tuning parameter varies from 1 to pls.ncomp.

When method is exhaustive, the maximum number of variables d is calculated as follows.
Let q be the largest integer such that choose(p,q) < exhaustive.maxdim. Then d = min(q, exhaustive.dmax,dmax).
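The rule for d can be sketched in base R as follows. The helper exhaustive_dim is hypothetical and is not the package's own code. Because choose(p,q) decreases again once q exceeds p/2, the sketch reads "largest q" as the last subset size before the count first reaches exhaustive.maxdim; that reading is an assumption.

```r
# Hypothetical helper (not part of LINselect): largest subset size d
# explored by the exhaustive method, following the rule in Details.
exhaustive_dim <- function(p, dmax, exhaustive.dmax = dmax,
                           exhaustive.maxdim = 5e+05) {
  counts <- choose(p, 0:p)                   # number of subsets of size 0..p
  too_big <- which(counts >= exhaustive.maxdim)
  # which() indexes sizes 0..p from 1, so index i is subset size i - 1;
  # q is the size just before the first one that is too large.
  q <- if (length(too_big) == 0) p else too_big[1] - 2
  min(q, exhaustive.dmax, dmax)
}

# With p = 100: choose(100, 3) = 161700 < 5e+05 <= choose(100, 4) = 3921225,
# so q = 3.
exhaustive_dim(p = 100, dmax = 75)                       # 3
exhaustive_dim(p = 100, dmax = 75, exhaustive.dmax = 2)  # 2
```

With a small p, every choose(p,q) stays below exhaustive.maxdim, so q = p and d is set by dmax and exhaustive.dmax alone.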

Value

A list with at least length(method) components.
For each procedure in method a list with components

If length(method) > 1, the additional component summary is a list with three components:

If pen.crit = NULL, the component pen.crit gives the values of the penalty calculated by the function penalty. If long.output is TRUE the component named chatty is a list with length(method) components.
For each procedure in method, a list with components

Note

When method is lasso, library elasticnet is loaded.

When method is en, library elasticnet is loaded.

When method is ridge, library MASS is loaded.

When method is rF, library randomForest is loaded.

When method is pls, library pls is loaded.

When method is ALridge, libraries MASS and elasticnet are loaded.

When method is ALpls, libraries pls and elasticnet are loaded.

When method is exhaustive, library gtools is loaded.
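Since each method loads a different library, it can help to check beforehand which packages a given choice of method needs. The sketch below transcribes the list above into a lookup table; required_packages and missing_packages are hypothetical helpers, not part of LINselect.

```r
# Packages used by each method, transcribed from the Note above.
method_packages <- list(
  lasso      = "elasticnet",
  en         = "elasticnet",
  ridge      = "MASS",
  rF         = "randomForest",
  pls        = "pls",
  ALridge    = c("MASS", "elasticnet"),
  ALpls      = c("pls", "elasticnet"),
  exhaustive = "gtools"
)

# Hypothetical helper: packages needed for a vector of method names.
required_packages <- function(method) {
  unique(unlist(method_packages[method], use.names = FALSE))
}

# Hypothetical helper: the needed packages that are not installed.
missing_packages <- function(method) {
  needed <- required_packages(method)
  installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
  needed[!installed]
}

required_packages(c("ALridge", "lasso"))  # "MASS" "elasticnet"
```

Calling missing_packages(c("lasso", "rF")) before VARselect returns the names of any packages still to be installed, or an empty character vector if none are missing.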

Author(s)

Yannick Baraud, Christophe Giraud, Sylvie Huet

References

Baraud et al., 2010, http://hal.archives-ouvertes.fr/hal-00502156/fr/
Giraud et al., 2013, http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.ss/1356098553

Examples

#source("charge.R")
library("LINselect")

# simulate data with
# beta=c(rep(2.5,5),rep(1.5,5),rep(0.5,5),rep(0,p-15))
ex <- simulData(p=100,n=100,r=0.8,rSN=5)

## Not run: ex1.VARselect <- VARselect(ex$Y,ex$X,exhaustive.dmax=2)

## Not run: data(diabetes)
## Not run: attach(diabetes)
## Not run: ex.diab <- VARselect(y,x2,exhaustive.dmax=5)
## Not run: detach(diabetes)

LINselect documentation built on Jan. 10, 2020, 9:08 a.m.