stabilityselection: Score variable importance with stability selection


Description

Score the importance of covariates for predicting an output by combining LARS sparse regression with stability selection. Given a matrix of covariates and a vector of outputs to predict, stability selection repeatedly solves a sparse regression problem (here with the LARS method) on a randomly modified covariate matrix (here the rows are subsampled and the columns are randomly reweighted). The stability selection (SS) score then evaluates the importance of each covariate based on how often it is selected. Two SS scores are implemented: the original one of Meinshausen and Buhlmann, which measures the frequency of selection among the top L covariates, and the area score of Haury and Vert, which combines the frequency of selection among the top L covariates for different values of L.
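The randomization described above can be illustrated in base R. This is only a sketch of a single round, not the package source: the variable names (w, xs, i1, i2) and the choice of two equal-sized halves are assumptions made for illustration.

```r
# Sketch of one randomization round (illustrative; not the package source).
set.seed(1)
n <- 100
p <- 40
alpha <- 0.2
x <- matrix(rnorm(n * p), n, p)

# Randomly reweight each column by a factor drawn uniformly from [alpha, 1].
w <- runif(p, min = alpha, max = 1)
xs <- sweep(x, 2, w, "*")

# Randomly split the rows into two disjoint subsamples
# (equal halves are an assumption of this sketch).
perm <- sample(n)
half <- n %/% 2
i1 <- perm[seq_len(half)]
i2 <- perm[(half + 1):n]

# A sparse regression (LARS in this package) would then be run on
# xs[i1, ] and xs[i2, ] separately, recording which covariates are selected.
dim(xs[i1, ])
```

Repeating such rounds many times and counting how often each covariate is selected gives the selection frequencies on which the SS scores are based.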

Usage

stabilityselection(x, y, nsplit = 100, nstepsLARS = 20, alpha = 0.2,
  scoring = "area")

Arguments

x

The input matrix: each row is a sample, each column a feature.

y

A vector of response values, one per sample.

nsplit

The number of splits of the samples into two subsamples (default 100)

nstepsLARS

The maximum number of LARS steps performed at each iteration (default 20)

alpha

The random multiplicative weights of each column are uniformly sampled in the interval [alpha,1] (default 0.2)

scoring

How to score a feature. If "area", we compute the area under the stability curve, as proposed by Haury et al. If "max", we just compute the stability curve, as proposed by Meinshausen and Buhlmann (default "area").
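The difference between the two scoring options can be sketched on a small, made-up matrix of selection frequencies. The matrix freq and the names score_max and score_area are hypothetical, and the sketch assumes the area score is the area under the stability curve up to L, normalized by L; the package's actual computation may differ in detail.

```r
# Hypothetical selection frequencies: rows are L = 1..4 LARS steps,
# columns are 3 covariates; freq[L, j] is the fraction of runs in which
# covariate j was among the first L selected.
freq <- rbind(c(0.6, 0.3, 0.0),
              c(0.9, 0.5, 0.1),
              c(1.0, 0.7, 0.2),
              c(1.0, 0.8, 0.4))

# "max"-style score: the stability curve itself.
score_max <- freq

# "area"-style score (assumed form): running mean of the curve over 1..L,
# i.e. the normalized area under the stability curve up to L.
score_area <- apply(freq, 2, cumsum) / seq_len(nrow(freq))

round(score_area, 3)
```

Because the area score averages over all values up to L, it is less sensitive to the particular choice of L than the frequency at a single L.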

Value

A matrix of SS scores. Each column corresponds to a covariate. Each row corresponds to a number of LARS steps.
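Given a matrix with this layout, a ranking of covariates can be read off one row. The sketch below uses a made-up score matrix s with hypothetical column names, and takes the last row (the largest number of LARS steps) as one reasonable choice; it does not call the package.

```r
# Hypothetical score matrix with the documented layout:
# rows = number of LARS steps, columns = covariates.
s <- rbind(c(0.00, 0.60, 0.30),
           c(0.05, 0.75, 0.40),
           c(0.10, 0.83, 0.50))
colnames(s) <- c("x1", "x2", "x3")

# Scores at the largest number of LARS steps (last row).
final <- s[nrow(s), ]

# Covariates ordered from most to least important.
ranking <- names(sort(final, decreasing = TRUE))
ranking
```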

Examples

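# Simulate n samples with p covariates; only the first 5 have a nonzero effect
n <- 100
p <- 40
x <- matrix(rnorm(n * p), n, p)
beta <- c(rnorm(5), numeric(p - 5))
y <- x %*% beta + rnorm(n)
# Compute area scores for up to 5 LARS steps using 500 random splits
s <- stabilityselection(x, y, nsplit = 500, nstepsLARS = 5)
# One curve per covariate: score as a function of the number of LARS steps
matplot(s, type = 'b', lwd = 2, ylab = "SS area score", xlab = "LARS steps")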

jpvert/tigress documentation built on May 3, 2019, 4:04 p.m.