SIS.selection: Sure Independence Screening


Description

Performs Sure Independence Screening (SIS) to select relevant gene expression variables. SIS ranks features by the magnitude of their marginal regression coefficients.

Usage

SIS.selection(X,Y, pred, scale = F)

Arguments

X

a data matrix (n x p) of gene expression values. NA and Inf values are not allowed. Each row corresponds to an observation and each column to a gene.

Y

a vector of length n giving the classes of the n observations. The classes must be coded as 1 or 0.

pred

number of relevant variables to select; pred must be lower than p.

scale

If scale=TRUE, X will be scaled.
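
The input requirements listed above can be checked before calling the function. The following is a minimal sketch in base R; the toy matrix, the class labels "tumor" and "normal", and the value of pred are hypothetical and serve only as an illustration.

```r
# Hypothetical toy inputs; names and values are for illustration only
X <- matrix(c(1.2, 0.4, 3.1, 2.2, 0.9, 1.7, 2.8, 0.3), nrow = 4)
labels <- factor(c("tumor", "normal", "tumor", "normal"))
Y <- as.numeric(labels == "tumor")   # recode the two classes as 1 and 0
pred <- 1

stopifnot(
  is.matrix(X),
  all(is.finite(X)),        # rejects NA, NaN, Inf and -Inf
  length(Y) == nrow(X),     # one class label per observation
  all(Y %in% c(0, 1)),      # classes coded as 1 or 0
  pred < ncol(X)            # pred has to be lower than p
)
```

If any condition fails, stopifnot() stops with an error naming the violated check.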

Details

Sure Independence Screening (SIS) selects the pred most relevant gene expression variables, with pred < p. SIS ranks features by marginal utility: each feature is used independently as a predictor to assess its usefulness for predicting the response. Specifically, SIS ranks features by the magnitude of their marginal regression coefficients.
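
The ranking idea can be sketched in a few lines of base R. This is an illustration only, not the package's implementation: sis_sketch is a hypothetical helper, and it assumes the marginal coefficients come from componentwise least-squares regression on standardized columns, which may differ from what SIS.selection computes internally.

```r
# Hypothetical helper, not part of lsplsGlm: keep the pred columns of X
# with the largest absolute marginal regression coefficient.
sis_sketch <- function(X, Y, pred) {
  stopifnot(is.matrix(X), length(Y) == nrow(X), pred < ncol(X))
  Xs <- scale(X)  # standardize columns so that coefficients are comparable
  # For standardized columns, the marginal least-squares slope of Y on
  # column j is proportional to sum(Xs[, j] * (Y - mean(Y))).
  omega <- drop(crossprod(Xs, Y - mean(Y)))
  keep <- order(abs(omega), decreasing = TRUE)[seq_len(pred)]
  X[, keep, drop = FALSE]  # original (unscaled) columns, all observations
}

# Simulated data: 40 observations, 200 features, binary response
set.seed(1)
n <- 40
p <- 200
X <- matrix(rnorm(n * p), n, p)
Y <- as.numeric(X[, 3] + rnorm(n, sd = 0.1) > 0)

Xsel <- sis_sketch(X, Y, 10)
dim(Xsel)
```

Each feature is scored on its own, so the cost grows linearly with p, which is what makes this kind of screening practical when p is much larger than n.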

Value

Returns a matrix (n x pred) containing all the observations and only the pred most relevant genes.

Author(s)

Caroline Bazzoli, Thomas Bouleau, Sophie Lambert-Lacroix

References

Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional feature space. Journal of the Royal Statistical Society: Series B, 70(5), 849-911.

Examples

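library(lsplsGlm)

# Load the example data and standardize the gene expression matrix
data("BreastCancer")
X <- scale(BreastCancer$X)
Y <- BreastCancer$Y

# Select the 50 most relevant genes with SIS
Xsis <- SIS.selection(X, Y, 50)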

lsplsGlm documentation built on May 2, 2019, 12:36 p.m.