screening: An efficient variable screening method

View source: R/screening.r

Description

This function implements four screening methods (SIS, HOLP, RRCS and forward regression) for linear models, and three of them (all except RRCS) for generalized linear models.
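To make the idea of a screening score concrete, here is a minimal base-R sketch of two of the methods named above for a linear model: SIS ranks predictors by absolute marginal correlation with the response, and HOLP ranks them by the absolute entries of X^T (X X^T)^{-1} y. This is an illustration under simplifying assumptions, not the package's internal code; the simulated data and all variable names are hypothetical.

```r
# Sketch only: base-R versions of two screening scores for a linear model.
# These are not the functions used inside the package.
set.seed(1)
n <- 50; p <- 200
x <- matrix(rnorm(n * p), n, p)                 # mean-zero predictors
y <- drop(x[, 1:3] %*% c(3, -3, 3)) + rnorm(n)  # signal in columns 1, 2, 3
yc <- y - mean(y)                               # centred response

# SIS: score each predictor by its absolute marginal correlation with y
sis_score <- abs(drop(cor(x, y)))

# HOLP: beta = X^T (X X^T)^{-1} y, usable when p > n.
# x is left uncentred here: centring the columns would make X X^T singular,
# and a ridge term would then be needed.
holp_beta  <- drop(crossprod(x, solve(tcrossprod(x), yc)))
holp_score <- abs(holp_beta)

keep <- floor(n / 2)                            # mirrors the num.select default
sis_keep  <- order(sis_score,  decreasing = TRUE)[seq_len(keep)]
holp_keep <- order(holp_score, decreasing = TRUE)[seq_len(keep)]
```

Both methods reduce to "compute one score per column, keep the largest"; they differ only in how the score accounts for correlation between predictors.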

Usage

screening(x, y, method = "holp", num.select = floor(dim(x)[1]/2),
  family = "gaussian", ebic = FALSE, ebic.gamma = 1)

Arguments

x

the predictor variables; each row corresponds to an observation. Must be a numeric matrix, not a data.frame.

y

the response vector, with one entry per observation (row of x).

method

the screening method to use. Choices are "sis", "holp", "rrcs" and "forward". Defaults to "holp".

num.select

the number of variables to keep after screening. Defaults to half of the sample size, rounded down. Ignored if ebic is TRUE.

family

the model type. The choices are the same as in glmnet. Defaults to "gaussian".

ebic

indicates whether the extended BIC (EBIC) should be used to determine the number of variables to keep. If ebic is TRUE, the algorithm uses EBIC to terminate the screening procedure and num.select is ignored.

ebic.gamma

tuning parameter for the extended BIC (between 0 and 1). ebic.gamma = 0 corresponds to the usual BIC. Defaults to 1.
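To illustrate the role of ebic.gamma, the sketch below computes one common form of the extended BIC for a Gaussian linear model, n log(RSS/n) + k log(n) + 2 gamma log choose(p, k), where k is the number of selected variables. The helper `ebic_gaussian` is hypothetical, and the exact penalty used inside the package may differ.

```r
# Sketch: EBIC for a Gaussian linear model fitted on a candidate subset.
# `ebic_gaussian` is a hypothetical helper, not a function of the package.
ebic_gaussian <- function(x, y, subset, gamma = 1) {
  n <- nrow(x); p <- ncol(x); k <- length(subset)
  # Least-squares fit with an intercept on the chosen columns
  fit <- lm.fit(cbind(1, x[, subset, drop = FALSE]), y)
  rss <- sum(fit$residuals^2)
  # BIC part plus an extra penalty for the size of the model space
  n * log(rss / n) + k * log(n) + 2 * gamma * lchoose(p, k)
}

set.seed(1)
n <- 60; p <- 30
x <- matrix(rnorm(n * p), n, p)
y <- drop(x[, 1:2] %*% c(2, -2)) + rnorm(n)

ebic_gaussian(x, y, subset = 1:2,  gamma = 1)  # true support
ebic_gaussian(x, y, subset = 1:10, gamma = 1)  # larger candidate set
ebic_gaussian(x, y, subset = 1:2,  gamma = 0)  # gamma = 0: ordinary BIC
```

Larger values of gamma penalize model size more heavily when p is large, so a procedure stopped by this criterion tends to keep fewer variables.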

Value

a list with two components, "screen" and "method". "screen" contains the indexes of the selected variables and "method" names the screening method that was used.
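As a sketch of how the returned list might be used, the code below converts a data.frame of predictors to a numeric matrix and then keeps only the selected columns. The data, the column names and the list `res` are invented for illustration; `res` is a hand-written stand-in with the documented shape, not real output of screening().

```r
# Predictors stored as a data.frame have to be converted to a numeric matrix.
df <- data.frame(age    = c(31, 45, 27, 52),
                 dose   = c(0.5, 1.0, 0.5, 2.0),
                 weight = c(70, 82, 64, 91))
x <- data.matrix(df)   # numeric matrix; column names are preserved

# Hypothetical stand-in for the value of screening(): a list with the
# components "screen" (column indexes) and "method". Values are made up.
res <- list(screen = c(1, 3), method = "holp")

# Keep only the screened columns; drop = FALSE keeps the result a matrix
x_reduced <- x[, res$screen, drop = FALSE]
colnames(x_reduced)
```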

References

Fan, Jianqing, and Jinchi Lv. "Sure independence screening for ultrahigh dimensional feature space." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 70.5 (2008): 849-911.

Wang, Xiangyu, and Chenlei Leng. "High-dimensional ordinary least-squares projection for screening variables." arXiv preprint arXiv:1506.01782 (2015).

Li, Gaorong, et al. "Robust rank correlation based screening." The Annals of Statistics 40.3 (2012): 1846-1877.

Wang, Hansheng. "Forward regression for ultra-high dimensional variable screening." Journal of the American Statistical Association 104.488 (2009): 1512-1524.

Examples

There is one unit test function and two integrated test functions. The two integrated functions test a linear model and a logistic model, respectively. The user specifies the sample size, the dimension and the true indexes. Each function generates simulated data and coefficients and prints the screening results for all methods.

linearModelTest(n = 50, p = 100, beta.not.null = c(1, 2, 3), num.select = 20)
logisticTest(n = 50, p = 100, beta.not.null = c(1, 2, 3), num.select = 20)
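For readers who want to see what such a simulation could look like, here is a base-R sketch that generates data with a known set of nonzero coefficients for a linear or a logistic model and checks which true indexes survive a simple marginal-correlation ranking. The helper `simulate_data`, the coefficient value 3 and the ranking rule are assumptions made for illustration; the package's own test functions may generate data differently.

```r
# Sketch: simulate data with a known support, then screen by marginal correlation.
# `simulate_data` is a hypothetical helper, not part of the package.
simulate_data <- function(n, p, beta.not.null,
                          family = c("gaussian", "binomial")) {
  family <- match.arg(family)
  x <- matrix(rnorm(n * p), n, p)
  beta <- numeric(p)
  beta[beta.not.null] <- 3          # arbitrary signal strength
  eta <- drop(x %*% beta)           # linear predictor
  y <- if (family == "gaussian") eta + rnorm(n) else rbinom(n, 1, plogis(eta))
  list(x = x, y = y, beta = beta)
}

set.seed(42)
sim <- simulate_data(n = 50, p = 100, beta.not.null = c(1, 2, 3))

# Marginal-correlation ranking as a simple stand-in for a screening method
score <- abs(drop(cor(sim$x, sim$y)))
top <- order(score, decreasing = TRUE)[1:20]
intersect(top, c(1, 2, 3))          # true indexes that survived screening

# The same helper with a binary response
sim_logit <- simulate_data(n = 50, p = 100, beta.not.null = c(1, 2, 3),
                           family = "binomial")
table(sim_logit$y)
```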

wwrechard/screening documentation built on May 4, 2019, 12:04 p.m.