RaSE is a general framework for variable screening. In RaSE screening, to select each of the B1 subspaces, B2 random subspaces are generated and the optimal one is chosen according to some criterion. The proportions (equivalently, percentages) of the B1 selected subspaces in which each variable appears are then used as an importance measure to rank the variables.
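The selection-proportion idea described above can be sketched in base R. This is only a toy illustration under stated assumptions, not the package's implementation: the helper name toy_rase_screen is hypothetical, subspace sizes and features are drawn uniformly, the base model is a linear model fitted with lm, and the criterion is BIC.

```r
# Toy illustration of RaSE-style screening (base R only).
# NOT the package implementation; toy_rase_screen and its defaults are hypothetical.
toy_rase_screen <- function(x, y, B1 = 50, B2 = 20, D = 3) {
  p <- ncol(x)
  counts <- numeric(p)
  for (b1 in seq_len(B1)) {
    best_bic <- Inf
    best_subspace <- integer(0)
    for (b2 in seq_len(B2)) {
      d <- sample.int(D, 1)          # subspace size drawn uniformly from 1..D
      subspace <- sample.int(p, d)   # d distinct features drawn uniformly
      fit <- lm(y ~ x[, subspace, drop = FALSE])
      bic <- BIC(fit)
      if (bic < best_bic) {          # keep the candidate with the smallest BIC
        best_bic <- bic
        best_subspace <- subspace
      }
    }
    counts[best_subspace] <- counts[best_subspace] + 1
  }
  counts / B1  # proportion of the B1 selected subspaces containing each feature
}

set.seed(1)
n <- 100; p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)  # only features 1 and 2 carry signal
perc <- toy_rase_screen(x, y)
order(perc, decreasing = TRUE)  # features ranked by selected proportion
```

Each of the B1 rounds contributes one selected subspace, so the returned proportions lie between 0 and 1 and sum to a value between 1 and D.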
xtrain 
n * p observation matrix. n observations, p features. 
ytrain 
n 0/1 observations. 
xval 
observation matrix for validation. Default = 
yval 
0/1 observation for validation. Default = 
B1 
the number of weak learners. Default = 200. 
B2 
the number of subspace candidates generated for each weak learner. Default = 
D 
the maximal subspace size when generating random subspaces. Default = 
dist 
the distribution for features when generating random subspaces. Default = 
model 
the model to use. Default = 'lda' when

criterion 
the criterion to choose the best subspace. Default = 'ric' when

k 
the number of nearest neighbors considered when 
cores 
the number of cores used for parallel computing. Default = 1. 
seed 
the random seed assigned at the start of the algorithm, which can be a real number or 
iteration 
the number of iterations. Default = 0. 
cv 
the number of cross-validations used. Default = 5. Only useful when 
scale 
whether to normalize the data. Logical, default = FALSE. 
C0 
a positive constant used when 
kl.k 
the number of nearest neighbors used to estimate RIC in a nonparametric way. Default = 
classification 
the indicator of the problem type, which can be TRUE, FALSE or 
... 
additional arguments. 
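For orientation on the 'bic' and 'ebic' values of the criterion argument, the sketch below computes BIC and an extended BIC for a single candidate subspace in base R. It assumes the extended BIC takes the form given in Chen and Chen (2008), that is, BIC plus 2 * gam * log(choose(p, k)) for a subspace of k out of p features. Whether RaScreen uses exactly this form is an assumption here, and toy_ebic is a hypothetical helper, not a package function.

```r
set.seed(3)
n <- 200; p <- 50
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

# One candidate subspace and a logistic regression fitted on it
subspace <- c(1, 2, 7)
fit <- glm(y ~ x[, subspace, drop = FALSE], family = binomial)

# Hypothetical helper: BIC plus the extended-BIC penalty of Chen and Chen (2008).
# lchoose(p, k) is log(choose(p, k)), computed without overflow.
toy_ebic <- function(fit, p, k, gam) {
  BIC(fit) + 2 * gam * lchoose(p, k)
}

bic <- BIC(fit)
ebic <- toy_ebic(fit, p = p, k = length(subspace), gam = 0.5)
c(bic = bic, ebic = ebic)
```

With gam = 0 the extended criterion reduces to the ordinary BIC; larger values of gam penalize subspace sizes for which many candidate subspaces exist.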
A list including the following items.
model 
the model used in RaSE screening. 
criterion 
the criterion to choose the best subspace for each weak learner. 
B1 
the number of selected subspaces. 
B2 
the number of subspace candidates generated for each of B1 subspaces. 
n 
the sample size. 
p 
the dimension of data. 
D 
the maximal subspace size when generating random subspaces. 
iteration 
the number of iterations. 
selected.perc 
A list of length ( 
scale 
a list of scaling parameters, including the scaling center and the scale parameter for each feature. Equals to 
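As a sketch of how a stored scaling center and scale parameter might be reused, the base R snippet below standardizes a training matrix and applies the same parameters to new observations. The list element names center and scale are assumptions for illustration and may not match the names in the returned object.

```r
set.seed(2)
x_train <- matrix(rnorm(20, mean = 5, sd = 3), nrow = 10)
x_new <- matrix(rnorm(6, mean = 5, sd = 3), nrow = 3)

# Standardize the training data and keep the per-feature parameters
x_scaled <- scale(x_train)
scale_par <- list(center = attr(x_scaled, "scaled:center"),
                  scale = attr(x_scaled, "scaled:scale"))

# Apply the training center and scale to new data (no re-estimation)
x_new_scaled <- scale(x_new, center = scale_par$center, scale = scale_par$scale)
```

Reusing the training parameters keeps new or validation data on the same scale as the data the screening was run on.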
Tian, Y. and Feng, Y., 2021(a). RaSE: A variable screening framework via random subspace ensembles. Journal of the American Statistical Association, (just-accepted), pp.1-30.
Tian, Y. and Feng, Y., 2021(b). RaSE: Random subspace ensemble classification. Journal of Machine Learning Research, 22(45), pp.1-93.
Chen, J. and Chen, Z., 2008. Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95(3), pp.759-771.
Chen, J. and Chen, Z., 2012. Extended BIC for small-n-large-P sparse GLM. Statistica Sinica, pp.555-574.
Schwarz, G., 1978. Estimating the dimension of a model. The Annals of Statistics, 6(2), pp.461-464.
set.seed(0, kind = "L'Ecuyer-CMRG")
train.data <- RaModel("screening", 1, n = 100, p = 100)
xtrain <- train.data$x
ytrain <- train.data$y
# test RaSE screening with linear regression model and BIC
fit <- RaScreen(xtrain, ytrain, B1 = 100, B2 = 50, iteration = 0, model = 'lm',
                cores = 2, criterion = 'bic')
# Select D variables
RaRank(fit, selected.num = "D")
## Not run:
# test RaSE screening with knn model and 5-fold cross-validation MSE
fit <- RaScreen(xtrain, ytrain, B1 = 100, B2 = 50, iteration = 0, model = 'knn',
                cores = 2, criterion = 'cv', cv = 5)
# Select n/logn variables
RaRank(fit, selected.num = "n/logn")
# test RaSE screening with SVM and 5-fold cross-validation MSE
fit <- RaScreen(xtrain, ytrain, B1 = 100, B2 = 50, iteration = 0, model = 'svm',
                cores = 2, criterion = 'cv', cv = 5)
# Select n/logn variables
RaRank(fit, selected.num = "n/logn")
# test RaSE screening with logistic regression model and eBIC (gam = 0.5). Set iteration number = 1
train.data <- RaModel("screening", 6, n = 100, p = 100)
xtrain <- train.data$x
ytrain <- train.data$y
fit <- RaScreen(xtrain, ytrain, B1 = 100, B2 = 100, iteration = 1, model = 'logistic',
                cores = 2, criterion = 'ebic', gam = 0.5)
# Select n/logn variables from the selected percentage after one iteration round
RaRank(fit, selected.num = "n/logn", iteration = 1)
## End(Not run)
