Performs repeated variable selection via the lasso on random sample splits.
multisplit(x, y, covar = NULL, B = 50)
x
    The SNP data matrix, of size
y
    The response vector. It can be continuous or discrete.
covar
    NULL or the matrix of covariates one wishes to control for, of size
B
    The number of random splits. Default value is 50.
The samples are randomly divided into two subsamples of approximately
equal size. The first subsample is used for variable selection, which is
implemented using glmnet. The first [nobs/6] variables that enter the
lasso path are selected. The procedure is repeated B times.
If one or more covariates are specified, these will be added unpenalized to the regression.
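The procedure described above can be sketched for a single split as follows. This is only an illustration on simulated data, not the package's internal code: it assumes the brackets in [nobs/6] mean rounding down, that "unpenalized" corresponds to a glmnet penalty factor of 0, and that ties between variables entering at the same step are broken by column order. All object names (in_sel, n_keep, entry, selected) are hypothetical.

```r
library(glmnet)

set.seed(1)
nobs <- 60
nvar <- 200
x <- matrix(rbinom(nobs * nvar, 2, 0.3), nobs, nvar)  # simulated SNP counts (0/1/2)
covar <- matrix(rnorm(nobs), nobs, 1)                 # one simulated covariate
y <- x[, 1] - x[, 2] + 0.5 * covar[, 1] + rnorm(nobs)

# Randomly pick the first subsample (used for selection).
in_sel <- sample(nobs, floor(nobs / 2))
n_keep <- floor(nobs / 6)  # assumption: [nobs/6] means floor(nobs / 6)

# Append the covariates to the design and give them penalty factor 0,
# so that they are never shrunk by the lasso.
design <- cbind(x, covar)
pf <- c(rep(1, ncol(x)), rep(0, ncol(covar)))
fit <- glmnet(design[in_sel, ], y[in_sel], penalty.factor = pf)

# For each SNP, find the first step of the lasso path with a nonzero coefficient.
beta <- as.matrix(fit$beta)[seq_len(ncol(x)), , drop = FALSE]
entry <- apply(beta != 0, 1, function(nz) if (any(nz)) which(nz)[1] else Inf)

# Keep the n_keep SNPs that enter earliest, dropping any that never enter.
selected <- head(order(entry), n_keep)
selected <- selected[is.finite(entry[selected])]
selected
```

Repeating this block B times with a fresh random in_sel each time gives the repeated selection that the function performs.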
A data frame with 2 components: a matrix of size B x [nobs/2]
containing the second subsample of each split, and a matrix of size
B x [nobs/6] containing the selected variables in each split.
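To make the stated dimensions concrete, the following base R sketch builds two matrices with the shapes described, using simulated indices only. The names out_sample and sel_coef are hypothetical and are not taken from the package, and the brackets are again assumed to mean rounding down. The selection step itself is left out, so the second matrix stays filled with NA.

```r
set.seed(1)
nobs <- 60  # number of samples (even, so both subsamples have equal size)
B <- 50     # number of random splits

# One row per split: indices of the second subsample, and slots for the
# indices of the selected variables.
out_sample <- matrix(NA_integer_, B, floor(nobs / 2))
sel_coef <- matrix(NA_integer_, B, floor(nobs / 6))

for (b in seq_len(B)) {
  first_half <- sample(nobs, nobs - floor(nobs / 2))
  # The second subsample is whatever was not drawn into the first one.
  out_sample[b, ] <- setdiff(seq_len(nobs), first_half)
}

dim(out_sample)  # 50 30
dim(sel_coef)    # 50 10
```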
Meinshausen, N., Meier, L. and Buhlmann, P. (2009), P-values for high-dimensional regression, Journal of the American Statistical Association 104, 1671-1681.