Performs repeated variable selection via the lasso on random sample splits.

```
multisplit(x, y, covar = NULL, B = 50)
```

| Argument | Description |
|----------|-------------|
| `x` | The SNP data matrix, with one row per observation. |
| `y` | The response vector. It can be continuous or discrete. |
| `covar` | `NULL`, or a matrix of covariates to control for, with one row per observation. |
| `B` | The number of random splits. Default value is 50. |

The samples are divided into two random subsamples of approximately
equal size. The first subsample is used for variable selection, which is
implemented using glmnet: the first `[nobs/6]` variables to enter the
lasso path are selected. The procedure is repeated `B` times.
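The splitting scheme described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy data, the names `out_sample` and `sel_vars`, and the reading of `[.]` as `floor()` are all hypothetical, and the lasso step is replaced by a simple marginal-correlation ranking so that the sketch runs without glmnet.

```r
# Sketch of repeated random sample splitting (base R only).
set.seed(1)                      # reproducible splits

nobs <- 120                      # hypothetical number of samples
nvar <- 500                      # hypothetical number of SNPs
B    <- 5                        # number of random splits (small, for illustration)

# Toy genotype matrix coded 0/1/2 and a toy continuous response.
x <- matrix(rbinom(nobs * nvar, size = 2, prob = 0.3), nrow = nobs)
y <- rnorm(nobs)

n_sel <- floor(nobs / 6)         # variables kept per split, assuming [.] means floor
n_out <- floor(nobs / 2)         # size of the second (held-out) subsample

out_sample <- matrix(NA_integer_, nrow = B, ncol = n_out)
sel_vars   <- matrix(NA_integer_, nrow = B, ncol = n_sel)

for (b in seq_len(B)) {
  second <- sort(sample.int(nobs, n_out))     # indices of the second subsample
  first  <- setdiff(seq_len(nobs), second)    # remaining samples drive selection

  # Placeholder for the lasso step: rank variables by absolute marginal
  # correlation with y on the first subsample. This is NOT the lasso path.
  score <- suppressWarnings(abs(drop(cor(x[first, , drop = FALSE], y[first]))))
  score[is.na(score)] <- 0                    # constant columns get the lowest rank

  out_sample[b, ] <- second
  sel_vars[b, ]   <- order(score, decreasing = TRUE)[seq_len(n_sel)]
}

dim(out_sample)   # B rows, one column per held-out sample
dim(sel_vars)     # B rows, one column per selected variable
```

Each row of the two matrices corresponds to one split, which mirrors the `B x [nobs/2]` and `B x [nobs/6]` shapes of the returned components.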

If one or more covariates are specified, these will be added unpenalized to the regression.
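One way to include covariates without penalizing them in glmnet is its `penalty.factor` argument, where a factor of 0 exempts a column from shrinkage. The sketch below assumes this mechanism purely for illustration; whether `multisplit` does it exactly this way is not stated here, and the toy data and variable names are hypothetical.

```r
library(glmnet)

set.seed(1)
nobs <- 100
nvar <- 50

x     <- matrix(rnorm(nobs * nvar), nrow = nobs)   # toy predictors
covar <- matrix(rnorm(nobs * 2), nrow = nobs)      # two toy covariates
y     <- drop(covar %*% c(1, -1)) + x[, 1] + rnorm(nobs)

# Put the covariates first, then give them a penalty factor of 0
# (no shrinkage) and every other column a factor of 1.
xx <- cbind(covar, x)
pf <- c(rep(0, ncol(covar)), rep(1, ncol(x)))

fit <- glmnet(xx, y, penalty.factor = pf)

# Coefficients of the covariates along the lambda path.
beta_cov  <- as.matrix(fit$beta[seq_len(ncol(covar)), , drop = FALSE])
always_in <- apply(beta_cov != 0, 1, all)   # TRUE if nonzero at every lambda
always_in
```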

The return value is a data frame with 2 components: a matrix of size
`B x [nobs/2]` containing the second subsample of each split, and a
matrix of size `B x [nobs/6]` containing the variables selected in each
split.

Meinshausen, N., Meier, L. and Bühlmann, P. (2009), P-values for high-dimensional regression, Journal of the American Statistical Association 104, 1671-1681.
