# Variable Selection on Random Sample Splits

### Description

Performs repeated variable selection via the lasso on random sample splits.

### Usage

```
multisplit(x, y, covar = NULL, B = 50)
```

### Arguments

`x`
The SNP data matrix, of size

`y`
The response vector. It can be continuous or discrete.

`covar`
`NULL`, or the matrix of covariates one wishes to control for, of size

`B`
The number of random splits. The default is 50.

### Details

The samples are divided at random into two subsamples of approximately
equal size. The first subsample is used for variable selection, which is
implemented using glmnet: the first `[nobs/6]` variables that enter the
lasso path are selected. The procedure is repeated `B` times.
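A single split of the procedure described above can be sketched as follows. This is a minimal illustration on simulated data, not the package's implementation: the helper name `select_one_split`, the simulated `x` and `y`, the seed, and the way the entry order is read off the glmnet path (ties broken by column order) are all assumptions.

```r
library(glmnet)

# Hypothetical helper: one random split followed by lasso selection.
select_one_split <- function(x, y) {
  nobs <- nrow(x)
  n_sel <- floor(nobs / 6)                   # number of variables to keep
  in_idx <- sample(nobs, floor(nobs / 2))    # first subsample (selection)
  out_idx <- setdiff(seq_len(nobs), in_idx)  # second subsample (held out)

  fit <- glmnet(x[in_idx, ], y[in_idx])      # lasso path, Gaussian family
  nonzero <- as.matrix(fit$beta) != 0        # variables x lambda values
  # Index of the first lambda at which each variable becomes active.
  entry <- apply(nonzero, 1, function(z) if (any(z)) which(z)[1] else Inf)
  ord <- order(entry)
  ord <- ord[is.finite(entry[ord])]          # drop variables that never enter
  list(out_sample = out_idx, selected = head(ord, n_sel))
}

set.seed(1)
x <- matrix(rnorm(60 * 100), nrow = 60)      # simulated predictors
y <- x[, 1] - x[, 2] + rnorm(60)             # simulated continuous response
one_split <- select_one_split(x, y)
```

Calling such a helper `B` times and stacking the results row-wise would give one row per split. For a discrete response, a different `family` argument (for example `"binomial"`) would presumably be passed to `glmnet`.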

If one or more covariates are specified, these will be added unpenalized to the regression.
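One way to add covariates unpenalized with glmnet is the `penalty.factor` argument, which scales the penalty per column. The sketch below is an assumption about how this could be done, not a description of the package internals; the simulated data and variable names are illustrative.

```r
library(glmnet)

set.seed(2)
nobs <- 60
x <- matrix(rnorm(nobs * 100), nrow = nobs)    # simulated predictors
covar <- matrix(rnorm(nobs * 2), nrow = nobs)  # simulated covariates
y <- x[, 1] + covar[, 1] + rnorm(nobs)

# Append the covariates and give them a penalty factor of 0 so that
# they are not shrunk; the original predictors keep a factor of 1.
xx <- cbind(x, covar)
pf <- c(rep(1, ncol(x)), rep(0, ncol(covar)))
fit <- glmnet(xx, y, penalty.factor = pf)

# Coefficients of the covariate columns along the lasso path.
covar_rows <- ncol(x) + seq_len(ncol(covar))
covar_beta <- as.matrix(fit$beta)[covar_rows, , drop = FALSE]
```

Because the covariate columns carry no penalty, they should not be counted among the variables that "enter" the path when ranking the penalized predictors.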

### Value

A data frame with 2 components: a matrix of size `B x [nobs/2]`
containing the second subsample of each split, and a matrix of size
`B x [nobs/6]` containing the selected variables in each split.

### References

Meinshausen, N., Meier, L. and Bühlmann, P. (2009), P-values for high-dimensional regression, Journal of the American Statistical Association 104, 1671-1681.

### Examples

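A call might look like the following sketch. The simulated data, the 0/1/2 genotype coding, the dimensions, and the seed are assumptions made for illustration; the call itself uses only the arguments listed under Usage and is guarded so that it runs only when the package providing `multisplit` is attached.

```r
set.seed(3)
nobs <- 60
nvar <- 200

# Simulated genotypes coded as 0/1/2 minor-allele counts (an assumption).
x <- matrix(rbinom(nobs * nvar, size = 2, prob = 0.3), nrow = nobs)
beta <- numeric(nvar)
beta[c(3, 5, 9)] <- 1                     # three SNPs with a true effect
y <- as.vector(x %*% beta + rnorm(nobs))  # continuous response

# Guarded call, so the snippet also runs when the package is not attached.
if (exists("multisplit")) {
  res <- multisplit(x, y, covar = NULL, B = 50)
}
```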