Perform cross-validation for covariance-regularized regression, aka the Scout.

Description

This function returns cross-validation error rates for a range of lambda1 and lambda2 values, and also makes beautiful CV plots if plot=TRUE.

Usage

cv.scout(x, y, K= 10,
  lam1s=seq(0.001,.2,len=10),lam2s=seq(0.001,.2,len=10),p1=2,p2=1,
  trace = TRUE, plot=TRUE,plotSE=FALSE,rescale=TRUE,...)

Arguments

x

A matrix of predictors, where the rows are the samples and the columns are the predictors

y

A vector (or one-column matrix) of outcomes, where length(y) should equal nrow(x)

K

Number of cross-validation folds to be performed; default is 10

lam1s

The (vector of) tuning parameters for regularization of the covariance matrix. Can be NULL if p1=NULL, since then no covariance regularization takes place. If p1=1 and nrow(x)<ncol(x), then no value in lam1s should be smaller than 1e-3, because this will cause graphical lasso to take too long. Also, if ncol(x)>500, we do not recommend using p1=1, as graphical lasso can be uncomfortably slow.

lam2s

The (vector of) tuning parameters for the $L_1$ regularization of the regression coefficients, using the regularized covariance matrix. Can be NULL if p2=NULL. (If p2=NULL, then non-zero lam2s have no effect.) A value of 0 will result in no regularization.

p1

The $L_p$ penalty for the covariance regularization. Must be one of 1, 2, or NULL. NULL corresponds to no covariance regularization.

p2

The $L_p$ penalty for the estimation of the regression coefficients based on the regularized covariance matrix. Must be one of 1 (for $L_1$ regularization) or NULL (for no regularization).

trace

Print out progress as we go? Default is TRUE.

plot

If TRUE (by default), makes beautiful CV plots.

plotSE

Should those beautiful CV plots also display standard error bars for the CV? Default is FALSE.

rescale

Should the coefficients be rescaled in order to avoid over-shrinkage? Default is TRUE.

...

Additional parameters
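The constraints on lam1s described above can be checked before starting a long cross-validation run. The following is a minimal sketch in base R; `check_lam1s` is a hypothetical helper written for illustration and is not part of the scout package.

```r
# Hypothetical helper (NOT part of the scout package): warns when the
# lam1s grid violates the recommendations given for p1 = 1.
check_lam1s <- function(x, lam1s, p1) {
  # With p1 = NULL there is no covariance regularization, so nothing to check.
  if (is.null(p1) || is.null(lam1s)) return(invisible(TRUE))
  if (p1 == 1 && nrow(x) < ncol(x) && any(lam1s < 1e-3)) {
    warning("p1 = 1 with nrow(x) < ncol(x): values in lam1s below 1e-3 ",
            "may make graphical lasso very slow")
    return(invisible(FALSE))
  }
  if (p1 == 1 && ncol(x) > 500) {
    warning("p1 = 1 with more than 500 predictors: graphical lasso may be slow")
  }
  invisible(TRUE)
}

x <- matrix(0, nrow = 5, ncol = 20)         # fewer samples than predictors
lam1s <- seq(0.001, 0.2, length.out = 10)   # the default grid from Usage
ok <- check_lam1s(x, lam1s, p1 = 1)
```

The helper only reports problems; it leaves the choice of grid to the caller.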

Details

Pass in a data matrix x and a vector of outcomes y; the function performs K-fold (by default 10-fold) cross-validation over a range of lambda1 and lambda2 values. By default, Scout(2,1) is performed.
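As an illustration of a typical call, the sketch below runs cv.scout on simulated data. The data, the reduced grids, and the choice K = 5 are assumptions made to keep the run short; the argument names and the returned components are those listed on this page. The call is guarded so that the code is skipped when the scout package is not installed.

```r
# Simulated data: 50 samples, 10 predictors (purely illustrative).
set.seed(1)
n <- 50
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta <- c(2, -1, rep(0, p - 2))           # only two predictors matter
y <- as.numeric(x %*% beta + rnorm(n))

# Run the cross-validation only if the scout package is available.
if (requireNamespace("scout", quietly = TRUE)) {
  cv.out <- scout::cv.scout(x, y, K = 5,
                            lam1s = seq(0.001, 0.2, length.out = 5),
                            lam2s = seq(0.001, 0.2, length.out = 5),
                            p1 = 2, p2 = 1,
                            trace = FALSE, plot = FALSE)
  print(cv.out$bestlam1)   # selected covariance tuning parameter
  print(cv.out$bestlam2)   # selected coefficient tuning parameter
}
```

Setting trace = FALSE and plot = FALSE keeps the call quiet, which is convenient in scripts; leave plot = TRUE to see the CV curves.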

Value

folds

The indices of the members of the K test sets are returned.

cv

A matrix of average cross-validation errors is returned.

cv.error

A matrix containing the standard errors of the elements in "cv", the matrix of average cross-validation errors.

bestlam1

Best value of lam1 found via cross-validation.

bestlam2

Best value of lam2 found via cross-validation.

lam1s

Values of lam1 considered.

lam2s

Values of lam2 considered.
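To make the relationship between these components concrete, here is a minimal base-R sketch of how fold indices, a matrix of average errors, its standard errors, and the best tuning parameters could be derived. It is not the package's implementation: the per-fold errors are random stand-ins, the orientation of the matrix (rows for lam1s, columns for lam2s) is an assumption, and the standard error is taken as the across-fold standard deviation divided by sqrt(K).

```r
set.seed(1)
n <- 20
K <- 5
lam1s <- seq(0.001, 0.2, length.out = 3)
lam2s <- seq(0.001, 0.2, length.out = 4)

# Randomly assign each sample to one of K test sets (analogous to "folds").
fold_id <- sample(rep(seq_len(K), length.out = n))
folds <- split(seq_len(n), fold_id)

# Stand-in per-fold errors indexed by (lam1, lam2, fold); random numbers
# replace real prediction errors in this sketch.
fold_err <- array(runif(length(lam1s) * length(lam2s) * K),
                  dim = c(length(lam1s), length(lam2s), K))

# Average over folds (analogous to "cv") and the standard error of that
# average (analogous to "cv.error").
cv <- apply(fold_err, c(1, 2), mean)
cv_se <- apply(fold_err, c(1, 2), sd) / sqrt(K)

# Locate the grid cell with the smallest average error.
best <- which(cv == min(cv), arr.ind = TRUE)[1, ]
bestlam1 <- lam1s[best[["row"]]]
bestlam2 <- lam2s[best[["col"]]]
```

Taking the first minimizing cell resolves ties arbitrarily; the package may break ties differently.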

Author(s)

Daniela M. Witten and Robert Tibshirani

References

Witten, DM and Tibshirani, R (2009) Covariance-regularized regression and classification for high-dimensional problems. Journal of the Royal Statistical Society, Series B 71(3): 615-636. <http://www-stat.stanford.edu/~dwitten>

See Also

scout, predict.scoutobject

Examples

library(scout)  # provides cv.scout
library(lars)   # provides the diabetes data and cv.lars
data(diabetes)
attach(diabetes)
par(mfrow=c(2,1))
par(mar=c(2,2,2,2))
## Not run: cv.sc <- cv.scout(x2,y,p1=2,p2=1)
## Not run: print(cv.sc)
## Not run: cv.la <- cv.lars(x2,y)
## Not run: print(c("Lars minimum CV is ", min(cv.la$cv)))
## Not run: print(c("Scout(2,1) minimum CV is ", min(cv.sc$cv)))
detach(diabetes)