cv.relaxnet: Cross-Validation for relaxnet Models
In relaxnet: Relaxation of glmnet models (as in relaxed lasso, Meinshausen 2007)

Description Usage Arguments Details Value Note Author(s) References See Also Examples

Both of these functions will perform v-fold cross-validation to select tuning parameters for relaxnet models. cv.relaxnet will cross-validate on the value of lambda for both the main model and for the relaxed models, a two dimensional cross-validation. cv.alpha.relaxnet will in addition cross-validate on the value of alpha. For each value of alpha, relaxnet is run once on whole data set, then it is run again v times on subsets of the rows of the data.

cv.relaxnet(x, y, family = c("gaussian", "binomial"),
            nlambda = 100,
            alpha = 1,
            relax = TRUE,
            relax.nlambda = 100,
            relax.max.vars = min(nrow(x), ncol(x)) * 0.8,
            lambda = NULL,
            relax.lambda.index = NULL,
            relax.lambda.list = NULL,
            nfolds = 10,
            foldid,
            multicore = FALSE,
            mc.cores,
            mc.seed = 123,
            ...)

cv.alpha.relaxnet(x, y, family = c("gaussian", "binomial"),
                  nlambda = 100,
                  alpha = c(.1, .3, .5, .7, .9),
                  relax = TRUE,
                  relax.nlambda = 100,
                  relax.max.vars = min(nrow(x), ncol(x)) * 0.8,
                  lambda = NULL,
                  relax.lambda.index = NULL,
                  relax.lambda.list = NULL,
                  nfolds = 10,
                  foldid,
                  multicore = FALSE,
                  mc.cores,
                  mc.seed = 123,
                  ...)

`x`	Input matrix, of dimension nobs x nvars; each row is an observation vector. Can be in sparse matrix format (inherit from class `"sparseMatrix"` as in package `Matrix`). Must have unique colnames.
`y`	response variable. Quantitative for `family="gaussian"`. For `family="binomial"` should be either a factor with two levels, or a two-column matrix of counts or proportions.
`family`	Response type (see above).
`nlambda`	The number of `lambda` values - default is 100. Determines how fine the grid of lambda values should be.
`alpha`	Elastic net mixing parameter (see `glmnet`). For `cv.relaxnet`, this should be a single value. For `cv.alpha.relaxnet` it should be a vector of values.
`relax`	Should the model be relaxed. If FALSE, only the main glmnet model is run and no relaxed models are.
`relax.nlambda`	Like nlambda but for secondary (relaxed) models.
`relax.max.vars`	Maximum number of variables for relaxed models. No relaxation will be done for subsets along the regularization path with number of variables greater than relax.max.vars. If `ncol(x)` > `nrow(x)` and `alpha` < 1, it may make sense to use a value > `nrow(x)`, but this may lead to increased computation time.
`lambda`	See (see `glmnet`). Optional: default is to let `glmnet` choose its own sequence.
`relax.lambda.index`	Vector which indexes the lambda argument and specifyies the values at which a relaxed model should be fit. Optional: default is to let `relaxnet` determine these values based on the `beta` matrix from the main `glmnet` fit. Ignored if `lambda` argument is `NULL`.
`relax.lambda.list`	List of lambda values to use for the relaxed models. Optional: default is to let `relaxnet` determine these values. Ignored if lambda argument is `NULL`.
`nfolds`	Number of folds - default is 10. Although `nfolds` can be as large as the sample size (leave-one-out CV), it is not recommended for large datasets. Smallest value allowable is `nfolds=3`.
`foldid`	An optional vector of values between 1 and `nfold` identifying what fold each observation is in. If supplied, `nfolds` can be missing.
`multicore`	Should execution be parallelized over cv folds (for `cv.relaxnet`) or over alpha values (for `cv.alpha.relaxnet`) using multicore functionality from R's parallel package?
`mc.cores`	Number of cores/cpus to be used for multicore processing. Processing will be most efficient if nfolds (for `cv.relaxnet`) or the length of alpha (for `cv.alpha.relaxnet`) is a multiple of mc.cores. Ignored if `multicore` is `FALSE`
`mc.seed`	Integer value with which to seed the RNG when using parallel processing (internally, `RNGkind` will be called to set the RNG to `"L'Ecuyer-CMRG"`). Will be ignored if `multicore` is `FALSE`. If `mulicore` is `FALSE`, one should be able to get reprodicible results by setting the seed normally (with `set.seed`) prior to running.
`...`	Further aruments passed to glmnet. Use with caution as this has not yet been tested. For example, setting `standardize = FALSE` will probably work correctly, but setting an offset probably won't.

cv.glmnet's type.measure argument has not yet been implemented. For type = gaussian models, mean squared error is used, and for type = binomial, binomial deviance is used.

For cv.relaxnet – an object of class "cv.relaxnet" containing the following slots:

`call`	A copy of the call which produced this object
`relax`	The value of the relax argument. If this is `FALSE`, then several of the other elements of this result will be set to `NA`.
`lambda`	lambda sequence used for this fit
`cvm`	The mean cross-validated error - a vector of length `length(lambda)`. For main model.
`cvsd`	estimate of standard error of `cvm` for main model.
`cvup`	upper curve = `cvm+cvsd` for main model.
`cvlo`	lower curve = `cvm-cvsd` for main model.
`nzero`	number of non-zero coefficients at each `lambda` for main model.
`name`	a text string indicating type of measure.
`relaxnet.fit`	Fitted relaxnet object for the full data.
`relax.cvstuff.list`	List containing cvm and cvsd for each of the relaxed models.
`relax.lambda.list.trunc`	List containing the values of lambda used for cross-validation for each relaxed model
`which.model.min`	This will have value `"main"` if the main model "won" the cross-validation, and if not, it will be an integer specifying which relaxed model won (i.e. which element of relaxnet.fit$relax.glmnet.fits).
`overall.lambda.min`	The value of lambda with overall min cvm (i.e. from the submodel specified by `which.model.min`).
`min.cvm`	The overall minimum value of cvm
`main.lambda.min`	lambda.min, restricted to main model only.
`main.lambda.1se`	lambda.1se, restricted to main model only (see `cv.glmnet`).
`main.min.cvm`	Minimum of cvm, restricted to main model only.
`total.time`	Time in seconds to fit this object (full data fit plus all cross-validation)
`full.data.fit.time`	Time in seconds to fit the relaxnet.fit.
`cv.fit.times`	Time in seconds to fit the models for each set of cv folds.

For cv.alpha.relaxnet – an object of class "cv.alpha.relaxnet" containing the following slots:

`call`	A copy of the call which produced this object
`relax`	The value of the relax argument. If this is `FALSE`, then several of the other elements of this result will be set to `NA`.
`alpha`	The alpha values used.
`cv.relaxnet.results`	List of cv.relaxnet objects, one for each alpha value.
`which.alpha.min`	The alpha value which "won" the cross-validation.
`total.time`	Time in seconds to fit this object.

This is a preliminary release and several additional features are planned for later versions.

Stephan Ritter, with design contributions from Alan Hubbard.

Much of the code (and some help file content) is adapted from the glmnet package, whose authors are Jerome Friedman, Trevor Hastie and Rob Tibshirani.

Stephan Ritter and Alan Hubbard, Tech report (forthcoming).

Jerome Friedman, Trevor Hastie, Rob Tibshirani (2010) “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software 33(1)

Nicolai Meinshausen (2007) “Relaxed Lasso” Computational Statistics and Data Analysis 52(1), 374-393

relaxnet, predict.cv.relaxnet

## generate predictor matrix

nobs <- 100
nvars <- 200

set.seed(23)

x <- matrix(rnorm(nobs * nvars), nobs, nvars)

## make sure it has unique colnames

colnames(x) <- paste("x", 1:ncol(x), sep = "")

## let y depend on first 5 columns plus noise

y <- rowSums(x[, 1:5]) + rnorm(nrow(x))

## run cv.relaxnet

cv.result <- cv.relaxnet(x, y)

predict(cv.result, type = "nonzero")

## very few false positives compared to glmnet alone

## glmnet min rule

predict(cv.result$relaxnet.fit$main.glmnet.fit,
        type = "nonzero",
        s = cv.result$main.lambda.min)

## glmnet 1se rule

predict(cv.result$relaxnet.fit$main.glmnet.fit,
        type = "nonzero",
        s = cv.result$main.lambda.1se)

## get values of the coefs for cv.relaxnet's chosen fit

coefs <- drop(predict(cv.result, type = "coef"))

coefs[coefs != 0]