cv.ordinis: CV Fitting for A Lasso Model Using the Coordinate Descent...

Description

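Cross-validation for linear models with the lasso penalty. The lasso estimate is typically defined as the minimizer over β of

(1/(2n)) ||y − Xβ||₂² + λ ||β||₁

where n is the sample size and λ is a tuning parameter that controls the sparsity of β.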
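The title refers to coordinate descent. As a rough illustration of that technique, and not the package's own implementation, the sketch below minimizes the objective (1/(2n)) ||y − Xβ||₂² + λ ||β||₁ (an assumed form) in base R. The function names `soft_threshold` and `lasso_cd` are hypothetical, no intercept is fitted, and a fixed number of sweeps stands in for a proper convergence check.

```r
# Soft-thresholding operator: shrinks z toward zero by t
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Cyclic coordinate descent for the lasso (illustrative sketch only)
lasso_cd <- function(x, y, lambda, n_iter = 200) {
  n <- nrow(x)
  p <- ncol(x)
  beta <- numeric(p)
  r <- y                        # current residual y - x %*% beta
  col_ss <- colSums(x^2) / n    # per-column scaling x_j'x_j / n
  for (it in seq_len(n_iter)) {
    for (j in seq_len(p)) {
      # inner product of column j with the partial residual (beta_j removed)
      zj <- sum(x[, j] * r) / n + col_ss[j] * beta[j]
      bj_new <- soft_threshold(zj, lambda) / col_ss[j]
      # update the residual in place for the change in beta_j
      r <- r - x[, j] * (bj_new - beta[j])
      beta[j] <- bj_new
    }
  }
  beta
}

# Small usage example on simulated data
set.seed(1)
n <- 50
x <- matrix(rnorm(n * 3), n, 3)
y <- drop(x %*% c(2, 0, -1)) + rnorm(n, sd = 0.1)
beta_hat <- lasso_cd(x, y, lambda = 0.1)
```

With λ = 0 the update reduces to ordinary least squares by coordinate-wise minimization, and a sufficiently large λ sets every coefficient to zero.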

Usage

cv.ordinis(x, y, lambda = numeric(0), gamma = 3.7,
  type.measure = c("mse", "deviance", "class", "auc", "mae"),
  nfolds = 10, foldid = NULL, grouped = TRUE, keep = FALSE,
  parallel = FALSE, ...)

Arguments

x

The design matrix

y

The response vector

lambda

A user-provided sequence of λ values. If no sequence is supplied (the default shown in Usage is numeric(0)), the program calculates its own sequence according to nlambda and lambda_min_ratio. The sequence starts at λ_0 (the value at which all coefficients are zero) and ends at λ_0 * lambda_min_ratio, containing nlambda values equally spaced on the log scale. It is recommended to leave this parameter at its default.
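As a sketch of how such a log-spaced sequence can be built, the base R snippet below assumes that λ_0 is max|x'y| / n for centered, scaled data (the value at which the standard lasso solution is all zeros). The values chosen for nlambda and lambda_min_ratio are illustrative, and the scaling ordinis uses internally may differ.

```r
set.seed(1)
n <- 50
p <- 20
x <- scale(matrix(rnorm(n * p), n, p))  # centered and scaled design
y <- rnorm(n)
y <- y - mean(y)                        # centered response

nlambda <- 100            # illustrative value
lambda_min_ratio <- 0.01  # illustrative value

# Smallest lambda at which all lasso coefficients are zero (assumed scaling)
lambda0 <- max(abs(crossprod(x, y))) / n

# nlambda values from lambda0 down to lambda0 * lambda_min_ratio,
# equally spaced on the log scale
lambda_seq <- exp(seq(log(lambda0), log(lambda0 * lambda_min_ratio),
                      length.out = nlambda))
```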

gamma

Bandwidth parameter for the MCP/SCAD penalties. Default is 3.7.

type.measure

Measure to evaluate for cross-validation. The default is described as type.measure = "deviance", which uses squared error for Gaussian models (equivalent to type.measure = "mse", the first option listed in Usage) and deviance for logistic regression. type.measure = "class" applies to binomial models only, and type.measure = "auc" is for two-class logistic regression only. type.measure = "mse" and type.measure = "mae" (mean absolute error) can be used with all models; they measure the deviation of the fitted mean from the response.
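To make the difference between the "mse" and "mae" measures concrete, here is a minimal base R computation on made-up values; the vectors `y` and `fitted` are purely illustrative.

```r
y      <- c(1.0, 2.0, 3.0, 4.0)  # observed responses (made up)
fitted <- c(1.5, 1.5, 3.0, 5.0)  # fitted means (made up)

mse <- mean((y - fitted)^2)   # mean squared error
mae <- mean(abs(y - fitted))  # mean absolute error
```

Here the squared errors are 0.25, 0.25, 0 and 1, so mse is 0.375, while the absolute errors are 0.5, 0.5, 0 and 1, so mae is 0.5.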

nfolds

Number of folds for cross-validation. Default is 10; the smallest value allowed is 3.

foldid

An optional vector of values between 1 and nfolds specifying which fold each observation belongs to.
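One common way to build such a vector in base R is to repeat the fold labels and shuffle them, which gives balanced folds. The sizes below are illustrative.

```r
set.seed(123)
n <- 100      # number of observations (illustrative)
nfolds <- 5   # number of folds (illustrative)

# Balanced, randomly assigned fold labels in 1..nfolds
foldid <- sample(rep(seq_len(nfolds), length.out = n))
fold_sizes <- table(foldid)

# The vector would then be supplied as, e.g.:
# cv.ordinis(x, y, foldid = foldid)
```

Fixing foldid in this way makes cross-validation results comparable across repeated calls on the same data.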

grouped

As in glmnet, this is an experimental argument, with default TRUE, and can be ignored by most users. For all models, grouped = TRUE means computing nfolds separate statistics and then using their mean and estimated standard error to describe the CV curve. If grouped = FALSE, an error matrix is built up at the observation level from the predictions of the nfolds fits and then summarized (this does not apply to type.measure = "auc").
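The two summaries can be sketched in base R for a single value of lambda. The squared errors below are simulated stand-ins for out-of-fold prediction errors, and the variable names are hypothetical; this illustrates the idea rather than the package's internal code.

```r
set.seed(42)
n <- 60
nfolds <- 6
foldid <- rep(seq_len(nfolds), length.out = n)
sq_err <- rchisq(n, df = 1)  # stand-in for squared prediction errors

# Grouped: one statistic per fold, then mean and standard error across folds
fold_means   <- tapply(sq_err, foldid, mean)
cvm_grouped  <- mean(fold_means)
cvsd_grouped <- sd(fold_means) / sqrt(nfolds)

# Ungrouped: summarize the observation-level errors directly
cvm_obs  <- mean(sq_err)
cvsd_obs <- sd(sq_err) / sqrt(n)
```

With equally sized folds the two means coincide; the standard errors generally differ because they are estimated from nfolds values in one case and n values in the other.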

keep

If keep = TRUE, a prevalidated list of arrays is returned containing fitted values for each observation and each value of lambda for each model. Each of these fits is computed with the observation and the rest of its fold omitted. The foldid vector is also returned. Default is keep = FALSE.
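The idea of prevalidated fits can be illustrated with a small base R sketch. It uses ordinary least squares via lm.fit as a stand-in for the lasso fit, so it shows only how each observation's fitted value comes from a model trained without its fold; the names and sizes are illustrative.

```r
set.seed(7)
n <- 40
x <- cbind(1, rnorm(n))              # intercept column plus one predictor
y <- 2 + 3 * x[, 2] + rnorm(n)
nfolds <- 4
foldid <- sample(rep(seq_len(nfolds), length.out = n))

preval <- numeric(n)                 # prevalidated (out-of-fold) fitted values
for (k in seq_len(nfolds)) {
  held_out <- foldid == k
  # fit on all observations outside fold k
  fit <- lm.fit(x[!held_out, , drop = FALSE], y[!held_out])
  # predict the held-out observations with that fit
  preval[held_out] <- drop(x[held_out, , drop = FALSE] %*% fit$coefficients)
}
```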

parallel

If TRUE, use parallel foreach to fit each fold. A parallel backend, such as doMC, must be registered beforehand.

...

Other parameters to be passed to the "ordinis" function.

Examples

library(ordinis)

set.seed(123)
n = 100
p = 1000
b = c(runif(10, min = 0.2, max = 1), rep(0, p - 10))
x = matrix(rnorm(n * p, sd = 3), n, p)
y = drop(x %*% b) + rnorm(n)

## fit lasso model with 100 tuning parameter values
res <- cv.ordinis(x, y)

jaredhuling/ordinis documentation built on May 23, 2019, 4:03 a.m.