cv.hqreg: Cross-validation for hqreg
In hqreg: Regularization Paths for Lasso or Elastic-Net Penalized Huber Loss Regression and Quantile Regression



Description

Perform k-fold cross-validation for elastic-net penalized Huber loss regression and quantile regression over a sequence of lambda values, and find an optimal lambda.

Usage

cv.hqreg(X, y, ..., FUN = c("hqreg", "hqreg_raw"), ncores = 1, nfolds = 10, fold.id, 
         type.measure = c("deviance", "mse", "mae"), seed)

Arguments

`X`	The input matrix.
`y`	The response vector.
`...`	Additional arguments to `FUN`.
`FUN`	Model fitting function. The default is "hqreg" which preprocesses the data internally. The other option is "hqreg_raw" which uses the raw data as is.
`ncores`	Number of cores to use. `cv.hqreg` can be run in parallel across a cluster using the `parallel` package. If `ncores > 1`, a cluster is created to run `cv.hqreg` in parallel; if `ncores = 1` (the default), the code runs sequentially. If `ncores` exceeds the number of available cores, a message is printed and all available cores are used.
`nfolds`	The number of cross-validation folds. Default is 10.
`fold.id`	(Optional) A vector of values between 1 and `nfolds` indicating which fold each observation belongs to. If supplied, `nfolds` can be missing. By default, `cv.hqreg` assigns observations to folds at random.
`type.measure`	The default is "deviance", which uses the chosen loss function of the model. Other options include "mse" for mean squared error and "mae" for mean absolute error.
`seed`	(Optional) Seed for the random number generator in order to obtain reproducible results.
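
As an illustration of the `fold.id` argument, the following base R sketch builds a balanced fold assignment by hand. The sizes `n` and `nfolds` are illustrative assumptions, and the call to `cv.hqreg` is shown only as a comment because it requires the package to be installed.

```r
# Minimal sketch: build a fold assignment that could be passed as fold.id.
# n and nfolds are illustrative values, not package defaults for n.
set.seed(2024)
n <- 1000
nfolds <- 10

# Repeat the fold labels 1..nfolds to length n, then shuffle them so that
# each observation gets a random fold and fold sizes stay balanced.
fold.id <- sample(rep(seq_len(nfolds), length.out = n))

table(fold.id)  # number of observations per fold

# With hqreg installed, the vector could then be supplied as:
# cv <- cv.hqreg(X, y, fold.id = fold.id)
```

Reusing one such vector across several calls keeps the partitioning identical between them.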

Details

The function randomly partitions the data into nfolds folds. It calls hqreg nfolds+1 times: the first call obtains the lambda sequence, and the remaining calls fit the model with each fold left out once for validation. The cross-validation error is the average of the validation errors over the nfolds fits.
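
The procedure described above can be sketched in base R. This is not the package's implementation: a closed-form ridge fit (`ridge_coef`, a hypothetical helper) stands in for `hqreg`, the lambda sequence is fixed by hand rather than taken from a full-data fit, and mean squared error is used as the validation loss.

```r
# Sketch of k-fold cross-validation over a lambda sequence.
# Assumptions: ridge regression replaces hqreg, lambda is hand-picked.
set.seed(1)
n <- 60; p <- 5; nfolds <- 5
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% c(2, -1, 0, 0, 0.5) + rnorm(n))
lambda <- c(10, 1, 0.1, 0.01)

# Hypothetical stand-in fitter: ridge coefficients for one lambda, no intercept.
ridge_coef <- function(X, y, lam) {
  solve(crossprod(X) + lam * diag(ncol(X)), crossprod(X, y))
}

# Random, balanced assignment of observations to folds.
fold.id <- sample(rep(seq_len(nfolds), length.out = n))

# fold_err[k, j]: validation error of fold k at lambda[j].
fold_err <- matrix(NA_real_, nfolds, length(lambda))
for (k in seq_len(nfolds)) {
  held_out <- fold.id == k
  for (j in seq_along(lambda)) {
    b <- ridge_coef(X[!held_out, , drop = FALSE], y[!held_out], lambda[j])
    pred <- X[held_out, , drop = FALSE] %*% b
    fold_err[k, j] <- mean((y[held_out] - pred)^2)
  }
}

# Cross-validation error: average of the validation errors across folds.
cve <- colMeans(fold_err)
```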

Note that cv.hqreg does not search over values of alpha, gamma or tau. Specific values should be supplied; otherwise the hqreg defaults are used. To cross-validate alpha, gamma or tau as well, call cv.hqreg once for each combination of these parameters and pass the same seed to every call so that the partitioning remains the same.
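
One way to follow this advice for alpha is sketched below. The helper `tune_alpha` and its `cv_fun` argument are hypothetical names introduced for illustration and are not part of hqreg. By default the helper calls `hqreg::cv.hqreg`, so the package must be installed for real use. The sketch compares the smallest cross-validation error reached under each alpha, which is only meaningful when every call uses the same `type.measure` and the same seed.

```r
# Sketch: cross-validate alpha by calling the CV routine once per candidate
# value, always with the same seed so the fold partition is identical.
# tune_alpha and cv_fun are hypothetical names, not part of hqreg.
tune_alpha <- function(X, y, alphas, seed = 123,
                       cv_fun = hqreg::cv.hqreg, ...) {
  # One cross-validation run per alpha; extra arguments are forwarded.
  fits <- lapply(alphas, function(a) cv_fun(X, y, alpha = a, seed = seed, ...))

  # Smallest cross-validation error reached along each lambda path.
  best_cve <- vapply(fits, function(f) min(f$cve), numeric(1))

  i <- which.min(best_cve)
  list(alpha = alphas[i],
       lambda = fits[[i]]$lambda.min,
       cve = best_cve[i],
       fit = fits[[i]])
}

# Possible usage, assuming hqreg is installed:
# res <- tune_alpha(X, y, alphas = c(0.25, 0.5, 0.75, 1), method = "huber")
```

The same pattern extends to gamma or tau by looping over a grid of combinations instead of a single vector.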

Value

The function returns an object of S3 class "cv.hqreg", which is a list containing:

`cve`	The error for each value of `lambda`, averaged across the cross-validation folds.
`cvse`	The estimated standard error associated with each value of `cve`.
`type.measure`	The loss measure used for cross-validation, as given by the `type.measure` argument.
`lambda`	The values of `lambda` used in the cross-validation fits.
`fit`	The fitted `hqreg` object for the whole data.
`lambda.1se`	The largest `lambda` such that the error is within 1 standard error of the minimum.
`lambda.min`	The value of `lambda` with the minimum cross-validation error.
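
To make the relationship between `cve`, `cvse`, `lambda.min` and `lambda.1se` concrete, the sketch below applies the one-standard-error rule as described above to made-up vectors. The numbers are invented for illustration, and the package's internal code may differ in detail.

```r
# Sketch: derive lambda.min and lambda.1se from cve and cvse.
# All values below are invented for illustration.
lambda <- c(1, 0.5, 0.25, 0.125)
cve    <- c(3.0, 2.1, 2.0, 2.2)
cvse   <- c(0.3, 0.2, 0.15, 0.2)

# lambda.min: the lambda with the smallest cross-validation error.
i_min <- which.min(cve)
lambda.min <- lambda[i_min]

# lambda.1se: the largest lambda whose error is within one standard
# error of the minimum error.
threshold <- cve[i_min] + cvse[i_min]
lambda.1se <- max(lambda[cve <= threshold])

c(lambda.min = lambda.min, lambda.1se = lambda.1se)
```

Because a larger lambda means stronger penalization, `lambda.1se` selects a sparser model than `lambda.min` at a small cost in estimated error.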

Author(s)

Congrui Yi <eric.ycr@gmail.com>

References

Yi, C. and Huang, J. (2017) Semismooth Newton Coordinate Descent Algorithm for Elastic-Net Penalized Huber Loss Regression and Quantile Regression, Journal of Computational and Graphical Statistics, doi: 10.1080/10618600.2016.1256816

Examples

X = matrix(rnorm(1000*100), 1000, 100)
beta = rnorm(10)
eps = 4*rnorm(1000)
y = drop(X[,1:10] %*% beta + eps)
cv = cv.hqreg(X, y, seed = 123)
plot(cv)

cv_raw = cv.hqreg(X, y, FUN = "hqreg_raw", seed = 321)
predict(cv_raw, X[1:5,])

# parallel cross validation
## Not run: 
cv_parallel = cv.hqreg(X, y, ncores = 5)
plot(cv_parallel)

## End(Not run)

hqreg documentation built on Sept. 30, 2024, 9:40 a.m.