Cross-validation for hqreg

Description

Perform k-fold cross validation for elastic-net penalized Huber loss regression and quantile regression over a sequence of lambda values and find an optimal lambda.

Usage

cv.hqreg(X, y, ..., ncores = 1, nfolds = 10, fold.id, 
         type.measure = c("deviance", "mse", "mae"), seed)

Arguments

X

The input matrix, as in hqreg.

y

The response vector, as in hqreg.

...

Additional arguments to hqreg.

ncores

The number of cores to use. cv.hqreg can be run in parallel across a cluster using the parallel package. If ncores > 1, a cluster is created to run cv.hqreg in parallel. If ncores = 1 (the default), the code is run sequentially. If ncores is larger than the total number of available cores, a message is printed and all available cores are used.

nfolds

The number of cross-validation folds. Default is 10.

fold.id

(Optional) A vector of values between 1 and nfolds indicating which fold each observation belongs to. If supplied, nfolds can be omitted. By default, cv.hqreg assigns the observations to folds at random.

type.measure

The measure used to evaluate the cross-validation error. The default is "deviance", which uses the loss function chosen for the model. Other options are "mse" for mean squared error and "mae" for mean absolute error.

seed

(Optional) Seed for the random number generator in order to obtain reproducible results.
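
As an illustration of the fold.id argument, the following base R sketch builds a balanced random fold assignment by hand. The sample size n and the variable names are chosen for illustration only; the commented-out last line shows how such a vector could be passed to cv.hqreg, assuming the hqreg package is loaded and X and y have n rows.

```r
set.seed(42)   # makes the random assignment reproducible
n <- 103       # illustrative number of observations
nfolds <- 10

# Repeat the labels 1..nfolds until there are n of them, then shuffle,
# so that fold sizes differ by at most one observation.
fold.id <- sample(rep(seq_len(nfolds), length.out = n))

table(fold.id)  # number of observations in each fold

# With hqreg loaded and X, y holding n observations:
# cv <- cv.hqreg(X, y, fold.id = fold.id)
```

Supplying fold.id this way keeps the partition fixed across several calls, which is an alternative to reusing the same seed.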

Details

The function randomly partitions the data into nfolds folds. It calls hqreg nfolds + 1 times: once to obtain the lambda sequence, and then once per fold, with that fold left out for validation. The cross-validation error is the average of the validation errors over the nfolds fits.
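
The procedure described above can be sketched by hand as follows. This is a simplified illustration, not the internal code of cv.hqreg: it assumes the hqreg package is installed, uses simulated data, scores each fold by mean squared error, and averages the per-fold errors. The names fit.k, held.out and err are made up for the example.

```r
library(hqreg)  # assumed to be installed

set.seed(1)
n <- 200; p <- 20
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:3] %*% c(2, -1, 1) + rnorm(n))

nfolds <- 5
fold.id <- sample(rep(seq_len(nfolds), length.out = n))

# Call 1: fit on the whole data to obtain the lambda sequence.
fit <- hqreg(X, y)
lambda <- fit$lambda

# Calls 2 to nfolds + 1: refit with each fold left out, then score that fold.
err <- matrix(NA_real_, nfolds, length(lambda))
for (k in seq_len(nfolds)) {
  held.out <- fold.id == k
  fit.k <- hqreg(X[!held.out, ], y[!held.out], lambda = lambda)
  pred <- predict(fit.k, X[held.out, , drop = FALSE])  # one column per lambda
  mse.k <- colMeans((y[held.out] - pred)^2)
  err[k, seq_along(mse.k)] <- mse.k  # guard in case a fit returns a shorter path
}

# Cross-validation error for each lambda: the average over the folds.
cve <- colMeans(err)
```

In practice cv.hqreg does all of this in one call and also returns the standard errors and the selected lambda values.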

Note that cv.hqreg does not search over values of alpha, gamma or tau. Specific values should be supplied; otherwise the defaults for hqreg are used. To cross-validate alpha, gamma or tau as well, call cv.hqreg once for each combination of these parameters and use the same seed in every call so that the partitioning remains the same.
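
A minimal sketch of such a search over alpha is shown below. It assumes the hqreg package is installed; the grid of alpha values, the simulated data and the names cv.list and min.cve are illustrative choices, not recommendations.

```r
library(hqreg)  # assumed to be installed

set.seed(1)
X <- matrix(rnorm(200 * 20), 200, 20)
y <- drop(X[, 1:3] %*% c(2, -1, 1) + rnorm(200))

alphas <- c(0.25, 0.5, 1)  # illustrative grid

# One cv.hqreg call per alpha, always with the same seed so that
# every call uses the same partition of the data.
cv.list <- lapply(alphas, function(a) cv.hqreg(X, y, alpha = a, seed = 123))

# Smallest cross-validation error reached for each alpha.
min.cve <- sapply(cv.list, function(cv) min(cv$cve))

best <- which.min(min.cve)
alphas[best]                 # selected alpha
cv.list[[best]]$lambda.min   # lambda with minimum error for that alpha
```

The same pattern extends to gamma or tau by looping over a grid of combinations, for example one built with expand.grid.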

Value

The function returns an object of S3 class "cv.hqreg", which is a list containing:

cve

The error for each value of lambda, averaged across the cross-validation folds.

cvse

The estimated standard error associated with each value of cve.

type.measure

The measure used for the cross-validation error, as described under Arguments.

lambda

The values of lambda used in the cross-validation fits.

fit

The fitted hqreg object for the whole data.

lambda.1se

The largest lambda such that the error is within 1 standard error of the minimum.

lambda.min

The value of lambda with the minimum cross-validation error.
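
To make the definitions of lambda.min and lambda.1se concrete, here is a small base R sketch that applies them to made-up cve and cvse vectors. It follows the descriptions above and is not taken from the package source; all numbers are invented for illustration.

```r
# Hypothetical cross-validation results for five lambda values.
lambda <- c(1.0, 0.5, 0.25, 0.125, 0.0625)
cve    <- c(5.0, 3.2, 3.0, 3.1, 3.4)
cvse   <- c(0.4, 0.3, 0.3, 0.3, 0.3)

# lambda.min: the lambda with the minimum cross-validation error.
i.min <- which.min(cve)
lambda.min <- lambda[i.min]

# lambda.1se: the largest lambda whose error is within one standard
# error of the minimum.
threshold <- cve[i.min] + cvse[i.min]
lambda.1se <- max(lambda[cve <= threshold])

lambda.min  # 0.25
lambda.1se  # 0.5
```

Here lambda.1se is larger than lambda.min, so it corresponds to a more heavily penalized model whose error is still within one standard error of the best.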

Author(s)

Congrui Yi <congrui-yi@uiowa.edu>

See Also

hqreg, plot.cv.hqreg

Examples

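# simulate data: 1000 observations, 100 predictors, only the first 10 active
X = matrix(rnorm(1000*100), 1000, 100)
beta = rnorm(10)
eps = 4*rnorm(1000)
y = drop(X[,1:10] %*% beta + eps)
# 10-fold cross-validation with a fixed seed for a reproducible partition
cv = cv.hqreg(X, y, seed = 123)
plot(cv)
# predictions for the first five observations
predict(cv, X[1:5,])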

# parallel cross validation
## Not run: 
cv = cv.hqreg(X, y, ncores = 5)
plot(cv)

## End(Not run)