rq.pen.cv: Does k-folds cross validation for rq.pen. If multiple values...

View source: R/mainFunctions.R

rq.pen.cv R Documentation

Description

Performs k-fold cross-validation for rq.pen. If multiple values of a are specified, a grid search is performed for the best combination of \lambda and a.

Usage

rq.pen.cv(
  x,
  y,
  tau = 0.5,
  lambda = NULL,
  penalty = c("LASSO", "Ridge", "ENet", "aLASSO", "SCAD", "MCP"),
  a = NULL,
  cvFunc = NULL,
  nfolds = 10,
  foldid = NULL,
  nlambda = 100,
  groupError = TRUE,
  cvSummary = mean,
  tauWeights = rep(1, length(tau)),
  printProgress = FALSE,
  weights = NULL,
  ...
)

Arguments

x

Matrix of predictors.

y

Vector of responses.

tau

Quantiles to be modeled.

lambda

Values of \lambda. Default will automatically select the \lambda values.

penalty

Choice of penalty between LASSO, Ridge, Elastic Net (ENet), Adaptive Lasso (aLASSO), SCAD and MCP.

a

Second tuning parameter, a. LASSO and Ridge have no second tuning parameter, but for notational consistency a is set to 1 and 0 respectively, the corresponding elastic net values. Defaults are Ridge ()

cvFunc

Loss function for cross-validation. Defaults to quantile loss, but user can specify their own function.

nfolds

Number of folds.

foldid

Ids for folds. If set will override nfolds.

nlambda

Number of \lambda values. Ignored if lambda is set.

groupError

If set to FALSE, the reported error is the sum of all errors, not the sum of the errors for each fold.

cvSummary

Function to summarize the errors across the folds, default is mean. User can specify another function, such as median.

tauWeights

Weights for the different tau models.

printProgress

If set to TRUE prints which partition is being worked on.

weights

Weights for the quantile loss objective function.

...

Additional arguments passed to rq.pen()
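
As an illustration of the foldid argument described above, the following base R sketch builds a balanced random fold assignment. The helper make_foldid is hypothetical and not part of rqpen. The sketch assumes foldid is a vector with one integer fold label per observation, which is the common convention but is not spelled out on this page.

```r
# Hypothetical helper (not part of rqpen): assign n observations to k folds.
make_foldid <- function(n, k) {
  # rep() cycles through 1..k so fold sizes differ by at most one;
  # sample() then shuffles the labels across observations.
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)
foldid <- make_foldid(100, 5)
table(foldid)  # 20 observations in each of the 5 folds
```

Under that assumption, a vector like this would be supplied as foldid = foldid, in which case nfolds is overridden.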

Details

Two cross-validation results are returned. The first considers the best combination of a and lambda for each quantile separately. The second considers the best combination of the tuning parameters across all quantiles. Let y_{b,i} and x_{b,i} index the observations in fold b. Let \hat{\beta}_{\tau,a,\lambda}^{-b} be the estimator for a given quantile and tuning parameters, fit without the bth fold. Let n_b be the number of observations in fold b. Then the cross-validation error for fold b is

\mbox{CV}(b,\tau) = \frac{1}{n_b} \sum_{i=1}^{n_b} \rho_\tau(y_{b,i}-x_{b,i}^\top\hat{\beta}_{\tau,a,\lambda}^{-b}).

Note that \rho_\tau() can be replaced by a different function by setting the cvFunc parameter. The function returns two different cross-validation summaries. The first is btr, the by-tau results. It provides the values of lambda and a that minimize the average of \mbox{CV}(b,\tau) across folds, or whatever function is used for cvSummary in place of the average. In addition, it provides the sparsest solution that is within one standard error of the minimum.
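
The fold error defined above can be sketched in base R as follows. This is an illustration of the formula, not the package's internal code. The names check_loss and fold_cv_error are hypothetical, x_b is assumed to already contain an intercept column if one is needed, and the two-argument loss signature is an assumption of this sketch rather than the signature cvFunc expects.

```r
# Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})
check_loss <- function(u, tau) u * (tau - (u < 0))

# CV(b, tau): mean loss of the held-out residuals in fold b.
# beta is the coefficient vector estimated without fold b.
fold_cv_error <- function(y_b, x_b, beta, tau, loss = check_loss) {
  resid <- as.vector(y_b - x_b %*% beta)
  mean(loss(resid, tau))
}

# Toy held-out fold: fitted values are 1, 2, 3, so residuals are 0, 0, 1
y_b  <- c(1, 2, 4)
x_b  <- cbind(1, c(0, 1, 2))
beta <- c(1, 1)
fold_cv_error(y_b, x_b, beta, tau = 0.5)  # (0 + 0 + 0.5) / 3

# Summarizing made-up fold errors, as cvSummary does across folds
cv_by_fold <- c(0.20, 0.22, 0.30)
mean(cv_by_fold)    # default summary
median(cv_by_fold)  # alternative summary
```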

The other approach is the group tau results, gtr. Consider the case of estimating Q quantiles, \tau_1,\ldots,\tau_Q. It returns the values of lambda and a that minimize the average, or again whatever function is used for cvSummary, of

\sum_{q=1}^Q\mbox{CV}(b,\tau_q).

If only one quantile is modeled then the gtr results can be ignored as they provide the same minimum solution as btr.
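
The group error above can be sketched in base R for one fixed combination of a and lambda. The numbers are made up for illustration and group_cv_error is a hypothetical name. The tau_weights argument reflects an assumption about how tauWeights enters the sum; with the default weights of 1 it reduces to the unweighted sum shown above.

```r
# Made-up fold errors CV(b, tau_q): rows are folds, columns are quantiles
cv_mat <- matrix(c(0.20, 0.30, 0.25,
                   0.22, 0.28, 0.24),
                 nrow = 2, byrow = TRUE)
colnames(cv_mat) <- c("tau=0.25", "tau=0.5", "tau=0.75")

# Sum (optionally weighted) over quantiles within each fold,
# then summarize across folds.
group_cv_error <- function(cv_mat,
                           tau_weights = rep(1, ncol(cv_mat)),
                           summary_fn = mean) {
  per_fold <- as.vector(cv_mat %*% tau_weights)
  summary_fn(per_fold)
}

group_cv_error(cv_mat)                                    # mean of 0.75 and 0.74
group_cv_error(cv_mat, tau_weights = c(0.25, 0.5, 0.25))  # more weight on the median
```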

Value

An rq.pen.seq.cv object.

cverr:

Matrix of the cross-validation error, summarized by the cvSummary function (default is the average), for each model (a combination of tau and a) and each value of lambda.

cvse:

Matrix of the standard error of cverr for each model (a combination of tau and a) and each value of lambda.

fit:

The rq.pen.seq object fit to the full data.

btr:

A data.table of the values of a and lambda that are best as determined by the minimum cross-validation error and by the one standard error rule, which fixes a. In btr the values of lambda and a are selected separately for each quantile.

gtr:

A data.table for the combination of a and lambda that minimizes the cross-validation error across all tau.

gcve:

Group, across all quantiles, cross-validation error results for each value of a and lambda.

call:

Original call to the function.
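
The one standard error rule mentioned for btr can be sketched in base R as follows. The values are made up, one_se_index is a hypothetical name, and "sparsest" is interpreted here as the fewest nonzero coefficients among candidates for a single tau and a fixed a. The package's own selection logic may differ in detail.

```r
# Made-up values along a lambda path for one tau and one fixed a
lambda  <- c(0.50, 0.20, 0.10, 0.05, 0.01)
cverr   <- c(0.40, 0.305, 0.30, 0.29, 0.30)
cvse    <- c(0.03, 0.02, 0.02, 0.02, 0.02)
nonzero <- c(1, 2, 4, 6, 8)  # number of nonzero coefficients per fit

one_se_index <- function(cverr, cvse, nonzero) {
  i_min <- which.min(cverr)
  # Any fit whose error is within one standard error of the minimum qualifies
  threshold  <- cverr[i_min] + cvse[i_min]
  candidates <- which(cverr <= threshold)
  # Among qualifying fits, pick the one with the fewest nonzero coefficients
  candidates[which.min(nonzero[candidates])]
}

lambda[which.min(cverr)]                      # 0.05, minimum-error choice
lambda[one_se_index(cverr, cvse, nonzero)]    # 0.2, sparser one-SE choice
```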

Author(s)

Ben Sherwood, ben.sherwood@ku.edu

Examples

## Not run: 
library(rqpen)
x <- matrix(runif(800), ncol = 8)
y <- 1 + x[, 1] + x[, 8] + (1 + 0.5 * x[, 3]) * rnorm(100)
# Lasso fit for the median
r1 <- rq.pen.cv(x, y)
# Elastic net fit for multiple values of a and tau
r2 <- rq.pen.cv(x, y, penalty = "ENet", a = c(0, 0.5, 1), tau = c(0.25, 0.5, 0.75))
# Same as above, but with more weight given to the median when
# calculating the group cross-validation error
r3 <- rq.pen.cv(x, y, penalty = "ENet", a = c(0, 0.5, 1), tau = c(0.25, 0.5, 0.75),
                tauWeights = c(0.25, 0.5, 0.25))
# Uses the median cross-validation error instead of the mean
r4 <- rq.pen.cv(x, y, cvSummary = median)
# Cross-validation with no penalty on the first variable
r5 <- rq.pen.cv(x, y, penalty.factor = c(0, rep(1, 7)))

## End(Not run)

bssherwood/rqpen documentation built on April 23, 2024, 9:50 a.m.