rq.pen: Fit a quantile regression model using a penalized quantile loss function

View source: R/mainFunctions.R

rq.pen    R Documentation

Fit a quantile regression model using a penalized quantile loss function.

Description

Let q index the Q quantiles of interest and let \rho_\tau(a) = a[\tau-I(a<0)] denote the quantile (check) loss. Fits quantile regression models by minimizing the penalized objective function

\frac{1}{n} \sum_{q=1}^Q \sum_{i=1}^n \rho_{\tau_q}(y_i-x_i^\top\beta^q) + \sum_{q=1}^Q \sum_{j=1}^p P(\beta^q_j, v_j w_q \lambda, a),

where v_j and w_q are set by penalty.factor and tau.penalty.factor, respectively. The form of P() depends on the penalty; see the references or the vignette for more details:

  • LASSO: P(\beta,\lambda,a)=\lambda|\beta|

  • SCAD: P(\beta,\lambda,a)=SCAD(\beta,\lambda,a)

  • MCP: P(\beta,\lambda,a)=MCP(\beta,\lambda,a)

  • Ridge: P(\beta,\lambda,a)=\lambda\beta^2

  • Elastic Net: P(\beta,\lambda,a)=a\lambda|\beta|+(1-a)\lambda\beta^2

  • Adaptive LASSO: P(\beta,\lambda,a)=\frac{\lambda |\beta|}{|\beta_0|^a}

For the Adaptive LASSO, the values of \beta_0 come from a ridge solution with the same value of \lambda. Three different algorithms are implemented:

  • huber: Uses a Huber approximation of the quantile loss function. See Yi and Huang 2017 for more details.

  • br: Solution is found by reformulating the problem so it can be solved with the rq() function from quantreg using the br algorithm.

  • QICD: A coordinate descent algorithm for SCAD and MCP penalties, see Peng and Wang (2015) for details.

The huber algorithm offers substantial speed advantages without much, if any, loss in performance. Note, however, that it solves an approximation of the quantile loss function.
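For illustration, the check loss and the closed forms of the SCAD and MCP penalties can be sketched in base R. This is a minimal sketch, assuming the standard definitions of SCAD (Fan and Li, 2001) and MCP (Zhang, 2010); rq.pen evaluates these internally, so none of this is needed to use the package.

```r
# Quantile (check) loss: rho_tau(a) = a * (tau - I(a < 0))
rho <- function(a, tau) a * (tau - (a < 0))

# SCAD penalty, standard closed form, with the package default a = 3.7
scad <- function(beta, lambda, a = 3.7) {
  b <- abs(beta)
  ifelse(b <= lambda,
         lambda * b,
         ifelse(b <= a * lambda,
                (2 * a * lambda * b - b^2 - lambda^2) / (2 * (a - 1)),
                lambda^2 * (a + 1) / 2))
}

# MCP penalty, standard closed form, with the package default a = 3
mcp <- function(beta, lambda, a = 3) {
  b <- abs(beta)
  ifelse(b <= a * lambda, lambda * b - b^2 / (2 * a), a * lambda^2 / 2)
}
```

Both non-convex penalties agree with the lasso penalty \lambda|\beta| near zero and flatten to a constant for large |\beta|, which is what reduces the bias of large coefficient estimates.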

Usage

rq.pen(
  x,
  y,
  tau = 0.5,
  lambda = NULL,
  penalty = c("LASSO", "Ridge", "ENet", "aLASSO", "SCAD", "MCP"),
  a = NULL,
  nlambda = 100,
  eps = ifelse(nrow(x) < ncol(x), 0.05, 0.01),
  penalty.factor = rep(1, ncol(x)),
  alg = c("huber", "br", "QICD", "fn"),
  scalex = TRUE,
  tau.penalty.factor = rep(1, length(tau)),
  coef.cutoff = 1e-08,
  max.iter = 10000,
  converge.eps = 1e-07,
  lambda.discard = TRUE,
  weights = NULL,
  ...
)

Arguments

x

matrix of predictors

y

vector of responses

tau

vector of quantiles

lambda

vector of lambda, if not set will be generated automatically

penalty

choice of penalty

a

Additional tuning parameter, not used for the lasso or ridge penalties, though these are treated as elastic net fits with a set to 1 and 0, respectively. Defaults are ENet (0), aLASSO (1), SCAD (3.7) and MCP (3).

nlambda

number of lambda, ignored if lambda is set

eps

If lambda is not pre-specified, the lambda vector will range from lambda_max down to eps times lambda_max.

penalty.factor

penalty factor for the predictors

alg

Algorithm used.

scalex

Whether x should be scaled before fitting the model. Coefficients are returned on the original scale.

tau.penalty.factor

A penalty factor for each quantile.

coef.cutoff

Some of the linear programming algorithms return very small, but nonzero, coefficient estimates rather than exactly sparse solutions. Estimates below this threshold are set to zero. Ignored if a non-linear programming algorithm is used.

max.iter

Maximum number of iterations of non-linear programming algorithms.

converge.eps

Convergence threshold for non-linear programming algorithms.

lambda.discard

If TRUE, the algorithm may stop for small values of lambda when the coefficient estimates stop changing drastically. For example, it is possible for the LLA weights of the non-convex penalties to all become zero, in which case smaller values of lambda are extremely likely to produce the same zero weights.

weights

Weights for the quantile objective function.

...

Extra parameters.

Value

An rq.pen.seq object.

  • models: A list with one model fit for each combination of tau and a.

  • n: Sample size.

  • p: Number of predictors.

  • alg: Algorithm used. Options are "huber", "qicd" or any method implemented in rq(), such as "br".

  • tau: Quantiles modeled.

  • a: Tuning parameters a used.

  • modelsInfo: Information about the quantile and a value for each model.

  • lambda: Lambda values used for all models. If a model has coefficient estimates for fewer than the full set of lambda values, say k of them, then it used the first k values of lambda. Setting lambda.discard to FALSE guarantees all models use the same lambda sequence, but may noticeably increase computational time for little gain.

  • penalty: Penalty used.

  • call: Original call.

Each model in the models list has the following values.

  • coefficients: Coefficients for each value of lambda.

  • rho: The unpenalized objective function for each value of lambda.

  • PenRho: The penalized objective function for each value of lambda.

  • nzero: The number of nonzero coefficients for each value of lambda.

  • tau: Quantile of the model.

  • a: Value of a for the penalized loss function.

If the Huber algorithm is used, then \rho_\tau(y_i-x_i^\top\beta) is replaced by a Huber-type approximation. Specifically, it is replaced by h^\tau_\gamma(y_i-x_i^\top\beta)/2, where

h^\tau_\gamma(a) = a^2/(2\gamma)I(|a| \leq \gamma) + (|a|-\gamma/2)I(|a|>\gamma)+(2\tau-1)a.

When \tau=.5, this reduces to the usual Huber loss function. The Huber implementation calls the package hqreg, which implements the methods of Yi and Huang (2017) for the Huber loss with elastic net penalties. For non-elastic-net penalties, the LLA algorithm of Zou and Li (2008) is used to approximate those penalty functions with a lasso penalty that has a different weight for each predictor.
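As a base-R sketch of the approximation above (purely illustrative; rq.pen and hqreg handle this internally), the scaled Huber-type loss h^\tau_\gamma(a)/2 approaches the check loss as \gamma shrinks:

```r
# Check loss for comparison: rho_tau(a) = a * (tau - I(a < 0))
rho <- function(a, tau) a * (tau - (a < 0))

# Huber-type approximation h^tau_gamma(a) / 2, which replaces rho_tau in the objective
huber_rho <- function(a, tau, gamma) {
  h <- ifelse(abs(a) <= gamma, a^2 / (2 * gamma), abs(a) - gamma / 2) +
    (2 * tau - 1) * a
  h / 2
}

# With a small gamma the approximation is close to the check loss
huber_rho(2, 0.25, 1e-8)   # approximately rho(2, 0.25) = 0.5
huber_rho(-3, 0.75, 1e-8)  # approximately rho(-3, 0.75) = 0.75
```

Near zero the loss is quadratic rather than piecewise linear, which is what makes the smooth optimization methods applicable.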

Author(s)

Ben Sherwood, ben.sherwood@ku.edu and Adam Maidman

References

Zou, H. and Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models. The Annals of Statistics, 36(4), 1509-1533.

Yi, C. and Huang, J. (2017). Semismooth Newton coordinate descent algorithm for elastic-net penalized Huber loss regression and quantile regression. Journal of Computational and Graphical Statistics, 26(3), 547-557.

Belloni, A. and Chernozhukov, V. (2011). L1-penalized quantile regression in high-dimensional sparse models. The Annals of Statistics, 39(1), 82-130.

Peng, B. and Wang, L. (2015). An iterative coordinate descent algorithm for high-dimensional nonconvex penalized quantile regression. Journal of Computational and Graphical Statistics, 24(3), 676-694.

Examples

n <- 200
p <- 8
x <- matrix(runif(n * p), ncol = p)
y <- 1 + x[,1] + x[,8] + (1 + .5 * x[,3]) * rnorm(n)
r1 <- rq.pen(x, y) # Lasso fit for the median
# Lasso for multiple quantiles
r2 <- rq.pen(x, y, tau = c(.25, .5, .75))
# Elastic net fit for multiple quantiles, which must use the Huber algorithm
r3 <- rq.pen(x, y, penalty = "ENet", a = c(0, .5, 1), alg = "huber")
# First variable is not penalized
r4 <- rq.pen(x, y, penalty.factor = c(0, rep(1, 7)))
tvals <- c(.1, .2, .3, .4, .5)
# Similar to the penalty proposed by Belloni and Chernozhukov.
# To be exact you would divide the tau.penalty.factor by n.
r5 <- rq.pen(x, y, tau = tvals, tau.penalty.factor = sqrt(tvals * (1 - tvals)))

rqPen documentation built on Aug. 25, 2023, 1:07 a.m.