fqr: Fast Quantile Regression


View source: R/fqr.R

Description

Fast Quantile Regression

Usage

fit_fqr(
  X,
  y,
  tau,
  se = TRUE,
  init_beta = rep(0, ncol(X)),
  smoothing_window = .Machine$double.eps,
  maxiter = 100,
  beta_tol = 1e-05,
  check_tol = 1e-05,
  intercept = 1,
  nsubsamples = 100,
  nwarmup_samples = 1000,
  warm_start = 1
)

fqr(
  formula,
  data,
  tau = 0.5,
  se = TRUE,
  smoothing_window = .Machine$double.eps,
  maxiter = 1000,
  beta_tol = 1e-05,
  check_tol = 1e-05,
  nwarmup_samples = pmin(pmax(100, 0.1 * nrow(data)), nrow(data)),
  warm_start = 1,
  nsubsamples = 100
)

Arguments

X

design matrix

y

outcome variable

tau

vector of target quantile(s)

se

whether to calculate standard errors

init_beta

initial coefficients for gradient descent (optional; per the usage above, the default is a vector of zeros, one per column of X)

smoothing_window

half-width of the neighborhood around 0 in which the check loss is smoothed with a tilted least-squares loss function

maxiter

maximum number of allowed iterations for gradient descent

beta_tol

stopping criterion based on largest value of the gradient

check_tol

stopping criterion based on the change in the value of the check function between iterations

intercept

column index of the intercept in X (defaults to 1); use 0 to indicate no intercept

nsubsamples

number of subsamples to use when calculating standard errors

nwarmup_samples

number of samples to use for warmup regression

warm_start

whether to run an initial warmup regression on a subsample (1) or start full-data gradient descent from the default initial values (0)

formula

regression formula

data

data to use when fitting regression

Details

This package performs quantile regression by approximating the check loss function with a least-squares loss function in a small neighborhood around 0. Since the only point where the check function is not differentiable is 0, this smoothing lets first-order gradient descent methods work.
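
To make the smoothing concrete, here is a minimal sketch of such a loss (an illustration of the technique only, not the package's internals; h plays the role of the smoothing_window argument):

# Sketch of a smoothed check loss: outside [-h, h] it is the usual
# tilted absolute loss; inside, a tilted quadratic whose value and
# slope match at the boundary, so the loss is differentiable everywhere.
smoothed_check <- function(u, tau, h = 1e-4) {
  tilt <- ifelse(u >= 0, tau, 1 - tau)
  ifelse(abs(u) > h,
         tilt * abs(u),                   # usual check loss
         tilt * (u^2 / (2 * h) + h / 2))  # quadratic near zero
}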

This package uses "accelerated" gradient descent, which updates the coefficient guess not only by a step size times the gradient, but also by a momentum term built from the prior changes in the coefficients, which leads to faster convergence. Gradient-based methods work at scale (both in observations and in dimension) and are much faster than the interior point algorithms in the quantreg package for large problems, though they are sometimes less exact for small ones.
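
As an illustration, a single update under a Nesterov-style momentum scheme might look like the following (a generic sketch of this class of methods; accelerated_step is not a function from the package):

# One accelerated gradient step: the new guess combines a plain
# gradient step with the momentum carried over from the previous move.
accelerated_step <- function(beta, beta_prev, grad_fn,
                             step = 1e-3, momentum = 0.9) {
  velocity <- beta - beta_prev             # prior change in coefficients
  lookahead <- beta + momentum * velocity  # momentum "lookahead" point
  list(beta = lookahead - step * grad_fn(lookahead),
       beta_prev = beta)
}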

The algorithm employs two early stopping rules: the 'check_tol' argument stops the descent based on the scaled change in the check-function loss between iterations, and the 'beta_tol' argument stops it based on the largest value of the gradient vector.
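
A minimal sketch of the two tests (converged is a hypothetical helper for illustration, not exported by fqr):

# Stop when either rule fires: the largest gradient entry falls below
# beta_tol, or the relative change in the check loss falls below check_tol.
converged <- function(grad, loss, loss_prev,
                      beta_tol = 1e-5, check_tol = 1e-5) {
  max(abs(grad)) < beta_tol ||
    abs(loss_prev - loss) / (abs(loss_prev) + 1e-12) < check_tol
}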

Before using the full dataset, the optimizer "warms up" on a random subsample of the data; 'nwarmup_samples' controls the size of that subsample. 'warm_start' is an integer that controls whether the warmup happens at all (leaving it on is _strongly_ recommended).
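
For example, the warmup stage can be tuned or disabled through the documented arguments (values chosen only for illustration):

# warm up on a 25-row subsample before the full-data fit
fit_warm <- fqr(area ~ peri, data = rock, tau = 0.5, nwarmup_samples = 25)

# skip the warmup and start full-data gradient descent from defaults
fit_cold <- fqr(area ~ peri, data = rock, tau = 0.5, warm_start = 0)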

Examples

fit <- fqr(area ~ peri, data = rock, tau = c(0.25, 0.5, 0.75))

# print coefficients & SEs
print(fit)

# grab coefficient vector
coef(fit)

# predict values
predict(fit)

# predict values with new data
predict(fit, newdata = head(rock))
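
# The same model can also be fit through the lower-level matrix
# interface -- a brief sketch, assuming an explicit intercept column
# in position 1 (the default for the 'intercept' argument):
X <- cbind(1, rock$peri)
fit_mat <- fit_fqr(X, rock$area, tau = 0.5)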
