Fast Quantile Regression
Usage

fit_fqr(
  X,
  y,
  tau,
  se = T,
  init_beta = rep(0, ncol(X)),
  smoothing_window = .Machine$double.eps,
  maxiter = 100,
  beta_tol = 1e-05,
  check_tol = 1e-05,
  intercept = 1,
  nsubsamples = 100,
  nwarmup_samples = 1000,
  warm_start = 1
)

fqr(
  formula,
  data,
  tau = 0.5,
  se = T,
  smoothing_window = .Machine$double.eps,
  maxiter = 1000,
  beta_tol = 1e-05,
  check_tol = 1e-05,
  nwarmup_samples = pmin(pmax(100, 0.1 * nrow(data)), nrow(data)),
  warm_start = 1,
  nsubsamples = 100
)
Arguments

X: design matrix

y: outcome variable

tau: vector of target quantile(s)

se: whether to calculate standard errors

init_beta: initial coefficients for gradient descent (optional; the default shown in Usage is a vector of zeros)

smoothing_window: neighborhood around 0 in which the check loss is smoothed with a tilted least-squares loss

maxiter: maximum number of gradient descent iterations

beta_tol: stopping criterion based on the largest value of the gradient

check_tol: stopping criterion based on the change in the value of the check function between iterations

intercept: column index of the intercept; defaults to 1, use 0 to indicate no intercept

nsubsamples: number of subsamples to use when calculating standard errors

nwarmup_samples: number of observations to use for the warmup regression

warm_start: whether to run an initial warmup regression on a subsample or to start the full-data gradient descent from the default initial values

formula: regression formula

data: data to use when fitting the regression
Details

This package performs quantile regression by approximating the check loss function with a least-squares loss in a small neighborhood around 0. Since 0 is the only point where the check function is not differentiable, this makes first-order gradient descent methods applicable.
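As a rough illustration of the idea (not the package's internal implementation), the check loss can be left unchanged away from zero and replaced by a tilted quadratic patch inside a small window, here playing the role of smoothing_window; smoothed_check_loss is a hypothetical helper:

# Sketch of a smoothed (tilted least-squares) check loss.
# `tau` is the target quantile; `h` plays the role of `smoothing_window`.
# The quadratic patch matches the check loss in value and slope at +/- h,
# so the result is differentiable everywhere, including at 0.
smoothed_check_loss <- function(resid, tau, h = .Machine$double.eps) {
  ifelse(
    abs(resid) > h,
    resid * (tau - (resid < 0)),                      # ordinary check loss away from 0
    resid^2 / (4 * h) + (tau - 0.5) * resid + h / 4   # quadratic patch inside [-h, h]
  )
}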
This package uses "accelerated" gradient descent, which updates the coefficient estimates not only by a step size times the gradient but also by the momentum accumulated from previous coefficient changes, leading to faster convergence. Gradient-based methods work at scale (both in the number of observations and in dimension) and are much faster than the interior point algorithms in the quantreg package for large problems, though they are sometimes less exact for small problems.
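A minimal sketch of a momentum-based update in the spirit of accelerated gradient descent is shown below; the function name, step size, momentum coefficient, and use of the smoothed gradient are illustrative assumptions and do not reproduce fit_fqr()'s actual update rule:

# Hypothetical momentum-based gradient descent on the smoothed check loss.
accelerated_qr <- function(X, y, tau, h = 1e-6, step = 1e-3,
                           maxiter = 1000, beta_tol = 1e-5) {
  beta <- rep(0, ncol(X))        # start from zeros, as in fit_fqr()'s default
  momentum <- rep(0, ncol(X))
  for (it in seq_len(maxiter)) {
    resid <- y - X %*% beta
    # derivative of the smoothed loss w.r.t. beta: tilted slope outside the
    # window, linear-in-residual slope inside it
    dloss <- ifelse(abs(resid) > h,
                    -(tau - (resid < 0)),
                    -(resid / (2 * h) + tau - 0.5))
    grad <- crossprod(X, dloss) / nrow(X)
    momentum <- 0.9 * momentum - step * grad   # accumulate momentum from past steps
    beta <- beta + momentum
    if (max(abs(grad)) < beta_tol) break       # beta_tol-style stopping rule
  }
  drop(beta)
}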
The algorithm employs two early stopping rules: the 'check_tol' argument stops the descent based on the scaled change in the check function loss between iterations, and the 'beta_tol' argument stops it based on the largest value of the gradient vector.
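As a rough illustration of how the two tolerances might be combined (the scaling of the check-loss change and the exact form of the test are assumptions, not a transcription of the package's source):

# Illustrative stopping logic combining both tolerances.
converged <- function(grad, loss_old, loss_new,
                      beta_tol = 1e-5, check_tol = 1e-5) {
  max(abs(grad)) < beta_tol ||
    abs(loss_old - loss_new) / max(abs(loss_old), .Machine$double.eps) < check_tol
}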
Before using the full dataset, the optimizer "warms up" on a random subset of the data; 'nwarmup_samples' controls the size of that subset. 'warm_start' is an integer controlling whether this warmup happens at all (it is _strongly_ recommended).
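A small usage sketch based on the arguments documented above; the simulated data are illustrative, and the structure of the returned fit object is not described here, so no coefficient extraction is shown:

library(fqr)

# Simulated data for illustration
set.seed(42)
n <- 10000
dat <- data.frame(x1 = rnorm(n), x2 = runif(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n)

# Formula interface: median and 90th percentile, warm start on a 1000-row subsample
fit <- fqr(y ~ x1 + x2, data = dat, tau = c(0.5, 0.9),
           se = TRUE, warm_start = 1, nwarmup_samples = 1000)

# Matrix interface: fit_fqr() takes a design matrix with the intercept in column 1
X <- cbind(1, dat$x1, dat$x2)
fit_mat <- fit_fqr(X, dat$y, tau = 0.5, se = FALSE, intercept = 1)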