Adaptive Huber Estimation and Regression
This package implements Huber-type estimators for the mean, the covariance matrix, regression, and l1-regularized Huber regression (Huber-Lasso). In all of these methods, the robustification parameter τ is calibrated via a tuning-free principle.
Specifically, for Huber regression, assume the observed data vectors (Y, X) follow a linear model Y = θ0 + X θ + ε, where Y is an n-dimensional response vector, X is an n × d design matrix, and ε is an n-vector of noise variables whose distributions can be asymmetric and/or heavy-tailed. The package computes the standard Huber's M-estimator when d < n and the Huber-Lasso estimator when d > n. The vector of coefficients θ and the intercept term θ0 are estimated successively via a two-step procedure. See Wang et al., 2021 for more details.
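For reference, the Huber loss with robustification parameter τ is quadratic for residuals within τ of zero and linear beyond that, which is what limits the influence of heavy-tailed noise. The base-R sketch below illustrates the loss and its derivative. The names huber_loss and huber_score are illustrative and are not functions exported by the package.

```r
# Huber loss: quadratic for |r| <= tau, linear beyond.
# huber_loss and huber_score are illustrative names, not package functions.
huber_loss = function(r, tau) {
  ifelse(abs(r) <= tau, 0.5 * r^2, tau * abs(r) - 0.5 * tau^2)
}

# Derivative of the Huber loss: the residual clipped to [-tau, tau].
huber_score = function(r, tau) {
  pmin(pmax(r, -tau), tau)
}

r = c(-10, -1, 0, 0.5, 4)
huber_loss(r, tau = 2)
huber_score(r, tau = 2)
```

A small τ clips more residuals and gives more robustness at the cost of bias; a large τ approaches least squares.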
2022-03-04
Version 1.1 is submitted to CRAN.
Install adaHuber
from CRAN
install.packages("adaHuber")
Error: Compilation failed (with messages involving lgfortran, clang, etc.). Solution: This is a compilation error for Rcpp-based source packages. It occurs shortly after a new version is submitted to CRAN, because CRAN usually takes 3-5 days to build the binary package. Please install an older version, or wait 3-5 days and then install the updated version.
Error: unable to load shared object... Symbol not found: _EXTPTR_PTR. Solution: This issue is common in some specific versions of R when loading Rcpp-based libraries. It is an error in R caused by a minor change to EXTPTR_PTR. Upgrading R to 4.0.2 will solve the problem.
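A quick way to check whether the second issue might apply is to compare the running R version with 4.0.2, the version named above. This is a small sketch using base R only; reading the advice as "4.0.2 or later" is an assumption.

```r
# Check whether the running R is at least 4.0.2.
r_ok = getRversion() >= "4.0.2"
if (!r_ok) {
  message("R ", as.character(getRversion()),
          " detected; consider upgrading to 4.0.2 or later.")
}

# Report the installed version of adaHuber, if it is installed.
if (requireNamespace("adaHuber", quietly = TRUE)) {
  print(packageVersion("adaHuber"))
}
```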
There are five functions in this package:
adaHuber.mean: Adaptive Huber mean estimation.
adaHuber.cov: Adaptive Huber covariance estimation.
adaHuber.reg: Adaptive Huber regression.
adaHuber.lasso: Adaptive Huber-Lasso regression.
adaHuber.cv.lasso: Cross-validated adaptive Huber-Lasso regression.
Help on the functions can be accessed by typing ?, followed by the function name, at the R command prompt. For example, ?adaHuber.reg will present detailed documentation with the inputs, outputs and examples of the function adaHuber.reg.
First, we present an example of Huber mean estimation. We generate data from a t distribution, which is heavy-tailed. We estimate its mean by the tuning-free Huber mean estimator.
library(adaHuber)
n = 1000
mu = 2
X = rt(n, 2) + mu
fit.mean = adaHuber.mean(X)
fit.mean$mu
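To see what a Huber-type mean does, the following base-R sketch solves the Huber estimating equation for a fixed, hand-picked τ by iteratively reweighted averaging. It is not the package's algorithm: adaHuber.mean calibrates τ from the data, whereas here tau is an input, and huber_mean_fixed is a hypothetical helper name.

```r
# Illustrative only: Huber mean for a FIXED tau via iteratively reweighted
# averaging. The package instead calibrates tau from the data.
huber_mean_fixed = function(x, tau, tol = 1e-8, max_iter = 500) {
  mu = median(x)  # robust starting value
  for (i in seq_len(max_iter)) {
    r = x - mu
    # Weight 1 for residuals inside [-tau, tau], tau / |r| outside.
    w = pmin(1, tau / pmax(abs(r), .Machine$double.eps))
    mu_new = sum(w * x) / sum(w)
    if (abs(mu_new - mu) < tol) break
    mu = mu_new
  }
  mu_new
}

set.seed(1)
x = rt(1000, 2) + 2
c(sample_mean = mean(x), huber_fixed_tau = huber_mean_fixed(x, tau = 5))
```

With a very large tau every weight equals 1 and the result coincides with the sample mean.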
Then we present an example of Huber covariance matrix estimation. We generate data from a t distribution with 3 degrees of freedom, which is heavy-tailed.
n = 100
p = 5
X = matrix(rt(n * p, 3), n, p)
fit.cov = adaHuber.cov(X)
fit.cov$cov
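Because the coordinates are independent t variables with 3 degrees of freedom, each has variance 3 / (3 - 2) = 3, so the population covariance is 3 times the identity matrix. The sketch below uses that fact to measure the entrywise maximum error of the sample covariance as a baseline, and of the Huber estimate when the package is installed. The helper max_err and the seed are illustrative choices.

```r
set.seed(1)
n = 100
p = 5
X = matrix(rt(n * p, 3), n, p)

# Population covariance: independent t_3 coordinates, each with variance 3.
Sigma = 3 * diag(p)

# Entrywise maximum error of an estimate (max_err is an illustrative helper).
max_err = function(S) max(abs(S - Sigma))

err.sample = max_err(cov(X))
err.sample

# Compare with the adaptive Huber estimate when the package is available.
if (requireNamespace("adaHuber", quietly = TRUE)) {
  fit.cov = adaHuber::adaHuber.cov(X)
  print(max_err(fit.cov$cov))
}
```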
Next, we present an example of adaptive Huber regression. Here we generate data from a linear model Y = θ0 + X θ + ε, where ε follows a t distribution, and estimate the intercept and coefficients by tuning-free Huber regression.
n = 200
p = 10
beta = rep(1.5, p + 1)
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(1, X) %*% beta + err
fit.adahuber = adaHuber.reg(X, Y, method = "adaptive")
beta.adahuber = fit.adahuber$coef
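One way to judge the fit is to compare its estimation error with that of ordinary least squares on the same data. The sketch below computes the l2 error of the OLS baseline with base R and, if the package is installed, the error of the adaptive Huber fit. It assumes that the returned coef vector is ordered as (intercept, slopes), matching beta; l2_err and the seed are illustrative choices.

```r
set.seed(1)
n = 200
p = 10
beta = rep(1.5, p + 1)
X = matrix(rnorm(n * p), n, p)
Y = cbind(1, X) %*% beta + rt(n, 2)

# Ordinary least squares baseline; lm() adds the intercept itself.
beta.ols = as.numeric(coef(lm(drop(Y) ~ X)))

# l2 distance to the true coefficients (l2_err is an illustrative helper).
l2_err = function(est) sqrt(sum((as.numeric(est) - beta)^2))
l2_err(beta.ols)

if (requireNamespace("adaHuber", quietly = TRUE)) {
  fit = adaHuber::adaHuber.reg(X, Y, method = "adaptive")
  # Assumes fit$coef is ordered as (intercept, slopes), matching beta.
  print(l2_err(fit$coef))
}
```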
Finally, we illustrate the use of l1-regularized Huber regression. Again, we generate data from a linear model Y = θ0 + X θ + ε, where θ is a high-dimensional sparse vector and ε follows a t distribution. We estimate the intercept and coefficients by Huber-Lasso regression, where the regularization parameter λ is calibrated by K-fold cross-validation, and the robustification parameter τ is chosen by a tuning-free procedure.
n = 100; p = 200; s = 5
beta = c(rep(1.5, s + 1), rep(0, p - s))
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(rep(1, n), X) %*% beta + err
fit.lasso = adaHuber.cv.lasso(X, Y)
beta.lasso = fit.lasso$coef
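For a sparse model it is often useful to check which coefficients were selected. The sketch below counts true positives, false positives and false negatives among the slope coefficients. support_summary is a hypothetical helper, not part of the package, and it assumes both vectors carry the intercept in the first position.

```r
# support_summary is an illustrative helper, not a package function.
# est and truth are coefficient vectors with the intercept in position 1.
support_summary = function(est, truth, tol = 1e-8) {
  est_nz = abs(as.numeric(est)[-1]) > tol  # drop the intercept
  true_nz = truth[-1] != 0
  c(true_pos = sum(est_nz & true_nz),
    false_pos = sum(est_nz & !true_nz),
    false_neg = sum(!est_nz & true_nz))
}

# Toy check with a hand-made estimate.
truth = c(1.5, 1.5, 1.5, 0, 0)
est = c(1.4, 1.2, 0, 0.3, 0)
support_summary(est, truth)
```

Under the same assumption about coefficient order, support_summary(beta.lasso, beta) would summarize the fit from the example above.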
License: GPL-3.0
System requirements: C++11
Authors: Xiaoou Pan <xip024@ucsd.edu>, Wen-Xin Zhou <wez243@ucsd.edu>
Maintainer: Xiaoou Pan <xip024@ucsd.edu>
Eddelbuettel, D. and Francois, R. (2011). Rcpp: Seamless R and C++ integration. J. Stat. Softw. 40 1-18.
Fan, J., Liu, H., Sun, Q. and Zhang, T. (2018). I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46 814-841.
Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statist. Sci. 34 454-471.
Pan, X., Sun, Q. and Zhou, W.-X. (2021). Iteratively reweighted l1-penalized robust regression. Electron. J. Stat. 15 3287-3348.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Stat. Assoc. 115 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). A new principle for tuning-free Huber regression. Stat. Sinica 31 2153-2177.