robustlm: Robust variable selection with exponential squared loss


View source: R/robustlm.R

Description

robustlm carries out robust variable selection with exponential squared loss. A block coordinate gradient descent algorithm is used to minimize the loss function.

Usage

robustlm(x, y, gamma = NULL, weight = NULL, intercept = TRUE)

Arguments

x

Input matrix of dimension nobs * nvars; each row is an observation vector. Must be supplied as a matrix.

y

Response variable. Should be a numeric vector or a single-column matrix.

gamma

Tuning parameter in the loss function, which controls the degree of robustness and efficiency of the regression estimators. The loss function is defined as

1-exp(-t^2/γ).

In the extreme case of a large gamma, the estimators behave like the least squares estimators. A smaller gamma limits the influence of outliers on the estimators, although it can also reduce the estimators' efficiency. If gamma=NULL, it is selected by a data-driven procedure that yields both high robustness and high efficiency.

weight

Weight in the penalty. The penalty is given by

n ∑_{j=1}^{d} λ_{nj} |β_j|.

weight is a vector consisting of the λ_{nj}s. If weight=NULL (the default), it is set to (log(n))/(n|\tilde{β}_j|), where \tilde{β} is an initial estimator of the regression coefficients obtained by an MM procedure. This default satisfies a BIC-type criterion (see Details).

intercept

Should intercepts be fitted (TRUE) or set to zero (FALSE)?
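The roles of gamma and weight described above can be sketched numerically. The helper expsq_loss and the initial estimates beta_tilde below are illustrative assumptions, not part of the package:

```r
# Sketch (assumptions, not the package's internals) of the two defaults above.

# 1. The loss 1 - exp(-t^2 / gamma): a small gamma caps an outlier's
#    contribution at 1; a large gamma makes the loss behave like
#    t^2 / gamma, i.e. least squares-like.
expsq_loss <- function(t, gamma) 1 - exp(-t^2 / gamma)
expsq_loss(10, gamma = 1)     # large residual, small gamma: bounded near 1
expsq_loss(0.1, gamma = 100)  # small residual, large gamma: ~0.1^2 / 100

# 2. The default penalty weights log(n) / (n * |beta_tilde_j|), where
#    beta_tilde is a hypothetical initial (e.g. MM) coefficient estimate.
n <- 100
beta_tilde <- c(1.2, -0.5, 0.01)
weight <- log(n) / (n * abs(beta_tilde))
weight
```

Note that coefficients whose initial estimates are near zero receive a large penalty weight, which is what drives them to exactly zero in the adaptive LASSO fit.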

Details

robustlm solves the following optimization problem to obtain robust estimators of regression coefficients:

argmin_{β} ∑_{i=1}^n (1-exp{-(y_i-x_i^Tβ)^2/γ_n}) + n ∑_{j=1}^d p_{λ_{nj}}(|β_j|),

where p_{λ_{nj}}(|β_j|)=λ_{nj}|β_j| is the adaptive LASSO penalty. A block coordinate gradient descent algorithm is used to solve the optimization problem efficiently. The tuning parameter gamma and the regularization parameter weight are chosen adaptively by default, but they can also be supplied by the user. Specifically, the default weight satisfies the following BIC-type criterion:

min_{τ_n} ∑_{i=1}^{n}[1-exp{-(Y_i-x_i^Tβ)^2/γ_n}] + n ∑_{j=1}^{d} τ_{nj}|β_j|/|\tilde{β}_{nj}| - ∑_{j=1}^{d} log(0.5 n τ_{nj}) log(n).
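The bounded-influence behavior of the exponential squared loss can be illustrated with plain gradient descent on a single slope. This is only a sketch of the loss being minimized, not the package's block coordinate implementation; the simulated data, step size, and iteration count are illustrative assumptions:

```r
# Sketch only: gradient descent on the unpenalized exponential squared loss
# for a single slope b, i.e. minimize sum_i (1 - exp(-(y_i - x_i b)^2 / gamma)).
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.1)  # true slope is 2
y[1:3] <- y[1:3] + 10             # three gross outliers
gamma <- 1

grad <- function(b) {
  r <- y - x * b
  # derivative of sum(1 - exp(-r^2 / gamma)) with respect to b
  sum(-2 * r * x * exp(-r^2 / gamma) / gamma)
}

b <- 0
for (k in 1:500) b <- b - 0.005 * grad(b)

b_ls <- sum(x * y) / sum(x^2)  # ordinary least squares slope, pulled by outliers
c(robust = b, least_squares = b_ls)
```

Because exp(-r^2/γ) vanishes for large residuals, the three outliers contribute almost nothing to the gradient, so the robust slope stays near the true value 2 while the least squares slope is dragged away from it.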

Value

An object with S3 class "robustlm", which is a list with the following components:

beta

The regression coefficients.

alpha

The intercept.

gamma

The tuning parameter used in the loss.

weight

The regularization parameters.

loss

Value of the loss function calculated on the training set.

Author(s)

Borui Tang, Jin Zhu, Xueqin Wang

References

Xueqin Wang, Yunlu Jiang, Mian Huang & Heping Zhang (2013) Robust Variable Selection With Exponential Squared Loss, Journal of the American Statistical Association, 108:502, 632-643, DOI: 10.1080/01621459.2013.766613

Tseng, P., Yun, S. A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387-423 (2009). https://doi.org/10.1007/s10107-007-0170-0

Examples

library(MASS)

# Simulation settings: N = 100 observations, p = 8 predictors with
# compound-symmetric correlation rho = 0.2.
N <- 100
p <- 8
rho <- 0.2
mu <- rep(0, p)
Sigma <- rho * outer(rep(1, p), rep(1, p)) + (1 - rho) * diag(p)

# True coefficients with alternating signs and slowly decaying magnitudes.
ind <- 1:p
beta <- (-1)^ind * exp(-2 * (ind - 1) / 20)

# Generate predictors and a response scaled to a signal-to-noise ratio of 3.
X <- mvrnorm(N, mu, Sigma)
Z <- rnorm(N, 0, 1)
k <- sqrt(var(X %*% beta) / (3 * var(Z)))
Y <- X %*% beta + drop(k) * Z

robustlm(X, Y)

robustlm documentation built on March 22, 2021, 5:06 p.m.