robustlm: Robust variable selection with exponential squared loss


View source: R/robustlm.R

Description

robustlm carries out robust variable selection with exponential squared loss. A block coordinate gradient descent algorithm is used to minimize the loss function.

Usage

robustlm(x, y, gamma = NULL, weight = NULL, intercept = TRUE)

Arguments

x

Input matrix of dimension nobs * nvars; each row is an observation vector. Must be supplied as a matrix.

y

Response variable. Should be a numeric vector or a single-column matrix.

gamma

Tuning parameter in the loss function, which controls the degree of robustness and efficiency of the regression estimators. The loss function is defined as

1-exp(-t^2/γ).

In the extreme case of a large gamma, the estimators behave like the least squares estimators. A smaller gamma limits the influence of outliers on the estimators, although it can also reduce the estimators' efficiency. If gamma=NULL, it is selected by a data-driven procedure that yields both high robustness and high efficiency.

weight

Weight in the penalty. The penalty is given by

n ∑_{j=1}^{d} λ_{nj} |β_j|.

weight is a vector consisting of the λ_{nj}s. If weight=NULL (the default), it is set to (log(n))/(n|\tilde{β}_j|), where \tilde{β} is an initial estimator of the regression coefficients obtained by an MM procedure. This default satisfies a BIC-type criterion (see Details).

intercept

Should intercepts be fitted (TRUE) or set to zero (FALSE)?
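The roles of gamma and weight described above can be sketched numerically. The helper expsq_loss and the initial estimates beta_tilde below are illustrative assumptions, not part of the package:

```r
# Sketch (assumptions, not the package's internals) of the two defaults above.

# 1. The loss 1 - exp(-t^2 / gamma): a small gamma caps an outlier's
#    contribution at 1; a large gamma makes the loss behave like
#    t^2 / gamma, i.e. least squares-like.
expsq_loss <- function(t, gamma) 1 - exp(-t^2 / gamma)
expsq_loss(10, gamma = 1)     # large residual, small gamma: bounded near 1
expsq_loss(0.1, gamma = 100)  # small residual, large gamma: ~0.1^2 / 100

# 2. The default penalty weights log(n) / (n * |beta_tilde_j|), where
#    beta_tilde is a hypothetical initial (e.g. MM) coefficient estimate.
n <- 100
beta_tilde <- c(1.2, -0.5, 0.01)
weight <- log(n) / (n * abs(beta_tilde))
weight
```

Note that coefficients whose initial estimates are near zero receive a large penalty weight, which is what drives them to exactly zero in the adaptive LASSO fit.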

Details

robustlm solves the following optimization problem to obtain robust estimators of regression coefficients:

argmin_{β} ∑_{i=1}^n (1-exp{-(y_i-x_i^Tβ)^2/γ_n}) + n ∑_{j=1}^d p_{λ_{nj}}(|β_j|),

where p_{λ_{nj}}(|β_j|)=λ_{nj}|β_j| is the adaptive LASSO penalty. A block coordinate gradient descent algorithm is used to solve the optimization problem efficiently. The tuning parameter gamma and the regularization parameter weight are chosen adaptively by default, but they can also be supplied by the user. Specifically, the default weight satisfies the following BIC-type criterion:

min_{τ_n} ∑_{i=1}^{n}[1-exp{-(Y_i-x_i^Tβ)^2/γ_n}] + n ∑_{j=1}^{d} τ_{nj}|β_j|/|\tilde{β}_{nj}| - ∑_{j=1}^{d} log(0.5 n τ_{nj}) log(n).
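The bounded-influence behavior of the exponential squared loss can be illustrated with plain gradient descent on a single slope. This is only a sketch of the loss being minimized, not the package's block coordinate implementation; the simulated data, step size, and iteration count are illustrative assumptions:

```r
# Sketch only: gradient descent on the unpenalized exponential squared loss
# for a single slope b, i.e. minimize sum_i (1 - exp(-(y_i - x_i b)^2 / gamma)).
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.1)  # true slope is 2
y[1:3] <- y[1:3] + 10             # three gross outliers
gamma <- 1

grad <- function(b) {
  r <- y - x * b
  # derivative of sum(1 - exp(-r^2 / gamma)) with respect to b
  sum(-2 * r * x * exp(-r^2 / gamma) / gamma)
}

b <- 0
for (k in 1:500) b <- b - 0.005 * grad(b)

b_ls <- sum(x * y) / sum(x^2)  # ordinary least squares slope, pulled by outliers
c(robust = b, least_squares = b_ls)
```

Because exp(-r^2/γ) vanishes for large residuals, the three outliers contribute almost nothing to the gradient, so the robust slope stays near the true value 2 while the least squares slope is dragged away from it.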

Value

An object with S3 class "robustlm", which is a list with the following components:

beta

The regression coefficients.

alpha

The intercept.

gamma

The tuning parameter used in the loss.

weight

The regularization parameters.

loss

Value of the loss function calculated on the training set.

Author(s)

Borui Tang, Jin Zhu, Xueqin Wang

References

Xueqin Wang, Yunlu Jiang, Mian Huang & Heping Zhang (2013) Robust Variable Selection With Exponential Squared Loss, Journal of the American Statistical Association, 108:502, 632-643, DOI: 10.1080/01621459.2013.766613

Tseng, P., Yun, S. A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117, 387-423 (2009). https://doi.org/10.1007/s10107-007-0170-0

Examples

library(MASS)

# Simulation settings: N = 100 observations, p = 8 predictors with
# compound-symmetric correlation rho = 0.2.
N <- 100
p <- 8
rho <- 0.2
mu <- rep(0, p)
Sigma <- rho * outer(rep(1, p), rep(1, p)) + (1 - rho) * diag(p)

# True coefficients with alternating signs and slowly decaying magnitudes.
ind <- 1:p
beta <- (-1)^ind * exp(-2 * (ind - 1) / 20)

# Generate predictors and a response scaled to a signal-to-noise ratio of 3.
X <- mvrnorm(N, mu, Sigma)
Z <- rnorm(N, 0, 1)
k <- sqrt(var(X %*% beta) / (3 * var(Z)))
Y <- X %*% beta + drop(k) * Z

robustlm(X, Y)

robustlm documentation built on March 22, 2021, 5:06 p.m.