The function estimates the coefficients of a Lasso regression with
data-driven penalty under homoscedasticity and heteroscedasticity with non-Gaussian noise and X-dependent or X-independent design. The
method of the data-driven penalty can be chosen. The object which is
returned is of the S3 class rlasso.
rlasso(x, ...)
## S3 method for class 'formula'
rlasso(formula, data, post = TRUE, intercept = TRUE,
model = TRUE, penalty = list(homoscedastic = FALSE, X.dependent.lambda =
FALSE, lambda.start = NULL, c = 1.1, gamma = 0.1/log(n)),
control = list(numIter = 15, tol = 10^-5, threshold = NULL), ...)
## Default S3 method:
rlasso(x, y, post = TRUE, intercept = TRUE,
model = TRUE, penalty = list(homoscedastic = FALSE, X.dependent.lambda =
FALSE, lambda.start = NULL, c = 1.1, gamma = 0.1/log(n)),
control = list(numIter = 15, tol = 10^-5, threshold = NULL), ...)

x 
regressors (vector, matrix or object that can be coerced to a matrix) 
... 
further arguments (only for consistent definition of methods) 
formula 
an object of class "formula" (or one that can be coerced to
that class): a symbolic description of the model to be fitted in the form

data 
an optional data frame, list or environment (or object coercible
by as.data.frame to a data frame) containing the variables in the model. If
not found in data, the variables are taken from environment(formula),
typically the environment from which 
post 
logical. If 
intercept 
logical. If 
model 
logical. If 
penalty 
list with options for the calculation of the penalty.

control 
list with control values.

y 
dependent variable (vector, matrix or object that can be coerced to a matrix) 
The function estimates the coefficients of a Lasso regression with
data-driven penalty under homoscedasticity or heteroscedasticity and non-Gaussian noise. The option homoscedastic is a logical that is FALSE by default.
In addition, the penalization parameter can be chosen to depend on the design matrix (X.dependent.lambda=TRUE) or to be independent of it (default, X.dependent.lambda=FALSE).
The default value of the constant c is 1.1 in the post-Lasso case and 0.5 in the Lasso case.
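To make the roles of c and gamma concrete, here is a minimal base-R sketch of the X-independent penalty level as it is usually written in Belloni et al. (2012), lambda0 = 2 * c * sqrt(n) * qnorm(1 - gamma / (2 * p)). The formula is taken from that literature, not from this page, and the helper name penalty_level is hypothetical; whether rlasso evaluates exactly this expression internally is an assumption.

```r
# Sketch only: X-independent penalty level in the form commonly given in
# Belloni et al. (2012). `penalty_level` is a hypothetical helper, not part
# of any package.
penalty_level <- function(n, p, c, gamma) {
  2 * c * sqrt(n) * qnorm(1 - gamma / (2 * p))
}

n <- 100                 # sample size
p <- 100                 # number of regressors
gamma <- 0.1 / log(n)    # default gamma from the usage section

lambda0_post  <- penalty_level(n, p, c = 1.1, gamma = gamma)  # post-Lasso default c
lambda0_lasso <- penalty_level(n, p, c = 0.5, gamma = gamma)  # Lasso default c

# The penalty level is linear in c, so the smaller constant gives a
# proportionally smaller penalty.
print(c(post = lambda0_post, lasso = lambda0_lasso))
```

Because the expression is linear in c, switching from the post-Lasso default to the Lasso default scales the penalty level by 0.5 / 1.1.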
A special option is to set homoscedastic to "none" and to supply a value for lambda.start. This value is then used as the penalty parameter, with independent design and heteroscedastic errors used to weight the regressors.
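As an illustration of how the penalty options are passed, the following sketch fits the same simulated data with the default heteroscedastic penalty and with homoscedastic = TRUE. It assumes the hdm package (which provides rlasso) is installed and skips the fits otherwise; the simulated data and the object names fit_het and fit_hom are made up for the example, and the penalty lists simply repeat the defaults shown in the usage section with one entry changed.

```r
# Sketch only: simulated data, not taken from the package documentation.
set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), ncol = p)
colnames(X) <- paste0("V", 1:p)
Y <- as.vector(X[, 1:3] %*% rep(5, 3) + rnorm(n))

if (requireNamespace("hdm", quietly = TRUE)) {
  # Default: heteroscedastic errors, X-independent penalty.
  fit_het <- hdm::rlasso(x = X, y = Y, post = TRUE,
    penalty = list(homoscedastic = FALSE, X.dependent.lambda = FALSE,
                   lambda.start = NULL, c = 1.1, gamma = 0.1 / log(n)))

  # Same call, but assuming homoscedastic errors.
  fit_hom <- hdm::rlasso(x = X, y = Y, post = TRUE,
    penalty = list(homoscedastic = TRUE, X.dependent.lambda = FALSE,
                   lambda.start = NULL, c = 1.1, gamma = 0.1 / log(n)))

  # Number of selected regressors under each penalty choice.
  print(c(heteroscedastic = sum(fit_het$index),
          homoscedastic   = sum(fit_hom$index)))
}
```

Spelling out the full penalty list avoids relying on how missing entries are filled in, which this page does not describe.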
For details of the implementation of the algorithm for estimating the data-driven penalty,
in particular the regressor-independent loadings, we refer to Appendix A in
Belloni et al. (2012). When the option "none" is chosen for homoscedastic (together with
lambda.start), lambda is set to lambda.start and the
regressor-independent loadings under heteroscedasticity are used. The options "X-dependent" and
"X-independent" under homoscedasticity are described in Belloni et al. (2013).
The option post=TRUE conducts post-Lasso estimation, i.e. a refit of
the model with the selected variables.
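The refit step can be sketched in base R as ordinary least squares restricted to the selected columns. The selection vector below is a hypothetical stand-in for the result of a Lasso step (it simply marks the first s columns), and the data are simulated; this illustrates the idea of post-Lasso and is not the package's internal code.

```r
# Sketch only: post-Lasso as OLS on the selected regressors.
set.seed(1)
n <- 100
p <- 20
s <- 3
X <- matrix(rnorm(n * p), ncol = p)
beta <- c(rep(5, s), rep(0, p - s))
y <- as.vector(X %*% beta + rnorm(n))

# Hypothetical selection result: pretend the Lasso step kept the first s columns.
selected <- c(rep(TRUE, s), rep(FALSE, p - s))

# Refit by least squares using an intercept and the selected columns only.
X_sel <- cbind(Intercept = 1, X[, selected, drop = FALSE])
refit <- lm.fit(X_sel, y)

# Map the refitted slopes back to a length-p coefficient vector;
# unselected regressors keep a coefficient of exactly zero.
beta_post <- numeric(p)
beta_post[selected] <- refit$coefficients[-1]
intercept_post <- unname(refit$coefficients[1])
```

The refit removes the shrinkage that the Lasso penalty applies to the selected coefficients, while unselected coefficients stay at zero.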
rlasso returns an object of class rlasso. An object of
class "rlasso" is a list containing at least the following components:
coefficients 
parameter estimates 
beta 
parameter estimates (named vector of coefficients without intercept) 
intercept 
value of the intercept 
index 
index of selected variables (logical vector) 
lambda 
data-driven penalty term for each variable, product of lambda0 (the penalization parameter) and the loadings 
lambda0 
penalty term 
loadings 
loading for each regressor 
residuals 
residuals, response minus fitted values 
sigma 
root of the variance of the residuals 
iter 
number of iterations 
call 
function call 
options 
options 
model 
model matrix (if 
A. Belloni, D. Chen, V. Chernozhukov and C. Hansen (2012). Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica 80 (6), 2369-2429.
A. Belloni, V. Chernozhukov and C. Hansen (2013). Inference for high-dimensional sparse econometric models. In Advances in Economics and Econometrics: 10th World Congress, Vol. 3: Econometrics, Cambridge University Press: Cambridge, 245-295.
library(hdm) # provides rlasso()
set.seed(1)
n = 100 # sample size
p = 100 # number of variables
s = 3 # number of variables with non-zero coefficients
X = Xnames = matrix(rnorm(n*p), ncol=p)
colnames(Xnames) <- paste("V", 1:p, sep="")
beta = c(rep(5,s), rep(0,p-s))
Y = X%*%beta + rnorm(n)
reg.lasso <- rlasso(Y~Xnames)
Xnew = matrix(rnorm(n*p), ncol=p) # new X
colnames(Xnew) <- paste("V", 1:p, sep="")
Ynew = Xnew%*%beta + rnorm(n) # new Y
yhat = predict(reg.lasso, newdata = Xnew)
