rlasso: Function for Lasso estimation under homoscedastic and... In hdm: High-Dimensional Metrics

Description

The function estimates the coefficients of a Lasso regression with data-driven penalty under homoscedasticity and heteroscedasticity with non-Gaussian noise and X-dependent or X-independent design. The method used to compute the data-driven penalty can be chosen. The returned object is of the S3 class `rlasso`.

Usage

```
rlasso(x, ...)

## S3 method for class 'formula'
rlasso(formula, data = NULL, post = TRUE, intercept = TRUE,
  model = TRUE, penalty = list(homoscedastic = FALSE,
  X.dependent.lambda = FALSE, lambda.start = NULL, c = 1.1,
  gamma = 0.1/log(n)), control = list(numIter = 15, tol = 10^-5,
  threshold = NULL), ...)

## S3 method for class 'character'
rlasso(x, data = NULL, post = TRUE, intercept = TRUE,
  model = TRUE, penalty = list(homoscedastic = FALSE,
  X.dependent.lambda = FALSE, lambda.start = NULL, c = 1.1,
  gamma = 0.1/log(n)), control = list(numIter = 15, tol = 10^-5,
  threshold = NULL), ...)

## Default S3 method:
rlasso(x, y, post = TRUE, intercept = TRUE, model = TRUE,
  penalty = list(homoscedastic = FALSE, X.dependent.lambda = FALSE,
  lambda.start = NULL, c = 1.1, gamma = 0.1/log(n)),
  control = list(numIter = 15, tol = 10^-5, threshold = NULL), ...)
```

Arguments

`x` regressors (vector, matrix, or object that can be coerced to a matrix)

`...` further arguments (only for consistent definition of methods)

`formula` an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted, in the form `y~x`

`data` an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `rlasso` is called.

`post` logical. If `TRUE`, post-Lasso estimation is conducted.

`intercept` logical. If `TRUE`, an unpenalized intercept is included.

`model` logical. If `TRUE` (default), the model matrix is returned.

`penalty` list with options for the calculation of the penalty.

- `c` and `gamma`: constants for the penalty, with defaults `c=1.1` and `gamma=0.1/log(n)`
- `homoscedastic`: logical, whether homoscedastic errors are assumed (default `FALSE`). Option `none` is described below.
- `X.dependent.lambda`: logical. `TRUE` if the penalization parameter depends on the design matrix `x`; `FALSE` if it is independent of the design matrix (default).
- `numSim`: number of simulations for the dependent methods, default=5000
- `lambda.start`: initial penalization value, compulsory for method "none"

`control` list with control values.

- `numIter`: number of iterations for the algorithm that estimates the variance and the data-driven penalty, i.e. the loadings
- `tol`: tolerance for improvement of the estimated variances
- `threshold`: applied to the final estimated Lasso coefficients. Absolute values below the threshold are set to zero.

`y` dependent variable (vector, matrix, or object that can be coerced to a matrix)
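As a rough illustration of how the `penalty` and `control` lists are passed, the sketch below calls the default method with every list entry written out. The simulated data, the choice `homoscedastic = TRUE`, and the `threshold` value of 0.05 are assumptions made for this example only. Note that `gamma = 0.1 / log(n)` is evaluated in the calling environment here, so `n` has to exist before the call.

```r
library(hdm)

set.seed(1)
n <- 100                                   # sample size (illustrative)
p <- 20                                    # number of regressors (illustrative)
X <- matrix(rnorm(n * p), ncol = p)
y <- 2 * X[, 1] - X[, 2] + rnorm(n)        # only two regressors matter

# All entries of both lists are spelled out, mirroring the documented defaults
# except for homoscedastic and threshold, which are changed for illustration.
fit <- rlasso(
  X, y,
  post = TRUE,
  intercept = TRUE,
  penalty = list(homoscedastic = TRUE, X.dependent.lambda = FALSE,
                 lambda.start = NULL, c = 1.1, gamma = 0.1 / log(n)),
  control = list(numIter = 15, tol = 10^-5, threshold = 0.05)
)

print(class(fit))
```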

Details

The function estimates the coefficients of a Lasso regression with data-driven penalty under homoscedasticity or heteroscedasticity and non-Gaussian noise. The option `homoscedastic` is a logical with default `FALSE`. For the calculation of the penalty parameter, one can also choose whether the penalization parameter depends on the design matrix (`X.dependent.lambda=TRUE`) or is independent of it (default, `X.dependent.lambda=FALSE`). The default value of the constant `c` is `1.1` in the post-Lasso case and `0.5` in the Lasso case. A special option is to set `homoscedastic` to `none` and to supply a value for `lambda.start`. In that case lambda is set to `lambda.start`, and the regressor-independent loadings under heteroscedasticity are used to weight the regressors. For details of the implementation of the algorithm for estimating the data-driven penalty, in particular the regressor-independent loadings, we refer to Appendix A in Belloni et al. (2012). The options "X-dependent" and "X-independent" under homoscedasticity are described in Belloni et al. (2013).
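To give a feel for the role of `c` and `gamma`, the following base-R sketch computes an X-independent penalty level with the rule lambda0 = 2 c sqrt(n) qnorm(1 - gamma/(2p)) discussed in Belloni et al. (2012). That this rule corresponds to the X-independent option is an assumption of the sketch, and the helper name `lambda0_x_independent` is hypothetical; it is not a function of the package.

```r
# Hypothetical helper: X-independent penalty level, assuming the rule
# lambda0 = 2 * c * sqrt(n) * qnorm(1 - gamma / (2 * p)).
lambda0_x_independent <- function(n, p, c = 1.1, gamma = 0.1 / log(n)) {
  2 * c * sqrt(n) * qnorm(1 - gamma / (2 * p))
}

n <- 100   # sample size
p <- 100   # number of regressors

lambda0 <- lambda0_x_independent(n, p)
print(lambda0)

# A larger c or a smaller gamma gives a larger penalty level
print(lambda0_x_independent(n, p, c = 1.5))
print(lambda0_x_independent(n, p, gamma = 0.01))
```

Under this rule the penalty level grows with the number of regressors `p`, which reflects the need to guard against more candidate variables.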

The option `post=TRUE` conducts post-lasso estimation, i.e. a refit of the model with the selected variables.
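The idea of a refit on the selected variables can be sketched by hand: run the Lasso without the post step, read off the selected columns from `index`, and refit by ordinary least squares. The simulated data below are an assumption of the example. The manual refit and the built-in `post=TRUE` result need not agree exactly, for instance if the two runs select different variables.

```r
library(hdm)

set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), ncol = p)
colnames(X) <- paste0("V", 1:p)
y <- 2 * X[, 1] - X[, 2] + rnorm(n)

# Lasso without the post-Lasso refit
lasso_fit <- rlasso(X, y, post = FALSE)
selected <- lasso_fit$index          # logical vector, one entry per regressor

# Manual refit by ordinary least squares on the selected columns only
if (any(selected)) {
  refit <- lm(y ~ X[, selected, drop = FALSE])
  print(coef(refit))
}

# Built-in post-Lasso fit, shown for comparison (non-zero coefficients only)
post_fit <- rlasso(X, y, post = TRUE)
print(post_fit$coefficients[post_fit$coefficients != 0])
```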

Value

`rlasso` returns an object of class `rlasso`. An object of class "rlasso" is a list containing at least the following components:

`coefficients` parameter estimates

`beta` parameter estimates (named vector of coefficients without intercept)

`intercept` value of the intercept

`index` index of selected variables (logical vector)

`lambda` data-driven penalty term for each variable, product of lambda0 (the penalization parameter) and the loadings

`lambda0` penalty term

`loadings` loading for each regressor

`residuals` residuals, response minus fitted values

`sigma` root of the variance of the residuals

`iter` number of iterations

`call` function call

`options` options

`model` model matrix (if `model = TRUE` in function call)
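A short sketch of how these components might be inspected after a fit. The simulated data are an assumption of the example; the component names are the ones listed above.

```r
library(hdm)

set.seed(1)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), ncol = p)
colnames(X) <- paste0("V", 1:p)
y <- 2 * X[, 1] - X[, 2] + rnorm(n)

fit <- rlasso(X, y)

print(fit$intercept)            # value of the intercept
print(fit$beta[fit$index])      # slope estimates of the selected variables
print(which(fit$index))         # positions of the selected variables
print(fit$lambda0)              # penalty term
print(head(fit$loadings))       # loadings of the first few regressors
print(fit$sigma)                # root of the residual variance
print(fit$iter)                 # number of iterations used
```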

References

A. Belloni, D. Chen, V. Chernozhukov and C. Hansen (2012). Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica 80 (6), 2369-2429.

A. Belloni, V. Chernozhukov and C. Hansen (2013). Inference for high-dimensional sparse econometric models. In Advances in Economics and Econometrics: 10th World Congress, Vol. 3: Econometrics, Cambridge University Press: Cambridge, 245-295.

Examples

```
library(hdm)
set.seed(1)
n = 100 # sample size
p = 100 # number of variables
s = 3 # number of variables with non-zero coefficients
X = Xnames = matrix(rnorm(n*p), ncol=p)
colnames(Xnames) <- paste("V", 1:p, sep="")
beta = c(rep(5,s), rep(0,p-s))
Y = X%*%beta + rnorm(n)
reg.lasso <- rlasso(Y~Xnames)
Xnew = matrix(rnorm(n*p), ncol=p) # new X
colnames(Xnew) <- paste("V", 1:p, sep="")
Ynew = Xnew%*%beta + rnorm(n) # new Y
yhat = predict(reg.lasso, newdata = Xnew)
```


hdm documentation built on Jan. 24, 2018, 1:02 a.m.