admm_lasso: Fitting A Lasso Model Using ADMM Algorithm
In yixuan/ADMM: Solving Statistical Optimization Problems Using the ADMM Algorithm



Lasso is a popular variable selection technique in high-dimensional regression analysis. It seeks the coefficient vector β that minimizes

1/(2n) * ||y - X * β||_2^2 + λ * ||β||_1

Here n is the sample size and λ is a regularization parameter that controls the sparseness of β.
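As a quick illustration of the objective above, the following base R sketch evaluates it for a given coefficient vector. The function name `lasso_objective` is hypothetical and not part of the package, and the sketch ignores the intercept and any standardization.

```r
# Hypothetical helper (not part of the ADMM package): evaluates
# 1/(2n) * ||y - X * beta||_2^2 + lambda * ||beta||_1 with no intercept.
lasso_objective <- function(x, y, beta, lambda) {
  n <- nrow(x)
  resid <- y - drop(x %*% beta)
  sum(resid^2) / (2 * n) + lambda * sum(abs(beta))
}

# Tiny worked case: residual (0, 2), so 4 / (2 * 2) + 0.5 * 1
lasso_objective(diag(2), c(1, 2), beta = c(1, 0), lambda = 0.5)  # 1.5
```

A larger λ increases the cost of every nonzero coefficient, which is why the minimizer becomes sparser as λ grows.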

This function does not perform the computation directly. Instead, it returns an object of class "ADMM_Lasso" that contains several member functions used to construct and fit the model.

Member functions that are callable from this object are listed below:

`$penalty()`	Specify the penalty parameter. See section Setting Penalty Parameter for details.
`$parallel()`	Specify the number of threads for parallel computing. See section Parallel Computing for details.
`$opts()`	Set additional options. See section Additional Options for details.
`$fit()`	Fit the model and do the actual computation. See section Model Fitting for details.

admm_lasso(x, y, intercept = TRUE, standardize = TRUE, ...)

`x`	The data matrix
`y`	The response vector
`intercept`	Whether to fit an intercept in the model. Default is `TRUE`.
`standardize`	Whether to standardize the explanatory variables before fitting the model. Default is `TRUE`. Fitted coefficients are always returned on the original scale.

The penalty parameter λ can be set through the member function $penalty(), with the usage and parameters given below:

model$penalty(lambda = NULL, nlambda = 100, lambda_min_ratio, ...)

lambda: A user-provided sequence of λ values. If set to NULL, the program will calculate its own sequence according to nlambda and lambda_min_ratio. The sequence starts from λ_0 (at this λ all coefficients will be zero) and ends at λ_0 * lambda_min_ratio, containing nlambda values equally spaced on the log scale. It is recommended to leave this parameter as NULL (the default).
nlambda: Number of values in the λ sequence. Only used when the program calculates its own λ (by setting lambda = NULL).
lambda_min_ratio: Smallest value in the λ sequence as a fraction of λ_0. See the explanation of the lambda argument. This parameter is only used when the program calculates its own λ (by setting lambda = NULL). The default value is the same as glmnet: 0.0001 if nrow(x) >= ncol(x) and 0.01 otherwise.

This member function will implicitly return the "ADMM_Lasso" object itself.
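The automatic λ sequence described above can be sketched in base R. This is an illustration under stated assumptions, not the package's code: the name `lambda_path_sketch` is hypothetical, and λ_0 is taken to be max|Xᵀy| / n on centered data, which is the value at which β = 0 minimizes the objective above when an intercept is fitted and no standardization is applied. The package may compute λ_0 differently, for example after standardizing x.

```r
# Hypothetical sketch (not the package's implementation) of a log-spaced
# lambda sequence from lambda_0 down to lambda_0 * lambda_min_ratio.
lambda_path_sketch <- function(x, y, nlambda = 100,
                               lambda_min_ratio =
                                 if (nrow(x) >= ncol(x)) 1e-4 else 1e-2) {
  n <- nrow(x)
  # Assumption: center x and y, as when an intercept is fitted.
  xc <- scale(x, center = TRUE, scale = FALSE)
  yc <- y - mean(y)
  # Smallest lambda for which beta = 0 solves the centered problem.
  lambda0 <- max(abs(crossprod(xc, yc))) / n
  # Equal spacing on the log scale, in decreasing order.
  exp(seq(log(lambda0), log(lambda0 * lambda_min_ratio), length.out = nlambda))
}
```

Calling `lambda_path_sketch(x, y, nlambda = 20, lambda_min_ratio = 0.01)` yields 20 decreasing values whose last element is 1% of the first.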

The Lasso model can be fitted with parallel computing by setting the number of threads in the $parallel() member function. The usage of this method is

model$parallel(nthread = 2, ...)

Here model is the object returned by admm_lasso(), and nthread is the number of threads to be used. nthread must be less than ncol(x) / 5.

NOTE: Even in the serial version of admm_lasso(), most matrix operations are implicitly parallelized when the proper compiler options are turned on. Hence the parallel version of admm_lasso() is not necessarily faster than the serial one.

This member function will implicitly return the "ADMM_Lasso" object itself.

Additional options related to the ADMM algorithm can be set through the $opts() member function of an "ADMM_Lasso" object. The usage of this method is

model$opts(maxit = 10000, eps_abs = 1e-5, eps_rel = 1e-5, rho = NULL)

Here model is the object returned by admm_lasso(). Explanation of the arguments is given below:

maxit: Maximum number of iterations.
eps_abs: Absolute tolerance parameter.
eps_rel: Relative tolerance parameter.
rho: ADMM step size parameter. If set to NULL, the program will compute a default one.

This member function will implicitly return the "ADMM_Lasso" object itself.
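To show what maxit, eps_abs, eps_rel and rho control, here is a minimal base R sketch of textbook ADMM for the Lasso objective at a single λ. It is an illustration only and not the package's internal solver: the names `soft_threshold` and `admm_lasso_sketch` are hypothetical, there is no intercept or standardization, rho is fixed at 1 rather than computed, and the stopping rule is the common primal/dual residual test, which is assumed rather than taken from the package.

```r
# Soft-thresholding operator: the proximal map of the L1 norm.
soft_threshold <- function(v, kappa) sign(v) * pmax(abs(v) - kappa, 0)

# Hypothetical sketch of textbook ADMM for
# 1/(2n) * ||y - X * beta||_2^2 + lambda * ||beta||_1 (no intercept).
admm_lasso_sketch <- function(x, y, lambda, rho = 1, maxit = 10000,
                              eps_abs = 1e-5, eps_rel = 1e-5) {
  n <- nrow(x)
  p <- ncol(x)
  xty <- drop(crossprod(x, y)) / n
  # Factor (X'X / n + rho * I) once; it is reused in every iteration.
  chol_a <- chol(crossprod(x) / n + rho * diag(p))
  z <- numeric(p)  # sparse copy of the coefficients
  u <- numeric(p)  # scaled dual variable
  for (iter in seq_len(maxit)) {
    # beta-update: solve the ridge-like linear system via the Cholesky factor.
    rhs <- xty + rho * (z - u)
    beta <- backsolve(chol_a, forwardsolve(t(chol_a), rhs))
    # z-update: soft-thresholding; u-update: accumulate the constraint gap.
    z_old <- z
    z <- soft_threshold(beta + u, lambda / rho)
    u <- u + beta - z
    # Primal and dual residuals with absolute/relative tolerances.
    r_norm <- sqrt(sum((beta - z)^2))
    s_norm <- rho * sqrt(sum((z - z_old)^2))
    eps_pri <- sqrt(p) * eps_abs +
      eps_rel * max(sqrt(sum(beta^2)), sqrt(sum(z^2)))
    eps_dual <- sqrt(p) * eps_abs + eps_rel * rho * sqrt(sum(u^2))
    if (r_norm <= eps_pri && s_norm <= eps_dual) break
  }
  list(beta = z, niter = iter)
}
```

In this sketch, loosening eps_abs or eps_rel ends the loop earlier at the cost of accuracy, maxit caps the work when the tolerances are not reached, and rho trades off how fast the primal and dual residuals shrink.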

The model is fitted by calling the $fit() member function. There is no argument that needs to be set. The function will return an object of class "ADMM_Lasso_fit", which contains the following fields:

lambda: The sequence of λ values used to build the solution path.
beta: A sparse matrix containing the estimated coefficient vectors, each column for one λ. Intercepts are in the first row.
niter: Number of ADMM iterations.
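Because the intercepts occupy the first row of beta and each column corresponds to one λ, predictions along the whole path can be computed with a single matrix product. The helper below is a hypothetical sketch, not a function provided by the package. It converts beta to a dense matrix with as.matrix(), which is assumed to work for the sparse matrix class returned by $fit() and is only sensible for small problems.

```r
# Hypothetical helper (not part of the ADMM package).
# beta: (p + 1) x nlambda coefficient matrix, intercepts in row 1.
# newx: m x p matrix of new observations.
predict_path_sketch <- function(beta, newx) {
  beta <- as.matrix(beta)             # densify the coefficient matrix
  intercept <- beta[1, ]              # one intercept per lambda
  coefs <- beta[-1, , drop = FALSE]   # p x nlambda slope coefficients
  # m x nlambda predictions: add each lambda's intercept to its column.
  sweep(newx %*% coefs, 2, intercept, `+`)
}
```

Each column of the result holds the fitted values for one λ, in the same order as the lambda field.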

Class "ADMM_Lasso_fit" also contains a $plot() member function, which plots the coefficient paths against the sequence of λ. See the examples below.

Yixuan Qiu <http://statr.me>

set.seed(123)
n = 100
p = 20
b = runif(p)
x = matrix(rnorm(n * p, mean = 1.2, sd = 2), n, p)
y = 5 + c(x %*% b) + rnorm(n)

## Directly fit the model
admm_lasso(x, y)$fit()

## Or, if you want to have more customization:
model = admm_lasso(x, y)
print(model)

## Specify the lambda sequence
model$penalty(nlambda = 20, lambda_min_ratio = 0.01)

## Lower the precision for faster computation
model$opts(maxit = 100, eps_rel = 0.001)

## Use parallel computing (not necessary for this small dataset here)
# model$parallel(nthread = 2)

## Inspect the updated model setting
print(model)

## Fit the model and do the actual computation
res = model$fit()
res$beta

## Create a solution path plot
res$plot()