admm_lasso: Fitting A Lasso Model Using ADMM Algorithm

View source: R/30_admm_lasso.R

Description

The lasso is a popular variable selection technique in high-dimensional regression analysis. It seeks the coefficient vector β that minimizes

1/(2n) * ||y - X * β||_2^2 + λ * ||β||_1

Here n is the sample size and λ is a regularization parameter that controls the sparseness of β.
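As a small illustration of the objective above, the following base R sketch evaluates it for a given coefficient vector. The helper name lasso_objective is hypothetical and is not part of the package; it ignores the intercept and any standardization.

```r
# Lasso objective: 1/(2n) * ||y - X b||_2^2 + lambda * ||b||_1
# (hypothetical helper for illustration, not a package function)
lasso_objective <- function(x, y, beta, lambda) {
  n <- nrow(x)
  resid <- y - drop(x %*% beta)
  sum(resid^2) / (2 * n) + lambda * sum(abs(beta))
}

# Tiny worked example with an identity design matrix (n = 2)
x <- matrix(c(1, 0, 0, 1), 2, 2)
y <- c(3, -1)
obj_zero <- lasso_objective(x, y, beta = c(0, 0), lambda = 0.5)  # 10 / 4 = 2.5
obj_fit  <- lasso_objective(x, y, beta = c(2, 0), lambda = 0.5)  # 2 / 4 + 0.5 * 2 = 1.5
c(obj_zero, obj_fit)
```

A larger λ makes the penalty term dominate, which pushes more entries of β to exactly zero.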

This function does not carry out the computation directly. Instead, it returns an object of class "ADMM_Lasso" that contains several member functions used to construct and fit the model.

Member functions that are callable from this object are listed below:

$penalty() Specify the penalty parameter. See section Setting Penalty Parameter for details.
$parallel() Specify the number of threads for parallel computing. See section Parallel Computing for details.
$opts() Set additional options. See section Additional Options for details.
$fit() Fit the model and do the actual computation. See section Model Fitting for details.

Usage

admm_lasso(x, y, intercept = TRUE, standardize = TRUE, ...)

Arguments

x

The data matrix

y

The response vector

intercept

Whether to fit an intercept in the model. Default is TRUE.

standardize

Whether to standardize the explanatory variables before fitting the model. Default is TRUE. Fitted coefficients are always returned on the original scale.
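The statement that coefficients are returned on the original scale implies a back-transformation after fitting on standardized columns. The sketch below shows the usual algebra in base R, using ordinary least squares as a stand-in for the penalized fit. It assumes centering by the column mean and scaling by the sample standard deviation; the package may use a different normalization internally.

```r
set.seed(1)
n <- 50; p <- 3
x <- matrix(rnorm(n * p, mean = 1.2, sd = 2), n, p)
y <- 5 + drop(x %*% c(1, -2, 0.5)) + rnorm(n)

# Standardize columns to mean 0 and standard deviation 1 (assumed convention)
center <- colMeans(x)
scale_ <- apply(x, 2, sd)
x_std <- scale(x, center = center, scale = scale_)

# Stand-in fit on the standardized scale (least squares, not the lasso)
fit_std <- lm.fit(cbind(1, x_std), y)$coefficients
b0_std <- fit_std[1]
b_std <- fit_std[-1]

# Map the coefficients back to the original scale of x
b_orig <- b_std / scale_
b0_orig <- b0_std - sum(center * b_orig)

# The back-transformed coefficients agree with a fit on the raw data
fit_raw <- lm.fit(cbind(1, x), y)$coefficients
all.equal(unname(c(b0_orig, b_orig)), unname(fit_raw))
```

For a penalized fit the same mapping applies to the coefficients, although the solution itself generally differs between standardize = TRUE and standardize = FALSE because the penalty acts on differently scaled columns.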

Setting Penalty Parameter

The penalty parameter λ can be set through the member function $penalty(), with the usage and parameters given below:

    model$penalty(lambda = NULL, nlambda = 100, lambda_min_ratio, ...)

lambda

A user-provided sequence of λ values. If set to NULL, the program calculates its own sequence according to nlambda and lambda_min_ratio. The sequence starts at λ_0 (the value at which all coefficients are zero) and ends at λ_0 * lambda_min_ratio, with nlambda values equally spaced on the log scale. It is recommended to leave this parameter as NULL (the default).

nlambda

Number of values in the λ sequence. Only used when the program calculates its own λ (by setting lambda = NULL).

lambda_min_ratio

Smallest value in the λ sequence as a fraction of λ_0. See the explanation of the lambda argument. This parameter is only used when the program calculates its own λ (by setting lambda = NULL). The default value is the same as glmnet: 0.0001 if nrow(x) >= ncol(x) and 0.01 otherwise.

This member function will implicitly return the "ADMM_Lasso" object itself.
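The automatic λ sequence described in this section can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's internal code: the helper name lambda_sequence is hypothetical, the data are centered to account for the intercept, and standardization is ignored. For the objective in the Description, the smallest λ that sets all coefficients to zero is max|X'y| / n.

```r
# Hypothetical helper: log-spaced lambda sequence from lambda_0 down to
# lambda_0 * lambda_min_ratio
lambda_sequence <- function(x, y, nlambda = 100, lambda_min_ratio = NULL) {
  n <- nrow(x)
  if (is.null(lambda_min_ratio)) {
    # Default documented above (same rule as glmnet)
    lambda_min_ratio <- if (n >= ncol(x)) 1e-4 else 1e-2
  }
  # Center x and y so that the intercept is absorbed (assumption)
  xc <- scale(x, center = TRUE, scale = FALSE)
  yc <- y - mean(y)
  lambda0 <- max(abs(crossprod(xc, yc))) / n
  exp(seq(log(lambda0), log(lambda0 * lambda_min_ratio), length.out = nlambda))
}

set.seed(123)
x <- matrix(rnorm(100 * 20, mean = 1.2, sd = 2), 100, 20)
y <- 5 + drop(x %*% runif(20)) + rnorm(100)
lam <- lambda_sequence(x, y, nlambda = 20, lambda_min_ratio = 0.01)
range(lam)
```

Spacing on the log scale puts more grid points at small λ, where the coefficient paths typically change most quickly.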

Parallel Computing

The Lasso model can be fitted with parallel computing by setting the number of threads in the $parallel() member function. The usage of this method is

    model$parallel(nthread = 2, ...)

Here model is the object returned by admm_lasso(), and nthread is the number of threads to be used. nthread must be less than ncol(x) / 5.

NOTE: Even in the serial version of admm_lasso(), most matrix operations are implicitly parallelized when the proper compiler options are turned on. Hence the parallel version of admm_lasso() is not necessarily faster than the serial one.

This member function will implicitly return the "ADMM_Lasso" object itself.
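The constraint on nthread stated above can be checked before calling $parallel(). The helper below is a hypothetical convenience, not part of the package; it returns the largest integer strictly less than ncol(x) / 5.

```r
# Hypothetical helper: largest nthread satisfying nthread < ncol(x) / 5
max_nthread <- function(x) {
  ceiling(ncol(x) / 5) - 1
}

x <- matrix(0, 100, 20)
max_nthread(x)   # 20 / 5 = 4, so at most 3 threads
```

A result below 2 means the data matrix has too few columns for the parallel version to be useful under this rule.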

Additional Options

Additional options related to the ADMM algorithm can be set through the $opts() member function of an "ADMM_Lasso" object. The usage of this method is

    model$opts(maxit = 10000, eps_abs = 1e-5, eps_rel = 1e-5,
               rho = NULL)

Here model is the object returned by admm_lasso(). Explanation of the arguments is given below:

maxit

Maximum number of iterations.

eps_abs

Absolute tolerance parameter.

eps_rel

Relative tolerance parameter.

rho

ADMM step size parameter. If set to NULL, the program will compute a default one.

This member function will implicitly return the "ADMM_Lasso" object itself.
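To show where maxit, eps_abs, eps_rel and rho enter, here is a minimal base R sketch of the standard textbook ADMM iteration for the lasso objective in the Description. It is not the package's implementation: the function name admm_lasso_sketch is hypothetical, rho is fixed at 1 rather than computed automatically, and there is no intercept or standardization. The stopping rule compares the primal and dual residual norms against tolerances built from eps_abs and eps_rel.

```r
soft_threshold <- function(v, kappa) sign(v) * pmax(abs(v) - kappa, 0)

# Hypothetical sketch of textbook ADMM for the lasso (single lambda)
admm_lasso_sketch <- function(x, y, lambda, rho = 1, maxit = 10000,
                              eps_abs = 1e-5, eps_rel = 1e-5) {
  n <- nrow(x); p <- ncol(x)
  xtx <- crossprod(x) / n
  xty <- drop(crossprod(x, y)) / n
  # Factor (X'X / n + rho * I) once; chol() returns the upper factor R
  chol_fac <- chol(xtx + rho * diag(p))
  beta <- z <- u <- numeric(p)
  for (iter in seq_len(maxit)) {
    # beta-update: solve (X'X / n + rho * I) beta = X'y / n + rho * (z - u)
    rhs <- xty + rho * (z - u)
    beta <- backsolve(chol_fac, forwardsolve(t(chol_fac), rhs))
    # z-update: soft-thresholding yields exact zeros
    z_old <- z
    z <- soft_threshold(beta + u, lambda / rho)
    # Scaled dual variable update
    u <- u + beta - z
    # Primal and dual residual norms with their tolerances
    r_norm <- sqrt(sum((beta - z)^2))
    s_norm <- rho * sqrt(sum((z - z_old)^2))
    eps_pri <- sqrt(p) * eps_abs +
      eps_rel * max(sqrt(sum(beta^2)), sqrt(sum(z^2)))
    eps_dual <- sqrt(p) * eps_abs + eps_rel * rho * sqrt(sum(u^2))
    if (r_norm <= eps_pri && s_norm <= eps_dual) break
  }
  list(beta = z, niter = iter)
}

set.seed(123)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- drop(x %*% c(3, -2, rep(0, p - 2))) + rnorm(n)
fit <- admm_lasso_sketch(x, y, lambda = 0.1)
round(fit$beta, 3)
fit$niter
```

In this scheme, looser tolerances or a smaller maxit end the loop earlier at the cost of accuracy, and rho trades off how quickly the primal and dual residuals shrink.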

Model Fitting

The model is fitted by calling the $fit() member function. There is no argument that needs to be set. The function returns an object of class "ADMM_Lasso_fit", which contains the following fields:

lambda

The sequence of λ values used to build the solution path.

beta

A sparse matrix containing the estimated coefficient vectors, each column for one λ. Intercepts are in the first row.

niter

Number of ADMM iterations.

Class "ADMM_Lasso_fit" also contains a $plot() member function, which plots the coefficient paths with the sequence of λ. See the examples below.
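Given the layout of beta described above (one column per λ, intercepts in the first row), predictions for a chosen λ can be formed as sketched below. The matrix here is a dense mock with made-up values, used only so the snippet runs on its own; a real fit returns a sparse matrix, which may need as.matrix() or the Matrix package for the same indexing.

```r
# Mock fit with the documented layout (values are invented for illustration)
lambda <- c(1.0, 0.5, 0.1)
beta <- rbind(
  c(5.0, 4.8, 4.5),   # first row: intercepts
  c(0.0, 0.3, 0.6),   # coefficient of the first predictor
  c(0.0, 0.0, 0.2)    # coefficient of the second predictor
)

# Coefficients for the second lambda in the path
k <- 2
intercept <- beta[1, k]
coefs <- beta[-1, k]

# Predictions for two new observations
x_new <- matrix(c(1, 2,
                  3, 4), 2, 2, byrow = TRUE)
pred <- intercept + drop(x_new %*% coefs)
pred   # 4.8 + 0.3 * c(1, 3) = 5.1, 5.7

# Number of nonzero coefficients (excluding the intercept) at each lambda
nnz <- colSums(beta[-1, , drop = FALSE] != 0)
nnz    # 0 1 2
```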

Author(s)

Yixuan Qiu <http://statr.me>

Examples

set.seed(123)
n = 100
p = 20
b = runif(p)
x = matrix(rnorm(n * p, mean = 1.2, sd = 2), n, p)
y = 5 + c(x %*% b) + rnorm(n)

## Directly fit the model
admm_lasso(x, y)$fit()

## Or, if you want to have more customization:
model = admm_lasso(x, y)
print(model)

## Specify the lambda sequence
model$penalty(nlambda = 20, lambda_min_ratio = 0.01)

## Lower the precision for faster computation
model$opts(maxit = 100, eps_rel = 0.001)

## Use parallel computing (not necessary for this small dataset here)
# model$parallel(nthread = 2)

## Inspect the updated model setting
print(model)

## Fit the model and do the actual computation
res = model$fit()
res$beta

## Create a solution path plot
res$plot()

yixuan/ADMM documentation built on May 4, 2019, 5:28 p.m.