Lasso is a popular variable selection technique in high dimensional regression analysis, which tries to find the coefficient vector β that minimizes
1/(2n) * ||y - X * β||_2^2 + λ * ||β||_1
Here n is the sample size and λ is a regularization parameter that controls the sparseness of β.
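As a minimal sketch, the objective above can be evaluated in base R for any candidate coefficient vector. The helper name `lasso_objective` is hypothetical and not part of the package, and the sketch ignores the intercept and standardization options.

```r
# Hypothetical helper (not part of the package): evaluates
# 1/(2n) * ||y - X * beta||_2^2 + lambda * ||beta||_1
lasso_objective <- function(x, y, beta, lambda) {
  n <- nrow(x)
  resid <- y - drop(x %*% beta)
  sum(resid^2) / (2 * n) + lambda * sum(abs(beta))
}

set.seed(1)
x <- matrix(rnorm(50 * 3), 50, 3)
y <- drop(x %*% c(1, 0, -2)) + rnorm(50)

# With beta = 0 the penalty vanishes and only the scaled sum of squares remains
obj_zero <- lasso_objective(x, y, beta = rep(0, 3), lambda = 0.1)
obj_true <- lasso_objective(x, y, beta = c(1, 0, -2), lambda = 0.1)
```

A larger λ makes nonzero coefficients more expensive, which is what drives the solution toward sparsity.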
This function does not carry out the computation directly;
instead, it returns an object of class "ADMM_Lasso" that contains
several member functions to construct and fit the model.
Member functions that are callable from this object are listed below:
$penalty() | Specify the penalty parameter. See section Setting Penalty Parameter for details. |
$parallel() | Specify the number of threads for parallel computing. See section Parallel Computing for details. |
$opts() | Set additional options. See section Additional Options for details. |
$fit() | Fit the model and do the actual computation. See section Model Fitting for details. |
admm_lasso(x, y, intercept = TRUE, standardize = TRUE, ...)
x | The data matrix. |
y | The response vector. |
intercept | Whether to fit an intercept in the model. Default is TRUE. |
standardize | Whether to standardize the explanatory variables before fitting the model. Default is TRUE. |
The penalty parameter λ can be set through the member function
$penalty(), which takes the parameters described below:
lambda
A user-provided sequence of λ. If set to NULL, the program will
calculate its own sequence according to nlambda and lambda_min_ratio.
This sequence starts from λ_0 (the value of λ at which all coefficients
are zero) and ends at λ_0 * lambda_min_ratio, containing nlambda values
equally spaced on the log scale.
It is recommended to set this parameter to NULL (the default).
nlambda
Number of values in the λ sequence. Only used when the program
calculates its own λ sequence (by setting lambda = NULL).
lambda_min_ratio
Smallest value in the λ sequence as a fraction of λ_0. See the
explanation of the lambda argument. This parameter is only used when
the program calculates its own λ sequence (by setting lambda = NULL).
The default value is the same as glmnet: 0.0001 if nrow(x) >= ncol(x)
and 0.01 otherwise.
This member function will implicitly return the "ADMM_Lasso" object itself.
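The automatic λ sequence described above can be sketched in base R. The helper `lambda_path` is hypothetical, and the formula used for λ_0, max|x'y| / n, is an assumption that matches the objective as written when there is no intercept and no standardization; the package may compute λ_0 on centered or scaled data instead.

```r
# Hypothetical helper (not the package's code): builds a log-spaced
# lambda sequence from lambda0 down to lambda0 * lambda_min_ratio
lambda_path <- function(x, y, nlambda = 20, lambda_min_ratio = NULL) {
  n <- nrow(x)
  p <- ncol(x)
  # Documented default ratio: 0.0001 if nrow(x) >= ncol(x), 0.01 otherwise
  if (is.null(lambda_min_ratio)) {
    lambda_min_ratio <- if (n >= p) 1e-4 else 1e-2
  }
  # Assumption: lambda0 for the unstandardized, no-intercept objective
  lambda0 <- max(abs(crossprod(x, y))) / n
  exp(seq(log(lambda0), log(lambda0 * lambda_min_ratio), length.out = nlambda))
}

set.seed(123)
x <- matrix(rnorm(100 * 20), 100, 20)
y <- drop(x %*% runif(20)) + rnorm(100)
lam <- lambda_path(x, y, nlambda = 20, lambda_min_ratio = 0.01)
```

Spacing the values evenly on the log scale places more of them near the small end of the range, where the set of selected variables tends to change most.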
The Lasso model can be fitted with parallel computing by setting the number
of threads through the $parallel() member function, for example
model$parallel(nthread = 2).
Here model is the object returned by admm_lasso(), and nthread is the
number of threads to be used. nthread must be less than ncol(x) / 5.
NOTE: Even in the serial version of admm_lasso(), most matrix
operations are implicitly parallelized when the proper compiler options are
turned on. Hence the parallel version of admm_lasso() is not
necessarily faster than the serial one.
This member function will implicitly return the "ADMM_Lasso" object itself.
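The constraint on nthread can be checked before calling $parallel(). The following is a small sketch with a hypothetical helper, `pick_nthread`, that returns the largest thread count strictly below ncol(x) / 5, capped by the number of available cores. Falling back to a single thread when no valid count exists is an assumption of this sketch, not documented package behavior.

```r
# Hypothetical helper (not part of the package)
pick_nthread <- function(p, cores = parallel::detectCores()) {
  if (is.na(cores)) cores <- 1L      # detectCores() can return NA
  upper <- ceiling(p / 5) - 1        # largest integer strictly less than p / 5
  max(1L, min(upper, cores))         # fall back to 1 thread (assumption)
}

n_small  <- pick_nthread(20, cores = 8)    # limited by ncol(x) / 5
n_capped <- pick_nthread(100, cores = 4)   # limited by the core count
```

The result could then be passed as model$parallel(nthread = n_small), keeping in mind the note above that the parallel version is not necessarily faster.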
Additional options related to the ADMM algorithm can be set through the
$opts() member function of an "ADMM_Lasso" object. The usage of
this method is
model$opts(maxit = 10000, eps_abs = 1e-5, eps_rel = 1e-5,
           rho = NULL)
Here model is the object returned by admm_lasso().
The arguments are explained below:
maxit
Maximum number of iterations.
eps_abs
Absolute tolerance parameter.
eps_rel
Relative tolerance parameter.
rho
ADMM step size parameter. If set to NULL, the program
will compute a default one.
This member function will implicitly return the "ADMM_Lasso" object itself.
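To show how maxit, eps_abs, eps_rel and rho typically interact, here is a base R sketch of the standard textbook ADMM splitting for the lasso objective. It is not the package's implementation: the function name `admm_lasso_sketch`, the fixed rho = 1, the absence of an intercept and of standardization, and the exact form of the stopping rule are all assumptions made for illustration.

```r
# Soft-thresholding operator: proximal map of kappa * ||.||_1
soft_threshold <- function(v, kappa) sign(v) * pmax(abs(v) - kappa, 0)

# Illustrative ADMM for 1/(2n) * ||y - X b||^2 + lambda * ||b||_1
# (assumed textbook scheme, not the package's internal code)
admm_lasso_sketch <- function(x, y, lambda, maxit = 10000,
                              eps_abs = 1e-5, eps_rel = 1e-5, rho = 1) {
  n <- nrow(x)
  p <- ncol(x)
  xty <- drop(crossprod(x, y)) / n
  # Factor (X'X / n + rho * I) once; it is reused in every iteration
  chol_fac <- chol(crossprod(x) / n + rho * diag(p))
  beta <- z <- u <- numeric(p)
  iter <- 0
  for (iter in seq_len(maxit)) {
    # beta-update: solve the ridge-like linear system
    rhs <- xty + rho * (z - u)
    beta <- backsolve(chol_fac, forwardsolve(t(chol_fac), rhs))
    # z-update: soft-thresholding with threshold lambda / rho
    z_old <- z
    z <- soft_threshold(beta + u, lambda / rho)
    # scaled dual update
    u <- u + beta - z
    # primal and dual residual norms with their tolerances
    r_norm <- sqrt(sum((beta - z)^2))
    s_norm <- rho * sqrt(sum((z - z_old)^2))
    eps_pri <- sqrt(p) * eps_abs +
      eps_rel * max(sqrt(sum(beta^2)), sqrt(sum(z^2)))
    eps_dual <- sqrt(p) * eps_abs + eps_rel * rho * sqrt(sum(u^2))
    if (r_norm < eps_pri && s_norm < eps_dual) break
  }
  list(beta = z, niter = iter)
}

# Check on a design with X'X / n = I, where the lasso solution is known
# in closed form: soft_threshold(X'y / n, lambda)
set.seed(42)
n <- 50
p <- 5
q <- qr.Q(qr(matrix(rnorm(n * p), n, p))) * sqrt(n)
y <- drop(q %*% c(2, -1, 0, 0, 0.5)) + rnorm(n, sd = 0.1)
fit <- admm_lasso_sketch(q, y, lambda = 0.3)
exact <- soft_threshold(drop(crossprod(q, y)) / n, 0.3)
```

In this scheme, loosening eps_abs and eps_rel or lowering maxit stops the loop earlier at the cost of a less precise solution, while rho trades off how quickly the primal and dual residuals shrink.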
The model is fitted by calling the $fit() member function. No
arguments need to be set. The function returns an object of class
"ADMM_Lasso_fit", which contains the following fields:
lambda
The sequence of λ to build the solution path.
beta
A sparse matrix containing the estimated coefficient vectors, each column for one λ. Intercepts are in the first row.
niter
Number of ADMM iterations.
Class "ADMM_Lasso_fit" also contains a $plot() member function,
which plots the coefficient paths against the sequence of λ.
See the examples below.
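The layout of the beta field (one column per λ, intercepts in the first row) can be used as sketched below. The matrix here is a small made-up dense stand-in for res$beta. Since the real field is described as a sparse matrix, converting it with as.matrix() first may be needed, which is an assumption of this sketch.

```r
# Made-up stand-in for res$beta: (p + 1) rows, one column per lambda,
# intercepts in the first row
beta <- cbind(c(5.0, 0.0, 0.0, 0.0),
              c(4.2, 0.3, 0.0, 0.0),
              c(3.1, 0.6, 0.0, 0.4))
lambda <- c(1.0, 0.5, 0.1)

# Number of nonzero slope coefficients for each lambda (intercept excluded)
nonzero <- colSums(beta[-1, , drop = FALSE] != 0)

# Predictions for new data: prepend a column of ones for the intercept
newx <- matrix(c(1, 2, 3,
                 0, 1, 0), nrow = 2, byrow = TRUE)
pred <- cbind(1, newx) %*% beta   # rows = observations, columns = lambdas
```

Each column of pred corresponds to one value in lambda, so a single fit gives predictions along the whole solution path.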
Yixuan Qiu <http://statr.me>
set.seed(123)
n = 100
p = 20
b = runif(p)
x = matrix(rnorm(n * p, mean = 1.2, sd = 2), n, p)
y = 5 + c(x %*% b) + rnorm(n)
## Directly fit the model
admm_lasso(x, y)$fit()
## Or, if you want to have more customization:
model = admm_lasso(x, y)
print(model)
## Specify the lambda sequence
model$penalty(nlambda = 20, lambda_min_ratio = 0.01)
## Lower the precision for faster computation
model$opts(maxit = 100, eps_rel = 0.001)
## Use parallel computing (not necessary for this small dataset here)
# model$parallel(nthread = 2)
## Inspect the updated model setting
print(model)
## Fit the model and do the actual computation
res = model$fit()
res$beta
## Create a solution path plot
res$plot()