This function takes user-defined inputs and computes the optimum γ and λ for an Adaptive Lasso model. It is flexible and supports many settings, such as repeated error curves, different weighting methods and a user-defined γ grid. It also fully supports multi-core parallelisation. The main fitting routine is cv.glmnet() from the glmnet package.
data |
A well-cleaned |
x.indices |
The column positions of the predictors that you would like to model with. Please provide a vector of positions, e.g. seq(2,6). |
response |
The location of the response within the |
err.curves |
Because the cross-validation process is random, the result can vary quite a bit when no seed is set. To stabilise the CV process, the function fits a collection of Adaptive Lasso models, each with a different γ value, multiple times (note that only γ differs; the coefficients used for building the weights are the same). For EACH γ, this produces multiple error curves over a range of λs, and the local optimum pair (γ, λ) is the pair with the lowest averaged error curve value. There is one local optimum pair for each γ, and the global optimum pair is the one with the overall lowest averaged error curve value. With this setting the process tends to be slow, so multi-core parallelisation is highly recommended. Set this argument to 0 if you do not wish to stabilise the process, in which case the seed (1234567) will be used for the CV process. A positive integer indicates that the stabilisation process is desired. For more information about how this works, see the Details section below. |
weight.method |
The method used to generate the initial set of coefficients, which are then used to construct the initial weights for the Adaptive Lasso. In the current version, two methods are supplied: "OLS" or "ridge". The default is "OLS". |
gamma.seq |
A definable γ grid with range from 0 to ∞. Default is |
type.lambda |
Either "lambda.min" or "lambda.1se". Default is "lambda.min". Note when |
B.rep |
The number of residual bootstrap replicates used to build the confidence intervals for the parameters. Default is 500. |
significance |
The significance level α of the 100(1-α)% confidence intervals. Default is 0.05. |
interactive |
When you call this function yourself, please ALWAYS keep this argument set to FALSE, which is the default. |
parallel |
Parallelisation is supported; the default is FALSE. |
A list with elements:
seed |
if |
number of err.curves |
if |
best gamma |
if if |
best lambda |
The λ component of the global optimum pair (γ, λ). For more information on how γ and λ are selected, see the Details section. |
prediction error |
if if if if |
prediction_lower |
The lower bound for the prediction error. For more information see details below. |
prediction_upper |
The upper bound for the prediction error. For more information see details below. |
ridge lambda |
if |
%null deviance explained |
This can be seen as an indicator of goodness of fit. |
CIs |
The 100(1-α)% confidence intervals for the parameters. The confidence intervals are constructed by using residual bootstrapping.
The α level can be defined by the user. Please note that a CI of (. , .) means the algorithm failed to estimate
a valid CI for the corresponding coefficient. However, the proportion of non-zero estimates out of B.rep bootstraps will also be given, and thus, the user can
still gain some insight. |
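The residual bootstrap behind the CIs entry can be illustrated with a minimal sketch. It is not the package's internal code: the simulated data, the weight formula 1/|β|^γ with γ = 1, the percentile interval and all variable names are assumptions made for illustration, and how the function decides that a CI is invalid is not covered here.

```r
library(glmnet)

set.seed(1234567)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)                       # simulated predictors (assumption)
y <- as.numeric(x %*% c(2, 0, -1.5, 0, 0) + rnorm(n)) # simulated response (assumption)

# Adaptive Lasso fit with OLS-based weights (assumed form: 1 / |beta|)
w <- 1 / abs(coef(lm(y ~ x))[-1])
cv_fit <- cv.glmnet(x, y, penalty.factor = w)
lam <- cv_fit$lambda.min

# Centred residuals from the fitted model
fitted_y <- as.numeric(predict(cv_fit, newx = x, s = lam))
res <- y - fitted_y
res <- res - mean(res)

B_rep <- 200   # fewer replicates than the default of 500, to keep the sketch fast
alpha <- 0.05

# Each replicate: resample residuals, rebuild the response, refit at the same lambda
boot_coef <- replicate(B_rep, {
  y_star <- fitted_y + sample(res, n, replace = TRUE)
  as.numeric(coef(glmnet(x, y_star, penalty.factor = w), s = lam))[-1]
})

# Percentile intervals and the share of non-zero estimates per coefficient
ci <- t(apply(boot_coef, 1, quantile, probs = c(alpha / 2, 1 - alpha / 2)))
nonzero_prop <- rowMeans(boot_coef != 0)
```

Here ci has one row per predictor, and nonzero_prop plays the role of the proportion of non-zero estimates out of B.rep bootstraps mentioned above.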
This function builds on the cv.glmnet() function from the glmnet package to allow for more flexibility. The glmnet package itself does not directly support fitting Adaptive Lasso models. This function wraps the main fitting function cv.glmnet() and thus provides a direct way to fit the Adaptive Lasso model. In this version of the package, the function offers two methods for constructing the initial weights, "OLS" and "ridge". If the "OLS" method is selected, an lm() object is fitted and the coefficients from the fit (except for the intercept) are used to create the initial weights. If the "ridge" method is selected, a cv.glmnet(..., alpha = 0) object is fitted and the corresponding coefficients (except for the intercept) are used to build the weights.
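As a rough illustration of the "OLS" route described above, the sketch below fits an Adaptive Lasso directly with glmnet by passing per-coefficient weights through the penalty.factor argument of cv.glmnet(). The simulated data and the weight formula 1/|β|^γ are assumptions (the latter is the usual Adaptive Lasso choice); the text does not state the exact formula this function uses.

```r
library(glmnet)

set.seed(1234567)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)                       # simulated predictors (assumption)
y <- as.numeric(x %*% c(2, 0, -1.5, 0, 0) + rnorm(n)) # simulated response (assumption)

gamma <- 1

# Initial coefficients from an lm() fit, dropping the intercept
ols_coef <- coef(lm(y ~ x))[-1]

# Assumed weight form: large initial coefficients receive a small penalty
w <- 1 / abs(ols_coef)^gamma

# Lasso (alpha = 1) with coefficient-specific penalties
fit <- cv.glmnet(x, y, alpha = 1, penalty.factor = w)
coef(fit, s = "lambda.min")
```

For the "ridge" route, ols_coef would instead come from a cv.glmnet(..., alpha = 0) fit.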
The function also offers an alternative way to compute the global optimum (γ, λ) pair: averaging across error curves instead of using a fixed seed. More specifically, for the "OLS" method, after the coefficients have been obtained from an lm() fit and converted into the initial weights, the stabilisation process takes a double-loop structure in which the outer loop runs over the γ grid and the inner loop builds multiple Adaptive Lasso models for each γ. In this way, for each γ, we create, say, B Adaptive Lasso models, which results in B error curves over a range of λs. Note that the initial coefficients used to build the weights are the same throughout this stabilisation process. For each γ, the function then finds the local optimum pair (γ, λ) by averaging across these error curves and picking the λ with the lowest averaged cross-validation error. After the function has found all the local optimum pairs, the global optimum pair is the one with the overall lowest averaged cross-validation error. From experience, for medium-sized datasets with err.curves larger than 1500, the global optimum (γ, λ) will usually converge to stable values that consistently achieve the overall lowest averaged error curve value. This is a 2D stabilisation process.
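The double loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the function's source: the data are simulated, the weights are taken as 1/|β|^γ, B is kept tiny, and a fixed λ sequence (lambda_seq, a hypothetical name) is supplied so that the error curves line up point by point and can be averaged.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)                       # simulated predictors (assumption)
y <- as.numeric(x %*% c(2, 0, -1.5, 0, 0) + rnorm(n)) # simulated response (assumption)

abs_coef   <- abs(coef(lm(y ~ x))[-1])   # initial coefficients, fixed throughout
gamma_grid <- c(0.5, 1, 2)               # outer loop: the gamma grid
B          <- 5                          # inner loop: error curves per gamma
lambda_seq <- exp(seq(log(2), log(0.001), length.out = 50))  # shared, decreasing

local_best <- do.call(rbind, lapply(gamma_grid, function(g) {
  w <- 1 / abs_coef^g
  # Each cv.glmnet() call draws new random folds, giving a different error curve
  curves <- sapply(seq_len(B), function(b) {
    cv.glmnet(x, y, penalty.factor = w, lambda = lambda_seq)$cvm
  })
  avg <- rowMeans(curves)                # averaged error curve for this gamma
  i <- which.min(avg)
  data.frame(gamma = g, lambda = lambda_seq[i], cv_error = avg[i])
}))

# Global optimum: the local optimum with the lowest averaged CV error
global_best <- local_best[which.min(local_best$cv_error), ]
global_best
```

Sharing one λ sequence across repetitions is a design choice of this sketch; without it, each cv.glmnet() call would still use the same default path for a given γ, but supplying it explicitly makes the pointwise averaging unambiguous.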
When the "ridge" method is selected, the initial ridge coefficients are obtained by a stabilisation process that averages across the error curves (the first stabilisation). After the coefficients have been obtained and converted into the initial weights, a 2-dimensional stabilisation process similar to that of the "OLS" method above takes place. Thus, when the "ridge" method is selected, we are stabilising a 3-dimensional process, with the first dimension being the recovery of the ridge coefficients.
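The first stabilisation step for the "ridge" method might look like the sketch below: several ridge cross-validation curves are averaged, and the coefficients at the λ with the lowest averaged error become the initial coefficients. The simulated data, the λ grid (ridge_lambda_seq) and the number of curves are assumptions for illustration only.

```r
library(glmnet)

set.seed(1)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)                       # simulated predictors (assumption)
y <- as.numeric(x %*% c(2, 0, -1.5, 0, 0) + rnorm(n)) # simulated response (assumption)

B <- 5
ridge_lambda_seq <- exp(seq(log(100), log(0.01), length.out = 50))  # shared, decreasing

# Average B ridge (alpha = 0) error curves over the shared lambda grid
ridge_curves <- sapply(seq_len(B), function(b) {
  cv.glmnet(x, y, alpha = 0, lambda = ridge_lambda_seq)$cvm
})
ridge_lambda <- ridge_lambda_seq[which.min(rowMeans(ridge_curves))]

# Ridge coefficients at the stabilised lambda, intercept dropped
ridge_fit  <- glmnet(x, y, alpha = 0, lambda = ridge_lambda_seq)
ridge_coef <- as.numeric(coef(ridge_fit, s = ridge_lambda))[-1]

# These feed the weights exactly as the OLS coefficients would (assumed form, gamma = 1)
w <- 1 / abs(ridge_coef)
```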
When err.curves > 0, the 95 percent confidence interval for the prediction error (the overall lowest averaged error curve value) is generated as follows: from the error curves for the global optimum γ, the cross-validation scores corresponding to the global optimum (γ, λ) are extracted, and quantile() is then used to compute the 95 percent confidence interval. When the process is not stabilised, i.e. err.curves = 0, the (γ, λ) pair is computed with the seed (1234567) and the associated CI is computed from the standard error provided by the glmnet package, assuming normality.
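Both interval constructions can be sketched for a single γ. The simulated data, weights, λ grid and variable names are assumptions; the normal interval uses the cvm and cvsd components that cv.glmnet() returns.

```r
library(glmnet)

set.seed(1234567)
n <- 100; p <- 5
x <- matrix(rnorm(n * p), n, p)                       # simulated predictors (assumption)
y <- as.numeric(x %*% c(2, 0, -1.5, 0, 0) + rnorm(n)) # simulated response (assumption)
w <- 1 / abs(coef(lm(y ~ x))[-1])                     # assumed weights, gamma = 1
lambda_seq <- exp(seq(log(2), log(0.001), length.out = 50))

# Stabilised case: B error curves, then quantiles of the CV scores at the best lambda
B <- 20
curves <- sapply(seq_len(B), function(b) {
  cv.glmnet(x, y, penalty.factor = w, lambda = lambda_seq)$cvm
})
best <- which.min(rowMeans(curves))
ci_quantile <- quantile(curves[best, ], probs = c(0.025, 0.975))

# Unstabilised case: one seeded fit, normal interval from glmnet's standard error
set.seed(1234567)
fit <- cv.glmnet(x, y, penalty.factor = w)
i <- which.min(fit$cvm)                  # position of lambda.min on the path
ci_normal <- fit$cvm[i] + c(-1, 1) * qnorm(0.975) * fit$cvsd[i]
```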
Mokyo Zhou
library(glmnet)
data(QuickStartExample)
# Please NOTE: you can access "QuickStartExample" by using data.frame(y, x).

# No error curves, weight.method = "OLS", gamma.seq = c(0.5, 1, 2), type.lambda = "lambda.min"
result <- Adlasso(data = data.frame(y, x), x.indices = seq(2, 21), response = 1, err.curves = 0,
                  weight.method = "OLS", gamma.seq = c(0.5, 1, 2), type.lambda = "lambda.min")

# 100 error curves, weight.method = "OLS", gamma.seq = seq(0, 5), 2-core parallel processing
# cl <- parallel::makeCluster(2)
# doParallel::registerDoParallel(cl)
result <- Adlasso(data = data.frame(y, x), x.indices = seq(2, 21), response = 1, err.curves = 100,
                  weight.method = "OLS", gamma.seq = seq(0, 5), parallel = TRUE)

# No error curves, weight.method = "ridge", gamma.seq = seq(1, 10), type.lambda = "lambda.1se"
result <- Adlasso(data = data.frame(y, x), x.indices = seq(2, 21), response = 1, err.curves = 0,
                  weight.method = "ridge", gamma.seq = seq(1, 10), type.lambda = "lambda.1se")

# 80 error curves, weight.method = "ridge", gamma.seq = c(0, 0.5, 1, 2, 2.5), with parallel (2 cores)
# cl <- parallel::makeCluster(2)
# doParallel::registerDoParallel(cl)
result <- Adlasso(data = data.frame(y, x), x.indices = seq(2, 21), response = 1, err.curves = 80,
                  weight.method = "ridge", gamma.seq = c(0, 0.5, 1, 2, 2.5), parallel = TRUE)