knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Lasso is one of the most popular techniques to fit linear regression by using a penalty which sets some of the coefficients to be zero, thereby facilitating variable selection and simultaneous estimation.However, there exists certain scenarios where ordinary Lasso turns out to be inconsistent. The main reason for such a discrepancy is that Lasso does not always enjoy oracle properties. In such a case, it is convenient to take resort to Adaptive Lasso technique which satisfies the oracle properties, that is, it has the ability to perform the same asymptotically, as if we knew the true specification of the model beforehand. Such an algorithm was proposed byZou(2006).Suppose \pmb{$\hat{\beta}$} is a root-n consistent estimator of \pmb{$\beta^$} (the initial estimate could be the OLS estimator or the estimate obtained by fitting Ridge regression). By picking a $\gamma >0$, we define a weight vector \textbf{w} by \pmb{$\hat{w}$}$=\frac{1}{|\pmb{\hat{\beta}|}^\gamma}$.Then the adaptive lasso estimate is given by $$\pmb{\hat\beta^(n)}=\arg\min_{\pmb{\beta}} ||y-\sum_{j=1}^p x_{j}\beta_{j}||^2 + \lambda_n \sum_{j=1}^{p} \hat{w_j}|\beta_j| ,$$\ where the tuning parameter $\lambda_n$ varies with n.\ The package can be installed and loaded using the following command.
devtools::install_github("Saptati-Datta/AdapLasso", build_vignettes = TRUE) library(AdapLasso)
Here we will discuss the uses of several functions in this package using "swiss" dataset.It is an inbuilt dataset in R which consists of information regarding standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.The daa frame has 47 observations on 6 variables, each of which is in perncent,i.e., in [0, 100].The variable "Catholic" was scaled to [0, 1]. Here, we take fertility measure as our response variable and socio-economic indicators, namely, "Agriculture", "Education", "Catholic", "Infant Mortality" as our independent variables.
Loading the data:
data <- swiss #Defining the variables #Matrix of Covariates X <- as.matrix(cbind(data$Agriculture,data$Education,data$Catholic,data$Infant.Mortality)) #Response vector Y <- data$Fertility
The function fitadaplasso
fits adaptive lasso to a given data.
We can fit an adaptive lasso based on the given data in the following way:
fit <- fitadaplasso(X, Y)##All other parameters are free to take default values ##Deriving the accurate lambda sequence lambda_vec <- fit$tuning_seq ##Plotting the lambda sequence plot(fit$tuning_seq, colSums(fit$beta_lamb != 0), ylab = "Non-zero beta", xlab = "tuning_seq") lines(fit$tuning_seq, colSums(fit$beta_lamb != 0)) ##Estimated intercept beta0 <- fit$intercept_vec ##Matrix of estimated parameters #Each column corresponding to a particular lambda betamat <- fit$beta_lamb
However, the most important task that remains is to find the pair $(\lambda, \gamma)$ which minimizes the cross-validation error.If we use $\hat {\beta_{ols}}$ to find the adaptive weights,we would want to find an optimal pair of $(\lambda_n,\gamma)$.In such a case, we use two-dimensional cross-validation(as suggested by Zou)to tune the adaptive lasso.For a given $\gamma$ we can use cross-validation along with the LARS algorithm to exclusively search for the optimal $\lambda_n$.We use the function cv.lambda
to find such a $\lambda_n$ and the particular value of $\lambda$ for which the cross-validation is minimum, given a default $\gamma$, as follows:
fit_cv <- cv.lambda(X, Y) #Finding the optimal tuning_seq lambda_opt_vec <- fit_cv$tuning_seq #Finding the desired value of lambda lambda.min <- fit_cv$lambda_min
Tuning is very important in practice. Suppose we use the ordinary OLS estimator of \beta,$\hat \beta_{ols}$ to construct the adaptive weights.We then want to find the optimal pair of $(\gamma,\lambda_n)$.We use two dimensional cross-validation to tune the adaptive lasso.According to Zou, we can also replace $\hat\beta_{ols}$ with other consistent estimators.Hence, it can be treated as a third tuning parameters and we can perform three-dimensional cross-validation to find an optimal triple $(\hat\beta,\gamma,\lambda_n)$.\
Here,in order to find the optimal $\gamma$ value, we can perform another cross-validation using the cv.gamma
function. We find the pair of ($\lambda,\gamma$) corresponding to which the entry in the cvm
matrix is minimum.
fit_gamma <- cv.gamma(X, Y) #Finding the minimum gamma gamma <- fit_gamma$gamma_min gamma #Finding the minimum lambda lambda <- fit_gamma$lambda_min lambda
lambda
and gamma
as obtained from above gives the desired optimal pair.\
Usually, it is suggested to use $\hat\beta_{ols}$ for the adaptive weights unless collinearity is a concern. Otherwise, it is preferable to use the ridge estimator.In this package, the function scale_X
combines both these conditions. It takes the OLS estimator if X has full rank and the ridge estimator if collinearity is detected to accout for the adaptive weights.In order to scale X according to LARS algorithm, we can use the following function:
gamma <- 1 scale_X(X, Y, gamma)
Note that, for the purpose of fitting adaptive lasso using functions like fitadaplasso``,
cv.lambda,
cv.gamma``` one do not need to scale the inputs as the functions have been built in a way such that they scale the user inputs.\
In order to fit adaptive lasso on standardized data, one may use the functions adaplassostd_lambda
,adaplassostdseq_lambda
.
lambda <- cv.lambda(X, Y)$lambda_min lambda_seq <- cv.lambda(X, Y)$tuning_seq ##Scaling and centering X and Y Xstd <- standardize(X, Y, gamma)$Xstd Ystd <- standardize(X, Y, gamma)$Ystd ##Fitting on standardized data for one particular pair of values of gamma and lambda adaplassostd_lambda(Xstd, Ystd , lambda = lambda) ##Fitting on standardized for a fixed gamma and a sequence of lambda values adaplassostdseq_lambda(Xstd, Ystd, tuning_seq = lambda_seq)
References\ 1.Hui Zou: The Adaptive Lasso and Its Oracle Properties\ 2.Émeline Courtois,corresponding author Pascale Tubert-Bitter, and Ismaïl Ahmed: New adaptive lasso approaches for variable selection in automated pharmacovigilance signal detection
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.