In Saptati-Datta/AdapLasso: Computes Adaptive Lasso Using Lars Algorithm

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Lasso is one of the most popular techniques to fit linear regression by using a penalty which sets some of the coefficients to be zero, thereby facilitating variable selection and simultaneous estimation.However, there exists certain scenarios where ordinary Lasso turns out to be inconsistent. The main reason for such a discrepancy is that Lasso does not always enjoy oracle properties. In such a case, it is convenient to take resort to Adaptive Lasso technique which satisfies the oracle properties, that is, it has the ability to perform the same asymptotically, as if we knew the true specification of the model beforehand. Such an algorithm was proposed byZou(2006).Suppose \pmb{$\hat{\beta}$} is a root-n consistent estimator of \pmb{$\beta^$} (the initial estimate could be the OLS estimator or the estimate obtained by fitting Ridge regression). By picking a $\gamma >0$, we define a weight vector \textbf{w} by \pmb{$\hat{w}$}$=\frac{1}{|\pmb{\hat{\beta}|}^\gamma}$.Then the adaptive lasso estimate is given by $$\pmb{\hat\beta^(n)}=\arg\min_{\pmb{\beta}} ||y-\sum_{j=1}^p x_{j}\beta_{j}||^2 + \lambda_n \sum_{j=1}^{p} \hat{w_j}|\beta_j| ,$$\ where the tuning parameter $\lambda_n$ varies with n.\ The package can be installed and loaded using the following command.

devtools::install_github("Saptati-Datta/AdapLasso", build_vignettes = TRUE)
library(AdapLasso)

Fitting Adaptive Lasso on "swiss" data in R\

Here we will discuss the uses of several functions in this package using "swiss" dataset.It is an inbuilt dataset in R which consists of information regarding standardized fertility measure and socio-economic indicators for each of 47 French-speaking provinces of Switzerland at about 1888.The daa frame has 47 observations on 6 variables, each of which is in perncent,i.e., in [0, 100].The variable "Catholic" was scaled to [0, 1]. Here, we take fertility measure as our response variable and socio-economic indicators, namely, "Agriculture", "Education", "Catholic", "Infant Mortality" as our independent variables.

Loading the data:

data <- swiss
#Defining the variables
#Matrix of Covariates
X <- as.matrix(cbind(data$Agriculture,data$Education,data$Catholic,data$Infant.Mortality))
#Response vector
Y <- data$Fertility

The function fitadaplasso fits adaptive lasso to a given data. We can fit an adaptive lasso based on the given data in the following way:

fit <- fitadaplasso(X, Y)##All other parameters are free to take default values
##Deriving the accurate lambda sequence
lambda_vec <- fit$tuning_seq
##Plotting the lambda sequence
plot(fit$tuning_seq, colSums(fit$beta_lamb != 0), ylab = "Non-zero beta", xlab = "tuning_seq")
lines(fit$tuning_seq, colSums(fit$beta_lamb != 0))
##Estimated intercept
beta0 <- fit$intercept_vec
##Matrix of estimated parameters 
#Each column corresponding to a particular lambda
betamat <- fit$beta_lamb

However, the most important task that remains is to find the pair $(\lambda, \gamma)$ which minimizes the cross-validation error.If we use $\hat {\beta_{ols}}$ to find the adaptive weights,we would want to find an optimal pair of $(\lambda_n,\gamma)$.In such a case, we use two-dimensional cross-validation(as suggested by Zou)to tune the adaptive lasso.For a given $\gamma$ we can use cross-validation along with the LARS algorithm to exclusively search for the optimal $\lambda_n$.We use the function cv.lambda to find such a $\lambda_n$ and the particular value of $\lambda$ for which the cross-validation is minimum, given a default $\gamma$, as follows:

fit_cv <- cv.lambda(X, Y)
#Finding the optimal tuning_seq
lambda_opt_vec <- fit_cv$tuning_seq
#Finding the desired value of lambda 
lambda.min <- fit_cv$lambda_min

Tuning is very important in practice. Suppose we use the ordinary OLS estimator of \beta,$\hat \beta_{ols}$ to construct the adaptive weights.We then want to find the optimal pair of $(\gamma,\lambda_n)$.We use two dimensional cross-validation to tune the adaptive lasso.According to Zou, we can also replace $\hat\beta_{ols}$ with other consistent estimators.Hence, it can be treated as a third tuning parameters and we can perform three-dimensional cross-validation to find an optimal triple $(\hat\beta,\gamma,\lambda_n)$.\ Here,in order to find the optimal $\gamma$ value, we can perform another cross-validation using the cv.gamma function. We find the pair of ($\lambda,\gamma$) corresponding to which the entry in the cvm matrix is minimum.

fit_gamma <- cv.gamma(X, Y)
#Finding the minimum gamma
gamma <- fit_gamma$gamma_min
gamma
#Finding the minimum lambda
lambda <- fit_gamma$lambda_min
lambda

lambda and gamma as obtained from above gives the desired optimal pair.\ Usually, it is suggested to use $\hat\beta_{ols}$ for the adaptive weights unless collinearity is a concern. Otherwise, it is preferable to use the ridge estimator.In this package, the function scale_X combines both these conditions. It takes the OLS estimator if X has full rank and the ridge estimator if collinearity is detected to accout for the adaptive weights.In order to scale X according to LARS algorithm, we can use the following function:

gamma <- 1
scale_X(X, Y, gamma)

Note that, for the purpose of fitting adaptive lasso using functions like fitadaplasso``,cv.lambda,cv.gamma``` one do not need to scale the inputs as the functions have been built in a way such that they scale the user inputs.\

Fitting adaptive lasso on standardized data\

In order to fit adaptive lasso on standardized data, one may use the functions adaplassostd_lambda,adaplassostdseq_lambda.

lambda <- cv.lambda(X, Y)$lambda_min
lambda_seq <- cv.lambda(X, Y)$tuning_seq
##Scaling and centering X and Y
Xstd <- standardize(X, Y, gamma)$Xstd
Ystd <- standardize(X, Y, gamma)$Ystd
##Fitting on standardized data for one particular pair of values of gamma and lambda
adaplassostd_lambda(Xstd, Ystd , lambda = lambda)

##Fitting on standardized for a fixed gamma and a sequence of lambda values
adaplassostdseq_lambda(Xstd, Ystd, tuning_seq = lambda_seq)