# msgps: Model Selection Criteria via Generalized Path Seeking

## Description

This package computes the degrees of freedom of the lasso, elastic net, generalized elastic net and adaptive lasso based on the generalized path seeking algorithm. The optimal model can be selected by model selection criteria including Mallows' Cp, bias-corrected AIC (AICc), generalized cross validation (GCV) and BIC.
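
To make the role of the degrees of freedom concrete, the sketch below evaluates the four criteria from a residual sum of squares, a degrees-of-freedom value, and the sample size. The formulas are common textbook forms and are an assumption here; the package may use different constants or scalings. The function name `selection_criteria` and its arguments are hypothetical and are not part of msgps.

```r
# Textbook forms of the four criteria (assumed; msgps may use different constants).
# rss: residual sum of squares, df: degrees of freedom, n: sample size,
# tau2: error-variance estimate used by Mallows' Cp.
selection_criteria <- function(rss, df, n, tau2) {
  sigma2_hat <- rss / n                           # ML estimate of the error variance
  neg2loglik <- n * log(2 * pi * sigma2_hat) + n  # -2 * maximized Gaussian log-likelihood
  c(
    Cp   = rss + 2 * tau2 * df,
    AICc = neg2loglik + 2 * n * (df + 1) / (n - df - 2),
    GCV  = sigma2_hat / (1 - df / n)^2,
    BIC  = neg2loglik + log(n) * df
  )
}

# Toy numbers, chosen only for illustration.
crit <- selection_criteria(rss = 100, df = 3, n = 50, tau2 = 2)
print(crit)
```

Each criterion trades goodness of fit against the degrees of freedom, so along a solution path the selected model is the one that minimizes the chosen criterion.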

## Usage

    msgps(X, y, penalty = "enet", alpha = 0, gamma = 1, lambda = 0.001, tau2,
          STEP = 20000, STEP.max = 200000, DFtype = "MODIFIED", p.max = 300,
          intercept = TRUE, stand.coef = FALSE)

## Arguments

- `X`: Predictor matrix.
- `y`: Response vector.
- `penalty`: The penalty term.
  - `"enet"` is the elastic net: (α/2)||β||_2^2 + (1-α)||β||_1. Note that `alpha=0` gives the lasso penalty.
  - `"genet"` is the generalized elastic net: log(α + (1-α)||β||_1).
  - `"alasso"` is the adaptive lasso, a weighted version of the lasso given by Σ_i w_i|β_i|, where w_i = 1/|\hat{β}_i|^γ. Here γ > 0 is a tuning parameter, and \hat{β}_i is the ridge estimate with regularization parameter λ ≥ 0.
- `alpha`: The value of α in the `"enet"` and `"genet"` penalties.
- `gamma`: The value of γ in the `"alasso"` penalty.
- `lambda`: The value of the regularization parameter λ ≥ 0 for ridge regression, which is used to calculate the weight vector of the `"alasso"` penalty. With `lambda=0`, the ridge estimates reduce to the ordinary least squares estimates.
- `tau2`: Estimator of the error variance for Mallows' Cp. The default is the unbiased estimator of the error variance of the most complex model. When that estimator is not available (e.g., the number of variables exceeds the number of samples), `tau2` is the variance of the response vector.
- `STEP`: The approximate number of steps.
- `STEP.max`: The actual number of steps can often exceed `STEP`. The algorithm stops when the number of steps exceeds `STEP.max`.
- `DFtype`: `"MODIFIED"` or `"NAIVE"`. The `"MODIFIED"` update is much more efficient than the `"NAIVE"` update.
- `p.max`: If the number of selected variables exceeds `p.max`, the algorithm stops.
- `intercept`: If `TRUE`, the intercept is included in the results.
- `stand.coef`: If `TRUE`, the standardized coefficients are displayed.
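
The penalties described above can be written out directly in base R. The following is a minimal sketch, not code from the package: the helper names `enet_penalty`, `genet_penalty`, `alasso_weights` and `alasso_penalty` are hypothetical, and the ridge fit assumes the raw `X` with no intercept or standardization, which may differ from what msgps does internally.

```r
# Penalties as written in the argument descriptions (illustrative helpers, not msgps functions).
enet_penalty <- function(beta, alpha) {
  alpha / 2 * sum(beta^2) + (1 - alpha) * sum(abs(beta))
}
genet_penalty <- function(beta, alpha) {
  log(alpha + (1 - alpha) * sum(abs(beta)))
}

# Adaptive lasso weights from a ridge fit: w_i = 1 / |beta_hat_i|^gamma.
# Assumption: plain ridge on the raw X, no intercept, no standardization.
alasso_weights <- function(X, y, gamma = 1, lambda = 0.001) {
  p <- ncol(X)
  beta_ridge <- solve(crossprod(X) + lambda * diag(p), crossprod(X, y))
  drop(1 / abs(beta_ridge)^gamma)
}
alasso_penalty <- function(beta, w) sum(w * abs(beta))

beta <- c(3, -4)
enet_lasso <- enet_penalty(beta, alpha = 0)    # alpha = 0 reduces to ||beta||_1
enet_ridge <- enet_penalty(beta, alpha = 1)    # alpha = 1 leaves only the squared L2 term
genet_half <- genet_penalty(beta, alpha = 0.5)

# Toy design where the least-squares solution (lambda = 0) is easy to verify by hand.
X_toy <- diag(c(1, 2))
y_toy <- c(2, 2)
w <- alasso_weights(X_toy, y_toy, gamma = 1, lambda = 0)
alasso_value <- alasso_penalty(beta, w)
```

Coefficients with small ridge estimates receive large weights and are therefore penalized more heavily, which is what distinguishes the adaptive lasso from the plain lasso.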

## Author(s)

Kei Hirose

## References

Friedman, J. (2008). Fast sparse regression and classification. Technical report, Stanford University.

Hirose, K., Tateishi, S. and Konishi, S. (2011). Efficient algorithm to select tuning parameters in sparse regression modeling with regularization. arXiv:1109.2411.

## See Also

coef.msgps, plot.msgps, predict.msgps and summary.msgps objects.

## Examples

    library(msgps)

    # data
    X <- matrix(rnorm(100*8), 100, 8)
    beta0 <- c(3, 1.5, 0, 0, 2, 0, 0, 0)
    epsilon <- rnorm(100, sd = 3)
    y <- X %*% beta0 + epsilon
    y <- c(y)

    # lasso
    fit <- msgps(X, y)
    summary(fit)
    coef(fit)                                # extract coefficients at t selected by model selection criteria
    coef(fit, c(0, 0.5, 2.5))                # extract coefficients at some values of t
    predict(fit, X[1:10, ])                  # predict values at t selected by model selection criteria
    predict(fit, X[1:10, ], c(0, 0.5, 2.5))  # predict values at some values of t
    plot(fit, criterion = "cp")              # plot the solution path with a model selected by Cp criterion

    # elastic net
    fit2 <- msgps(X, y, penalty = "enet", alpha = 0.5)
    summary(fit2)

    # generalized elastic net
    fit3 <- msgps(X, y, penalty = "genet", alpha = 0.5)
    summary(fit3)

    # adaptive lasso
    fit4 <- msgps(X, y, penalty = "alasso", gamma = 1, lambda = 0)
    summary(fit4)
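
The examples above rely on the default `tau2`. As a rough sketch of supplying it explicitly, the code below computes the unbiased error-variance estimate from the full least-squares fit and passes it through the documented `tau2` argument. It assumes that "the most complex model" means the least-squares fit on all predictors with an intercept; the names `fit_ols`, `tau2_hat` and `fit_cp` and the seed are illustrative only.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
X <- matrix(rnorm(100 * 8), 100, 8)
beta0 <- c(3, 1.5, 0, 0, 2, 0, 0, 0)
y <- c(X %*% beta0 + rnorm(100, sd = 3))

# Unbiased error-variance estimate from the full least-squares model:
# residual sum of squares divided by the residual degrees of freedom (n - p - 1).
fit_ols <- lm(y ~ X)
tau2_hat <- sum(resid(fit_ols)^2) / df.residual(fit_ols)

# Call msgps only when the package is installed.
if (requireNamespace("msgps", quietly = TRUE)) {
  fit_cp <- msgps::msgps(X, y, tau2 = tau2_hat)
  print(summary(fit_cp))
}
```

Passing `tau2` by hand is mainly useful when the number of variables exceeds the number of samples and a better variance estimate than the variance of the response is available.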