Tuning parameter selection by k-fold cross validation for logistic models with Lasso-concave hybrid penalty

Description

Uses the k-fold cross-validated area under the ROC curve (CV-AUC) to select the tuning parameter for a high-dimensional logistic model with a Lasso-concave hybrid penalty.

Usage

cv.hybrid(y, x, penalty = "mcp", nfold = 5,
kappa = 1/2.7, nlambda = 100, lambda.min = 0.01,
epsilon = 1e-3, maxit = 1e+3, seed = 1000)

Arguments

y

response vector with elements 0 or 1.

x

the design matrix of penalized variables. By default, an intercept vector will be added when fitting the model.

penalty

a character specifying the penalty. One of "mcp" or "scad" should be specified, with "mcp" being the default.

nfold

an integer value specifying the number of folds k used in k-fold cross validation.

kappa

a value specifying the regularization parameter kappa. The valid range for kappa is [0, 1).

nlambda

an integer value specifying the number of grid points along the penalty parameter lambda.

lambda.min

a value that determines the smallest value of the penalty parameter lambda, defined as lambda_min = lambda_max * lambda.min. We suggest lambda.min = 0.0001 if n > p, and 0.01 otherwise.

epsilon

a value specifying the convergence criterion of the algorithm.

maxit

an integer value specifying the maximum number of iterations for each coordinate.

seed

randomization seed for cross validation.
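
The relationship between nlambda, lambda.min and the lambda grid can be sketched as follows. This is only an illustration: `lambda_grid` is a hypothetical helper, lambda_max is supplied by hand, and the log-scale spacing is an assumption, since this page only states that lambda_min = lambda_max * lambda.min.

```r
# Hypothetical helper, not part of the package: build a decreasing lambda grid.
# Assumption: grid points are equally spaced on the log scale.
lambda_grid <- function(lambda_max, lambda.min = 0.01, nlambda = 100) {
  lambda_min <- lambda_max * lambda.min   # definition given for lambda.min
  exp(seq(log(lambda_max), log(lambda_min), length.out = nlambda))
}

# lambda_max = 2 is an arbitrary illustrative value.
grid <- lambda_grid(lambda_max = 2, lambda.min = 0.01, nlambda = 100)
range(grid)
```

A smaller lambda.min stretches the grid toward weaker penalization, which is why a smaller value is suggested when n > p.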

Details

A Lasso-concave hybrid penalty applies the SCAD or MCP penalty only to the variables selected by Lasso. The idea is to use Lasso as a screening tool to filter variables, then apply the SCAD or MCP penalty to the variables that survive the screen for further selection. Computation with the hybrid penalty is faster than with the standard concave penalty. The risk of using the hybrid penalty is that a variable missed by the Lasso penalty cannot be selected by the SCAD/MCP penalty either.
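
The bookkeeping behind the two-stage idea can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's algorithm: `beta_lasso` is a made-up stage-one result, and an unpenalized `glm()` stands in for the SCAD/MCP-penalized fit of stage two.

```r
set.seed(1)
n <- 50; p <- 6
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, 0.5)

# Hypothetical stage-one result: Lasso coefficients (intercept excluded).
beta_lasso <- c(0.8, 0, 0, -0.3, 0, 0)

# Stage one acts as a screen: keep only columns with nonzero Lasso coefficients.
active <- which(beta_lasso != 0)
x_screened <- x[, active, drop = FALSE]

# Stage two: placeholder fit on the screened columns only.
# Assumption: glm() is a stand-in for the concave-penalized fit.
fit <- glm(y ~ x_screened, family = binomial())

# Map stage-two coefficients back to full length; screened-out variables stay 0.
beta_hybrid <- numeric(p)
beta_hybrid[active] <- coef(fit)[-1]
beta_hybrid
```

Stage two works on fewer columns, which is where the speed gain comes from. Entries of `beta_hybrid` outside `active` can never become nonzero, which is the risk described above.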

The CV-AUC approach is also used to select the tuning parameter for models with the Lasso-concave hybrid penalty.
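
A rough sketch of how a k-fold cross-validated AUC can be computed is given below. The helpers `auc` and `cv_auc` are hypothetical, an unpenalized `glm()` stands in for the penalized fit at a single lambda, and averaging the per-fold AUCs is an assumption about how the folds are combined. In the selection setting described here, such a value would be computed for each lambda on the grid and compared across lambdas.

```r
# AUC via the rank (Mann-Whitney) formula; midranks handle tied scores.
auc <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Hypothetical helper: average test-fold AUC over nfold folds.
cv_auc <- function(y, x, nfold = 5, seed = 1000) {
  set.seed(seed)
  fold <- sample(rep(seq_len(nfold), length.out = length(y)))
  dat  <- data.frame(y = y, x)
  aucs <- sapply(seq_len(nfold), function(k) {
    test  <- fold == k
    # Assumption: glm() replaces the penalized fit for illustration only.
    fit   <- glm(y ~ ., data = dat[!test, ], family = binomial())
    score <- predict(fit, newdata = dat[test, ], type = "link")
    auc(y[test], score)
  })
  mean(aucs)
}

set.seed(10000)
n <- 100; p <- 3
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1]))
cv_value <- cv_auc(y, x, nfold = 5)
```

The sketch assumes every test fold contains both classes; otherwise the AUC for that fold is undefined.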

Value

A list of three elements is returned.

scvauc

the CV-AUC corresponding to the selected lambda.

slambda

the selected lambda.

scoef

the regression coefficients corresponding to the selected lambda, with the first element being the intercept.
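
Because the first element of scoef is the intercept, predicted probabilities for new observations can be obtained by prepending a column of ones to the design matrix. The sketch below uses made-up numbers in place of a real cv.hybrid result; `scoef` and `x_new` here are illustrative values only.

```r
# Hypothetical coefficient vector laid out like scoef:
# intercept first, then one coefficient per column of x.
scoef <- c(-0.5, 1.0, 0, -2.0)

# Two new observations with three predictors each (illustrative values).
x_new <- matrix(c(0.2, 1.5, -0.3,
                  1.0, 0.0,  0.4), nrow = 2, byrow = TRUE)

# Linear predictor and predicted probability under the logistic model.
eta  <- drop(cbind(1, x_new) %*% scoef)
prob <- plogis(eta)
prob
```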

Author(s)

Dingfeng Jiang

References

Dingfeng Jiang, Jian Huang. Majorization Minimization by Coordinate Descent for Concave Penalized Generalized Linear Models.

Zou, H., Li, R. (2008). One-step Sparse Estimates in Nonconcave Penalized Likelihood Models. Ann Stat, 36(4), 1509-1533.

Breheny, P., Huang, J. (2011). Coordinate Descent Algorithms for Nonconvex Penalized Regression, with Application to Biological Feature Selection. Ann Appl Stat, 5(1), 232-253.

Jiang, D., Huang, J., Zhang, Y. (2011). The Cross-validated AUC for MCP-Logistic Regression with High-dimensional Data. Stat Methods Med Res, online first, Nov 28, 2011.

See Also

cvplogistic, hybrid.logistic, cv.cvplogistic, path.plot

Examples

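library(cvplogistic)

## Simulate a binary response and a design matrix of pure-noise predictors
set.seed(10000)
n <- 100
y <- rbinom(n, 1, 0.4)
p <- 10
x <- matrix(rnorm(n * p), n, p)

## Lasso-concave hybrid using MCP penalty
out <- cv.hybrid(y, x, "mcp")
## Selected lambda, its CV-AUC, and the coefficients (intercept first)
out$slambda
out$scvauc
out$scoef
## Lasso-concave hybrid using SCAD penalty
## out <- cv.hybrid(y, x, "scad")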