grppenalty: Compute the solution for the concave 1-norm and 2-norm group...



Description

Compute the solution for the concave 1-norm and 2-norm group penalties in linear and logistic models.

Usage

grppenalty(y, x, index, family = "gaussian", type = "l1", penalty = "mcp",
           kappa = 1/2.7, nlambda = 100, lambda.min = 0.01,
           epsilon = 1e-3, maxit = 1e+3)

Arguments

y

the outcome of interest: a vector of continuous responses for linear models, or a vector of 0/1 values for logistic models.

x

the design matrix of penalized variables. By default, an intercept vector will be added when fitting the model.

index

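the group index of the penalized variables. In the example below, index = rep(1:10, each = 2) assigns 20 variables to 10 groups of two.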

family

a character indicating the distribution of outcome. Either "gaussian" or "binomial" can be specified.

type

a character string specifying the type of group penalty. Either "l1" or "l2" can be specified, with "l1" being the default. See Details for more information.

penalty

a character specifying the penalty. One of "mcp" or "scad" should be specified, with "mcp" being the default.

kappa

the regularization parameter kappa. Either a single value or an increasing vector of values can be specified. Each value must lie in the range [0,1).

nlambda

an integer specifying the number of grid points for the penalty parameter lambda.

lambda.min

a value that determines the smallest penalty parameter lambda, defined as lambda_min = lambda_max * lambda.min. The suggested value of lambda.min is 0.0001 if n > p, and 0.01 otherwise.

epsilon

a value specifying the convergence criterion of the algorithm.

maxit

an integer value specifying the maximum number of iterations for each coordinate.
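
As a rough illustration of how nlambda and lambda.min could translate into a grid of penalty values, the sketch below builds a sequence from lambda_max down to lambda_min = lambda_max * lambda.min. The helper name lambda_grid, the log-scale spacing, and the decreasing order are assumptions made for illustration; this page does not state how the package computes lambda_max or spaces the grid.

```r
# Hypothetical helper (not part of the package): build a lambda grid.
# Assumption: values are equally spaced on the log scale, largest first.
lambda_grid <- function(lambda_max, nlambda = 100, lambda.min = 0.01) {
  lambda_min <- lambda_max * lambda.min   # relation stated in the argument list
  exp(seq(log(lambda_max), log(lambda_min), length.out = nlambda))
}

# Suggested choice of lambda.min from the argument description.
n <- 100
p <- 20
lambda.min <- if (n > p) 1e-4 else 0.01

# lambda_max = 1 is an arbitrary placeholder value.
grid <- lambda_grid(lambda_max = 1, nlambda = 100, lambda.min = lambda.min)
range(grid)
```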

Details

The package implements the concave 1-norm and 2-norm group penalties in linear and logistic regression models. The concave 1-norm group penalty is defined as rho(|beta|_1;d*lambda,kappa), with |beta|_1 being the L1 norm of the coefficients in a group and d being the group size. The concave 2-norm group penalty is defined as rho(|beta|_2;sqrt(d)*lambda,kappa), with |beta|_2 being the L2 norm of the coefficients in a group. Here rho() is a concave function. The current implementation considers only the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP).
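
The two group penalties can be written out as a minimal sketch for the MCP case. It assumes kappa is the reciprocal of the usual MCP concavity parameter, so that rho(t;lambda,kappa) = lambda*t - kappa*t^2/2 for t <= lambda/kappa and lambda^2/(2*kappa) beyond that. This parameterization is consistent with the recommended kappa = 1/2.7 and with kappa = 0 giving a Lasso-type penalty, but it is not stated on this page. The function names are hypothetical and SCAD is not covered.

```r
# MCP as a function of t >= 0 (assumed parameterization, see lead-in).
# With kappa = 0 the threshold lambda / kappa is Inf, so rho is lambda * t.
mcp <- function(t, lambda, kappa) {
  t <- abs(t)
  ifelse(t <= lambda / kappa,
         lambda * t - kappa * t^2 / 2,
         lambda^2 / (2 * kappa))
}

# Concave 1-norm group penalty: rho(|beta|_1; d * lambda, kappa).
grp_penalty_l1 <- function(beta, lambda, kappa) {
  mcp(sum(abs(beta)), length(beta) * lambda, kappa)
}

# Concave 2-norm group penalty: rho(|beta|_2; sqrt(d) * lambda, kappa).
grp_penalty_l2 <- function(beta, lambda, kappa) {
  mcp(sqrt(sum(beta^2)), sqrt(length(beta)) * lambda, kappa)
}

beta <- c(3, -4)                                # one group of size d = 2
grp_penalty_l1(beta, lambda = 1, kappa = 0)     # 2 * (3 + 4) = 14
grp_penalty_l2(beta, lambda = 1, kappa = 0)     # sqrt(2) * 5
```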

The concave 1-norm group penalties (1-norm gSCAD and gMCP) perform variable selection at both the group and individual levels under proper tuning parameters. The concave 2-norm group penalties (2-norm gSCAD and gMCP) select variables at the group level only, i.e. the variables in the same group are dropped or selected at the same time. One advantage of the 1-norm group penalty is that it is robust to mis-specified group information, whereas the 2-norm group penalty is affected by it. The concave 2-norm group penalty includes the group Lasso as a special case: setting kappa=0 in the 2-norm group penalty returns the group Lasso solutions.
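
To make the difference between group-level and individual-level selection concrete, the sketch below looks only at the kappa=0 case under an orthonormal design, where both problems have closed-form solutions. With kappa=0 the 2-norm penalty is the group Lasso penalty sqrt(d)*lambda*|beta|_2, and the 1-norm penalty reduces to an ordinary Lasso penalty d*lambda*|beta|_1. The orthonormal-design assumption and the function names are illustrative and are not taken from the package.

```r
# Group soft-thresholding: solves
#   min_b 0.5 * sum((z - b)^2) + sqrt(d) * lambda * sqrt(sum(b^2)),
# so the whole group is either shrunk together or set to zero together.
group_soft <- function(z, lambda) {
  nz <- sqrt(sum(z^2))
  if (nz == 0) return(z)
  z * max(1 - sqrt(length(z)) * lambda / nz, 0)
}

# Elementwise soft-thresholding: solves
#   min_b 0.5 * sum((z - b)^2) + lambda * sum(abs(b)),
# so individual coefficients can be zeroed inside a group.
elementwise_soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

z <- c(0.9, 0.05)   # toy group: one strong and one weak coefficient
d <- length(z)

group_soft(z, lambda = 0.2)            # both entries stay non-zero
elementwise_soft(z, lambda = d * 0.2)  # the weak entry is set to zero
group_soft(c(0.1, 0.1), lambda = 0.2)  # a weak group is dropped entirely
```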

We use the coordinate descent algorithm (CDA) to compute the solution for both the 1-norm and 2-norm group penalties. The solution path is computed along kappa: for a given penalty parameter lambda, the solution at kappa=0 is used to initialize the computation. In general, we suggest treating both the regularization parameter kappa and the penalty parameter lambda as tuning parameters and using a data-driven approach to select the optimal kappa and lambda. However, this practice is computationally heavy, so a particular kappa (1/2.7 for gMCP and 1/3.7 for gSCAD) is recommended to reduce the computation time.
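
The idea of warm-starting along kappa can be sketched with a much simpler stand-in: coordinate descent for an ungrouped MCP-penalized linear model with standardized columns. This is not the package's grouped algorithm. The update rule assumes kappa is the reciprocal of the usual MCP concavity parameter, and every function name below is hypothetical.

```r
# Soft-thresholding operator.
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Univariate MCP update for a column scaled so that sum(x_j^2) / n = 1.
# Assumes lambda > 0 and 0 <= kappa < 1; kappa = 0 gives the Lasso update.
mcp_update <- function(z, lambda, kappa) {
  if (abs(z) <= lambda / kappa) soft(z, lambda) / (1 - kappa) else z
}

# Cyclic coordinate descent starting from `beta` (the warm start).
cd_mcp <- function(x, y, lambda, kappa, beta = rep(0, ncol(x)),
                   epsilon = 1e-3, maxit = 1000) {
  n <- nrow(x)
  r <- drop(y - x %*% beta)                  # current residual
  for (it in seq_len(maxit)) {
    beta_old <- beta
    for (j in seq_len(ncol(x))) {
      z <- sum(x[, j] * r) / n + beta[j]     # partial-residual correlation
      b_new <- mcp_update(z, lambda, kappa)
      r <- r - x[, j] * (b_new - beta[j])    # keep the residual in sync
      beta[j] <- b_new
    }
    if (max(abs(beta - beta_old)) < epsilon) break
  }
  beta
}

set.seed(1)
n <- 50
p <- 6
x <- matrix(rnorm(n * p), n, p)
y <- 2 * x[, 1] + rnorm(n)
xs <- scale(x) * sqrt(n / (n - 1))   # columns: mean 0, sum of squares n
yc <- y - mean(y)

# Path along an increasing kappa: each fit starts from the previous solution,
# so the kappa = 0 solution initializes the fit at kappa = 1/2.7.
kappas <- c(0, 1/2.7)
beta <- rep(0, p)
path <- vector("list", length(kappas))
for (k in seq_along(kappas)) {
  beta <- cd_mcp(xs, yc, lambda = 0.1, kappa = kappas[k], beta = beta)
  path[[k]] <- beta
}
```

Because kappa=0 gives a convex problem, starting there and increasing kappa keeps each non-convex fit close to a well-behaved initial value.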

For tuning parameter selection, we implement the cross-validation approach for both linear and logistic models. In the linear model, the predictive mean squared error (PMSE) is used as the index quantity, and the tuning parameter(s) corresponding to the solution with the minimum PMSE are selected. In the logistic model, the k-fold cross-validated area under the ROC curve (CV-AUC) is used as the index quantity, and the tuning parameter(s) corresponding to the solution with the maximum CV-AUC are selected.
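
The two index quantities can be sketched in a few lines of base R. The functions below are illustrative helpers, not the package's cross-validation code: pmse is the mean squared prediction error on held-out data, and auc uses the rank (Mann-Whitney) form of the area under the ROC curve. The fold assignment at the end shows one common way to split n observations into k folds; the inputs are toy values.

```r
# Predictive mean squared error on held-out observations.
pmse <- function(y, yhat) mean((y - yhat)^2)

# Area under the ROC curve via the rank-sum identity.
# `label` holds 0/1 outcomes, `score` the predicted risk; ties get mid-ranks.
auc <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Random assignment of n observations to k folds of (nearly) equal size.
set.seed(1)
n <- 100
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

pmse(c(1, 2, 3), c(1, 2, 5))                   # 4 / 3
auc(c(0.1, 0.4, 0.35, 0.8), c(0, 0, 1, 1))     # 0.75
```

In a full cross-validation loop, the model would be fit on the observations outside each fold, the index quantity computed on the fold itself, and the tuning parameters chosen by which.min (PMSE) or which.max (CV-AUC) over the grid.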

Value

A list of three elements is returned.

coef.beta

a list of nkappa elements containing the regression coefficients, where nkappa is the number of kappa values specified.

kappa

a sequence of regularization parameter kappa.

lambda

a sequence of penalty parameter lambda used in the computation.

Author(s)

Dingfeng Jiang

References

Jiang, D., Huang, J., Zhang, Y. (2011). The cross-validated AUC for MCP-Logistic regression with high-dimensional data. Statistical Methods in Medical Research, online first.

Yuan, M., Lin, Y. (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68 (1): 49-67.

Meier, L., van de Geer, S., Bühlmann, P. (2008). The group lasso for logistic regression. Journal of the Royal Statistical Society, Series B, 70 (1): 53-71.

See Also

cv.grppenalty

Examples

set.seed(10000)
n <- 100
ybi <- rbinom(n, 1, 0.4)          # binary outcome for logistic models
yga <- rnorm(n)                   # continuous outcome for linear models
p <- 20
x <- matrix(rnorm(n * p), n, p)
index <- rep(1:10, each = 2)      # 10 groups of 2 variables each
## one kappa
out <- grppenalty(yga, x, index, "gaussian", "l1", "mcp", 1/2.7)
## out <- grppenalty(yga, x, index, "gaussian", "l2", "mcp", 1/2.7)
## out <- grppenalty(yga, x, index, "gaussian", "l1", "scad", 1/2.7)
## out <- grppenalty(yga, x, index, "gaussian", "l2", "scad", 1/2.7)
## multiple kappas
## out <- grppenalty(yga, x, index, "gaussian", "l1", "mcp", c(0, 1/2.7))

## logistic models
## out <- grppenalty(ybi, x, index, "binomial", "l1", "mcp", 1/2.7)
## out <- grppenalty(ybi, x, index, "binomial", "l2", "mcp", 1/2.7)
## out <- grppenalty(ybi, x, index, "binomial", "l1", "scad", 1/2.7)
## out <- grppenalty(ybi, x, index, "binomial", "l2", "scad", 1/2.7)


grppenalty documentation built on May 30, 2017, 4:33 a.m.