Fits the regularization paths for large margin classifiers
Description
Fits a regularization path for large margin classifiers at a sequence of regularization parameters lambda.
Usage
gcdnet(x, y, nlambda = 100,
  method = c("hhsvm", "logit", "sqsvm", "ls"),
  lambda.factor = ifelse(nobs < nvars, 0.01, 1e-04),
  lambda = NULL, lambda2 = 0,
  pf = rep(1, nvars), pf2 = rep(1, nvars), exclude,
  dfmax = nvars + 1, pmax = min(dfmax * 1.2, nvars),
  standardize = TRUE, eps = 1e-8, maxit = 1e6, delta = 2)

Arguments
x 
matrix of predictors, of dimension N*p; each row is an observation vector. 
y 
response variable. This argument should be a two-level factor for classification. 
nlambda 
the number of lambda values in the regularization path. Default is 100. 
method 
a character string specifying the loss function to use. Valid options are: "hhsvm" (Huberized squared hinge loss), "sqsvm" (squared hinge loss), "logit" (logistic loss), and "ls" (least squares loss). Default is "hhsvm". 
lambda.factor 
the factor for getting the minimal lambda in the lambda sequence, where min(lambda) = lambda.factor * max(lambda). Default is 0.01 if nobs < nvars and 1e-04 otherwise. 
lambda 
an optional user-supplied lambda sequence. By default (lambda = NULL) the program computes its own sequence based on nlambda and lambda.factor; supplying lambda overrides this. 
lambda2 
regularization parameter lambda2 for the quadratic penalty of the coefficients. 
pf 
L1 penalty factor of length p used for adaptive LASSO or adaptive elastic net. Separate L1 penalty weights can be applied to each coefficient of beta to allow differential L1 shrinkage. Can be 0 for some variables, which implies no L1 shrinkage and results in that variable always being included in the model. Default is 1 for all variables (and implicitly infinity for variables listed in exclude). 
pf2 
L2 penalty factor of length p used for adaptive LASSO or adaptive elastic net. Separate L2 penalty weights can be applied to each coefficient of beta to allow differential L2 shrinkage. Can be 0 for some variables, which implies no L2 shrinkage. Default is 1 for all variables. 
exclude 
indices of variables to be excluded from the model. Default is none. Equivalent to an infinite penalty factor. 
dfmax 
limit the maximum number of variables in the model. Useful for very large p, if a partial path is desired. Default is p+1. 
pmax 
limit the maximum number of variables ever to be nonzero. For example, once β enters the model, no matter how many times it exits or re-enters the model along the path, it is counted only once. Default is min(dfmax * 1.2, p). 
standardize 
logical flag for variable standardization, prior to fitting the model sequence. If TRUE, each predictor is standardized before fitting. Default is TRUE. 
eps 
convergence threshold for coordinate majorization descent. Each inner coordinate majorization descent loop continues until the relative change in any coefficient, i.e. max_j (beta_new[j] - beta_old[j])^2, is less than eps. Default is 1e-8. 
maxit 
maximum number of outer-loop iterations allowed at a fixed lambda value. Default is 1e6. If models do not converge, consider increasing maxit. 
delta 
the parameter delta in the HHSVM model. Default is 2. 
Details
Note that the objective function in gcdnet is

Loss(y, X, beta)/N + lambda1 * ||beta||_1 + 0.5 * lambda2 * ||beta||_2^2

where the penalty is a combination of an L1 and an L2 term. Users can specify the loss function via method; options are the Huberized squared hinge loss, the squared hinge loss, the least squares loss, and the logistic loss. Users can also tweak the penalty by choosing different values of lambda2 and the penalty factors.
For reasons of computational speed, if models are not converging or are running slowly, consider increasing eps, decreasing nlambda, or increasing lambda.factor before increasing maxit.
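The objective above can be sketched in plain R. The helper below is an illustrative reimplementation, not part of the gcdnet API; it uses the least squares loss for concreteness and includes the penalty factors pf and pf2 as per-coefficient weights:

```r
# Illustrative sketch of the gcdnet objective (not part of the package API).
# Uses the least squares loss; the package selects the loss via `method`.
gcdnet_objective <- function(x, y, beta, lambda1, lambda2,
                             pf = rep(1, length(beta)),
                             pf2 = rep(1, length(beta))) {
  n <- nrow(x)
  loss <- sum((y - x %*% beta)^2) / n        # Loss(y, X, beta) / N
  l1   <- lambda1 * sum(pf * abs(beta))      # weighted L1 penalty
  l2   <- 0.5 * lambda2 * sum(pf2 * beta^2)  # weighted quadratic penalty
  loss + l1 + l2
}

set.seed(1)
x <- matrix(rnorm(20), 4, 5)
y <- rnorm(4)
beta <- c(1, 0, -2, 0, 0.5)
obj <- gcdnet_objective(x, y, beta, lambda1 = 0.1, lambda2 = 1)
```

Setting lambda2 = 0 reduces the penalty to the (weighted) LASSO; a nonzero lambda2 gives the elastic net.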
FAQ:
Question: “I couldn't figure out how to specify the options for the adaptive LASSO, the elastic net, and the adaptive elastic net. Could you please give me a quick hint?”
Answer: lambda2 is the regularization parameter for the L2 penalty part. To use the LASSO, set lambda2 = 0. To use the elastic net, set lambda2 to a nonzero value.
pf is the L1 penalty factor of length p (p is the number of predictors). Separate L1 penalty weights can be applied to each coefficient to allow differential L1 shrinkage. Similarly, pf2 is the L2 penalty factor of length p.
To use the adaptive LASSO, set lambda2 = 0 and also specify pf. To use the adaptive elastic net, set lambda2 to a nonzero value and specify pf and pf2.
For example
For example:

library('gcdnet')
# Dataset N = 100, p = 10
x_log <- matrix(rnorm(100 * 10), 100, 10)
y_log <- sample(c(-1, 1), 100, replace = TRUE)

# LASSO
m <- gcdnet(x = x_log, y = y_log, lambda2 = 0, method = "logit")
plot(m)

# elastic net with lambda2 = 1
m <- gcdnet(x = x_log, y = y_log, lambda2 = 1, method = "logit")
plot(m)

# adaptive LASSO with penalty factor
# pf = 0.5 0.5 0.5 0.5 0.5 1.0 1.0 1.0 1.0 1.0
m <- gcdnet(x = x_log, y = y_log, lambda2 = 0, method = "logit",
            pf = c(rep(0.5, 5), rep(1, 5)))
plot(m)

# adaptive elastic net with lambda2 = 1,
# pf  = 0.5 0.5 0.5 0.5 0.5 1.0 1.0 1.0 1.0 1.0
# pf2 = 3 3 3 3 3 1 1 1 1 1
m <- gcdnet(x = x_log, y = y_log, lambda2 = 1, method = "logit",
            pf = c(rep(0.5, 5), rep(1, 5)),
            pf2 = c(rep(3, 5), rep(1, 5)))
plot(m)

Question: “What is the meaning of the parameter pf? The package documentation says pf is the penalty weight applied to each coefficient of beta?”
Answer: Yes, pf and pf2 are the L1 and L2 penalty factors of length p used for the adaptive LASSO or adaptive elastic net. A value of 0 means that the feature (variable) receives no shrinkage and is therefore always included in the model; 1 means that the feature is penalized with weight 1.
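One common recipe for constructing adaptive weights, sketched below, is to take the reciprocal absolute coefficients of an initial fit; this is a standard adaptive LASSO construction, not something mandated by gcdnet:

```r
# Sketch: adaptive L1 penalty factors from an initial estimate
# (a common adaptive LASSO recipe, not prescribed by gcdnet).
set.seed(2)
x <- matrix(rnorm(100 * 5), 100, 5)
y <- drop(x %*% c(2, 0, -1, 0, 0)) + rnorm(100)

beta_init <- coef(lm(y ~ x))[-1]   # initial OLS estimate (drop intercept)
pf_adapt  <- 1 / abs(beta_init)    # small initial coefficients -> large penalties

# These weights would then be passed to gcdnet via `pf`:
# m <- gcdnet(x = x, y = y, lambda2 = 0, method = "ls", pf = pf_adapt)
```

Variables with strong initial signal get small penalty factors and are shrunk less; near-zero initial coefficients get large factors and are shrunk aggressively.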
Question: “Does gcdnet deal with both continuous and categorical response variables?”
Answer: Yes, both are supported. You can use a continuous response with the least squares loss (method = "ls"), or a categorical (two-level) response with one of the classification losses.
Question: “Why does the predict function not work? predict should return the predicted probability of the positive class. Instead I get:”
Error in as.matrix(as.matrix(cbind2(1, newx))) :
  error in evaluating the argument 'x' in selecting
  a method for function 'as.matrix': Error in t(.Call(Csparse_dense_crossprod, y, t(x))) :
  error in evaluating the argument 'x' in selecting
  a method for function 't': Error: Cholmod error 'X and/or Y have wrong dimensions'
  at file ../MatrixOps/cholmod_sdmult.c, line 90

“Using the Arcene dataset and executing the code will give the above error.”
Answer: This is actually NOT a bug in gcdnet. When making predictions with a new matrix x, each observation of x should be arranged as a row of the matrix. In your code, because "pred" is a vector, you need to convert "pred" into a matrix with one row before calling predict.
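A minimal sketch of the fix, assuming "pred" holds a single observation of p feature values and m is a fitted gcdnet object:

```r
# A single observation stored as a plain vector ...
pred <- rnorm(10)               # stand-in for one observation with p = 10 features

# ... must be reshaped into a 1 x p matrix before calling predict:
newx <- matrix(pred, nrow = 1)  # equivalently: t(as.matrix(pred))

# predict(m, newx = newx)       # m is a fitted gcdnet object (not built here)
```

The same reshaping applies whenever newx has fewer than two dimensions; predict expects one observation per row.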
Value
An object with S3 class gcdnet
.
call 
the call that produced this object 
b0 
intercept sequence of length length(lambda) 
beta 
a p * length(lambda) matrix of coefficients 
lambda 
the actual sequence of lambda values used 
df 
the number of nonzero coefficients for each value of lambda 
dim 
dimension of the coefficient matrix 
npasses 
total number of iterations (of the innermost loop) summed over all lambda values 
jerr 
error flag, for warnings and errors, 0 if no error. 
Author(s)
Yi Yang and Hui Zou
Maintainer: Yi Yang <yiyang@umn.edu>
References
Yang, Y. and Zou, H. (2012), "An Efficient Algorithm for Computing The HHSVM and Its Generalizations," Journal of Computational and Graphical Statistics, 22, 396-415.
BugReport: http://code.google.com/p/gcdnet/
See Also
plot.gcdnet
Examples
data(FHT)
# solution paths for the least squares regression
m0 <- gcdnet(x = FHT$x, y = FHT$y_reg, lambda2 = 1, method = "ls")
plot(m0)
# solution paths for the HHSVM
m1 <- gcdnet(x = FHT$x, y = FHT$y, delta = 1, lambda2 = 1, method = "hhsvm")
plot(m1)
# solution paths for the penalized SVM with the squared hinge loss
m2 <- gcdnet(x = FHT$x, y = FHT$y, lambda2 = 0.1, method = "sqsvm")
plot(m2)
# solution paths for the penalized logistic regression
m3 <- gcdnet(x = FHT$x, y = FHT$y, lambda2 = 0.01, method = "logit")
plot(m3)
