sparseSVM: Fit sparse linear SVM with lasso or elastic-net regularization


Description

Fit solution paths for sparse linear SVM regularized by lasso or elastic-net over a grid of values for the regularization parameter lambda.

Usage

sparseSVM(X, y, alpha = 1, gamma = 0.1, nlambda = 100,
          lambda.min = ifelse(nrow(X) > ncol(X), 0.01, 0.05),
          lambda, preprocess = c("standardize", "rescale", "none"),
          screen = c("ASR", "SR", "none"), max.iter = 1000, eps = 1e-5,
          dfmax = ncol(X) + 1, penalty.factor = rep(1, ncol(X)), message = FALSE)

Arguments

X

Input matrix.

y

Output vector. Currently the function only supports binary output and converts the output into +1/-1 coding internally.

alpha

The elastic-net mixing parameter that controls the relative contribution from the lasso and the ridge penalty. It must be a number between 0 and 1. alpha=1 is the lasso penalty and alpha=0 the ridge penalty.
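Concretely, the penalty interpolates between the two norms. A minimal sketch, assuming the usual glmnet-style parameterization in which the ridge term carries a factor of 1/2:

```r
# Elastic-net penalty as a function of the mixing parameter alpha.
# alpha = 1 gives the lasso (L1) penalty; alpha = 0 gives the ridge (L2) penalty.
enet_penalty <- function(w, alpha) {
  alpha * sum(abs(w)) + (1 - alpha) / 2 * sum(w^2)
}

enet_penalty(c(1, -2), alpha = 1)  # pure lasso: |1| + |-2| = 3
enet_penalty(c(1, -2), alpha = 0)  # pure ridge: (1 + 4) / 2 = 2.5
```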

gamma

The tuning parameter for huberization smoothing of hinge loss. Default is 0.1.
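Huberization replaces the hinge loss's kink at t = 1 with a quadratic piece of width gamma, making the loss differentiable. A sketch of the smoothed loss, assuming the standard huberized-hinge form from the semismooth Newton literature (the package's exact internal form may differ):

```r
# Huberized hinge loss: quadratic on (1 - gamma, 1], linear below, zero above 1.
# The three pieces meet continuously at t = 1 and t = 1 - gamma.
huber_hinge <- function(t, gamma = 0.1) {
  ifelse(t > 1, 0,
         ifelse(t > 1 - gamma,
                (1 - t)^2 / (2 * gamma),   # smooth transition zone
                1 - t - gamma / 2))        # linear part, shifted down by gamma/2
}

huber_hinge(2)          # 0: correctly classified with margin
huber_hinge(0.9, 0.1)   # 0.05: boundary of the quadratic zone (= gamma/2)
```

As gamma shrinks toward 0, this converges to the plain hinge loss max(0, 1 - t).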

nlambda

The number of lambda values. Default is 100.

lambda.min

The smallest value for lambda, as a fraction of lambda.max, the data-derived entry value of the path (the smallest lambda at which all penalized coefficients are zero). Default is 0.01 if the number of observations is larger than the number of variables and 0.05 otherwise.

lambda

A user-specified sequence of lambda values. Typical usage is to leave this blank and let the program compute a lambda sequence automatically based on nlambda and lambda.min; specifying lambda overrides this. This argument should be used with care and supplied as a decreasing sequence rather than a single value. To get coefficients for a single lambda, fit the solution path with sparseSVM and then use coef or predict.
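A user-supplied sequence typically mimics the automatic one. A sketch, assuming sparseSVM follows the common glmnet-style convention of a log-spaced grid from lambda.max down to lambda.min * lambda.max (lambda.max itself is computed internally from the data; the value 1 below is a placeholder):

```r
# Log-spaced, decreasing lambda grid from lambda.max down to
# lambda.min * lambda.max, as is conventional for penalized paths.
lambda_path <- function(lambda.max, lambda.min = 0.01, nlambda = 100) {
  exp(seq(log(lambda.max), log(lambda.max * lambda.min), length.out = nlambda))
}

ll <- lambda_path(1)   # 100 values from 1 down to 0.01
```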

preprocess

Preprocessing technique to be applied to the input. Either "standardize" (default), "rescale" or "none" (see Details). The coefficients are always returned on the original scale.

screen

Screening rule to be applied at each lambda to discard variables for speed. Either "ASR" (default), "SR" or "none". "SR" stands for the strong rule and "ASR" for the adaptive strong rule. "ASR" typically requires fewer iterations to converge than "SR", but the computing times are generally close. The option "none" is intended mainly for debugging and may lead to much longer computing times.

max.iter

Maximum number of iterations. Default is 1000.

eps

Convergence threshold. The algorithms continue until the maximum change in the objective after any coefficient update is less than eps times the null deviance. Default is 1e-5.

dfmax

Upper bound for the number of nonzero coefficients. The algorithm exits and returns a partial path if dfmax is reached. Useful for very large dimensions.

penalty.factor

A numeric vector of length equal to the number of variables. Each component multiplies lambda to allow differential penalization. Can be 0 for some variables, in which case the variable is always in the model without penalization. Default is 1 for all variables.

message

If set to TRUE, sparseSVM reports its progress as it runs. This argument is mainly for debugging. Default is FALSE.

Details

The sequence of models indexed by the regularization parameter lambda is fitted using a semismooth Newton coordinate descent algorithm. The objective function is defined to be

∑_i hingeLoss(y_i (x_i' w + b))/n + λ*penalty(w),

where

hingeLoss(t) = max(0, 1-t)

and the intercept b is unpenalized.
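Written out directly in R, the objective looks as follows. This is an illustrative sketch of the lasso case (alpha = 1), not the package's internal implementation, which minimizes a huberized version of this objective:

```r
# Plain hinge loss and the lasso-penalized linear SVM objective.
hinge <- function(t) pmax(0, 1 - t)

svm_objective <- function(X, y, w, b, lambda) {
  margins <- y * (drop(X %*% w) + b)
  # average hinge loss plus L1 penalty; the intercept b is unpenalized
  mean(hinge(margins)) + lambda * sum(abs(w))
}
```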

The program supports different types of preprocessing techniques. They are applied to each column of the input matrix X. Let x be a column of X. For preprocess = "standardize", the formula is

x' = (x-mean(x))/sd(x);

for preprocess = "rescale",

x' = (x-min(x))/(max(x)-min(x)).

The models are fitted on the preprocessed input; the coefficients are then transformed back to the original scale by inverting the preprocessing transformation.
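The back-transformation is simple algebra: if the solver returns (w', b') on the standardized scale x' = (x - mean(x))/sd(x), then w_j = w'_j / s_j and b = b' - ∑_j w'_j m_j / s_j give the same linear predictor on the original scale. A sketch (not the package's internal code) verifying this:

```r
# Recovering original-scale coefficients after fitting on standardized inputs.
set.seed(1)
X <- matrix(rnorm(50 * 3), 50, 3)
m <- colMeans(X)
s <- apply(X, 2, sd)
Xs <- scale(X, center = m, scale = s)   # x' = (x - m) / s

# Suppose the solver returned these coefficients on the standardized scale:
b_std <- 0.5
w_std <- c(1, -2, 0.3)

# Back-transform: w_j = w'_j / s_j,  b = b' - sum(w'_j * m_j / s_j)
w_orig <- w_std / s
b_orig <- b_std - sum(w_std * m / s)

# The linear predictor is identical on either scale:
eta_std  <- drop(Xs %*% w_std) + b_std
eta_orig <- drop(X %*% w_orig) + b_orig
all.equal(eta_std, eta_orig)
```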

Value

The function returns an object of S3 class "sparseSVM", which is a list containing:

call

The call that produced this object.

weights

The fitted matrix of coefficients. The number of rows equals the number of coefficients (including the intercept) and the number of columns equals nlambda.

iter

A vector of length nlambda containing the number of iterations until convergence at each value of lambda.

saturated

A logical flag for whether the number of nonzero coefficients has reached dfmax.

lambda

The sequence of regularization parameter values in the path.

alpha

Same as above.

gamma

Same as above.

penalty.factor

Same as above.

levels

Levels of the output class labels.

Author(s)

Congrui Yi and Yaohui Zeng
Maintainer: Congrui Yi <eric.ycr@gmail.com>

See Also

plot.sparseSVM, cv.sparseSVM

Examples

library(sparseSVM)

X <- matrix(rnorm(1000 * 100), 1000, 100)
b <- 3
w <- 5 * rnorm(10)
eps <- rnorm(1000)
y <- sign(b + drop(X[, 1:10] %*% w + eps))

fit <- sparseSVM(X, y)
coef(fit, 0.05)
predict(fit, X[1:5, ], lambda = c(0.2, 0.1))

Example output

Loading required package: parallel
(Intercept)          V1          V2          V3          V4          V5 
 -0.4729766   0.4685178   0.3527636   0.6406139  -0.8139670  -0.1796772 
         V6          V7          V8          V9         V10         V11 
 -0.4643762  -0.3854431  -0.3612029   0.0000000  -0.6262678   0.0000000 
        V12 through V100: all 0.0000000 (omitted for brevity)
     0.2 0.1
[1,]   1  -1
[2,]   1   1
[3,]   1   1
[4,]   1   1
[5,]   1   1

sparseSVM documentation built on May 2, 2019, 11:02 a.m.