samHL: Training function of Sparse Additive Machine

Description

Trains a sparse additive machine classifier (a sparse additive model fitted with a smoothed hinge loss) on the training data.

Usage

samHL(X, y, p = 3, lambda = NULL, nlambda = NULL,
      lambda.min.ratio = 0.4, thol = 1e-5, mu = 5e-2, max.ite = 1e5)

Arguments

X

The n by d design matrix of the training set, where n is sample size and d is dimension.

y

The n-dimensional label vector of the training set, where n is sample size. Labels must be coded as 1 and -1, as in the example below, which generates them with sign().

p

The number of basis spline functions. The default value is 3.

lambda

A user-supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio. Supplying a value of lambda overrides this. WARNING: use with care. Do not supply a single value for lambda; supply a decreasing sequence of lambda values instead. samHL relies on warm starts for speed, and it is often faster to fit a whole path than to compute a single fit.

nlambda

The number of lambda values. The default value is 20.

lambda.min.ratio

Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value (i.e. the smallest value for which all coefficients are zero). The default is 0.4.

thol

Stopping precision. The default value is 1e-5.

mu

Smoothing parameter used to approximate the hinge loss. The default value is 0.05.

max.ite

The maximum number of iterations. The default value is 1e5.
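The relationship between lambda, nlambda and lambda.min.ratio can be illustrated with a short sketch. The helper make_lambda_grid and the log spacing are assumptions made for illustration; this page does not state how samHL spaces its internal grid, and lambda.max = 1 is only a placeholder for the data-derived value.

```r
# Hypothetical helper: build a decreasing, log-spaced lambda grid.
# The log spacing is an assumption; samHL's internal grid may differ.
make_lambda_grid <- function(lambda.max, nlambda = 20, lambda.min.ratio = 0.4) {
  stopifnot(lambda.max > 0, nlambda >= 2,
            lambda.min.ratio > 0, lambda.min.ratio < 1)
  # Interpolate on the log scale from lambda.max down to
  # lambda.max * lambda.min.ratio.
  exp(seq(log(lambda.max), log(lambda.max * lambda.min.ratio),
          length.out = nlambda))
}

# lambda.max = 1 is a placeholder, not a value computed from data.
lambda.grid <- make_lambda_grid(lambda.max = 1)
```

A grid built this way is decreasing, so it has the shape the lambda argument asks for and could be passed as samHL(X, y, lambda = lambda.grid).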

Details

We adopt several computational algorithms, including block coordinate descent, the fast iterative soft-thresholding algorithm, and Newton's method. The computation is further accelerated by "warm-start" and "active-set" tricks.
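Two building blocks behind these algorithms can be sketched in a few lines of R. The functions smoothed_hinge and group_soft_threshold are hypothetical names, and the particular smoothing shown (a Nesterov-style quadratic smoothing of the hinge) is an assumption; the package source may use a different form.

```r
# Smoothed hinge loss at margin z = y * f(x), with smoothing parameter mu.
# Assumed form: zero for z >= 1, quadratic on (1 - mu, 1), linear below 1 - mu.
smoothed_hinge <- function(z, mu = 0.05) {
  ifelse(z >= 1, 0,
         ifelse(z <= 1 - mu, 1 - z - mu / 2, (1 - z)^2 / (2 * mu)))
}

# Group soft-thresholding: shrink a coefficient block toward zero by lambda
# in Euclidean norm; the block becomes exactly zero when its norm <= lambda.
group_soft_threshold <- function(beta, lambda) {
  nrm <- sqrt(sum(beta^2))
  if (nrm <= lambda) return(rep(0, length(beta)))
  (1 - lambda / nrm) * beta
}
```

The smoothing makes the loss differentiable, which gradient-based methods need, while the thresholding step is what sets whole component functions to zero and so produces sparsity.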

Value

p

The number of basis spline functions used in training.

X.min

A vector with each entry corresponding to the minimum of each input variable. (Used for rescaling in testing)

X.ran

A vector with each entry corresponding to the range of each input variable. (Used for rescaling in testing)

lambda

The sequence of regularization parameters used in training.

w

The solution path matrix (d*p+1 by length of lambda) with each column corresponding to a regularization parameter. Since we use the basis expansion with the intercept, the length of each column is d*p+1.

df

The degrees of freedom along the solution path (the number of non-zero component functions for each regularization parameter).

knots

The p-1 by d matrix. Each column contains the knots applied to the corresponding variable.

Boundary.knots

The 2 by d matrix. Each column contains the boundary points applied to the corresponding variable.

func_norm

The functional norm matrix (d by length of lambda), with each column corresponding to a regularization parameter. Since we have d input variables, the length of each column is d.
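How w relates to df can be sketched as follows. The helper active_count is hypothetical, and the row layout it assumes (p consecutive rows per variable, intercept in the last row) is not confirmed on this page; check it against the package source before relying on it.

```r
# Hypothetical helper: count active variables per lambda from a w-shaped matrix.
# Assumed layout (not confirmed by the documentation): rows (j-1)*p + 1 to j*p
# hold the p basis coefficients of variable j, and the last row is the intercept.
active_count <- function(w, d, p) {
  stopifnot(nrow(w) == d * p + 1)
  coef <- w[seq_len(d * p), , drop = FALSE]   # drop the intercept row
  group <- rep(seq_len(d), each = p)          # variable index of each row
  # Sum of squared coefficients within each variable's block, per lambda
  block.ss <- rowsum(coef^2, group, reorder = TRUE)
  colSums(block.ss > 0)
}

# Toy solution path: d = 3 variables, p = 2 basis functions, 2 lambda values
w.toy <- cbind(c(0, 0, 0, 0, 0, 0, 0.1),
               c(0.5, -0.2, 0, 0, 0.3, 0, 0.1))
active_count(w.toy, d = 3, p = 2)
```

Under the assumed layout, a variable counts as active when any coefficient in its block is non-zero, which matches the description of df as the number of non-zero component functions.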

Author(s)

Tuo Zhao, Xingguo Li, Han Liu, Kathryn Roeder
Maintainer: Tuo Zhao <tourzhao@gmail.com>

References

P. Ravikumar, J. Lafferty, H. Liu and L. Wasserman. "Sparse Additive Models", Journal of the Royal Statistical Society: Series B, 2009.
T. Zhao and H. Liu. "Sparse Additive Machine", International Conference on Artificial Intelligence and Statistics, 2012.

See Also

SAM, plot.samHL, print.samHL, predict.samHL

Examples

library(SAM)

## generating training data
n = 200
d = 100
X = 0.5*matrix(runif(n*d),n,d) + matrix(rep(0.5*runif(n),d),n,d)
y = sign(((X[,1]-0.5)^2 + (X[,2]-0.5)^2)-0.06)

## flipping about 5 percent of y
y = y*sign(runif(n)-0.05) 

## Training
out.trn = samHL(X,y)
out.trn

## plotting solution path
plot(out.trn)

## generating testing data
nt = 1000
Xt = 0.5*matrix(runif(nt*d),nt,d) + matrix(rep(0.5*runif(nt),d),nt,d)

yt = sign(((Xt[,1]-0.5)^2 + (Xt[,2]-0.5)^2)-0.06)

## flipping about 5 percent of yt
yt = yt*sign(runif(nt)-0.05) 

## predicting response
out.tst = predict(out.trn,Xt)
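The example generates yt but stops before comparing it with the predictions. A minimal sketch of that comparison follows, assuming the predicted labels are available as an nt by nlambda matrix coded in 1 and -1. This page does not name the component of out.tst that holds them, so the hypothetical helper path_error takes the matrix as an argument.

```r
# Hypothetical helper: misclassification rate for each column of a label matrix.
# pred.labels: nt by nlambda matrix of predicted labels in {-1, 1} (assumed shape)
# truth: length-nt vector of true labels in {-1, 1}
path_error <- function(pred.labels, truth) {
  pred.labels <- as.matrix(pred.labels)
  stopifnot(nrow(pred.labels) == length(truth))
  # The comparison recycles truth down each column; colMeans gives error rates
  colMeans(pred.labels != truth)
}

# Toy check with 4 test points and 2 lambda values
truth <- c(1, -1, 1, -1)
pred <- cbind(c(1, 1, 1, 1), c(1, -1, 1, 1))
path_error(pred, truth)
```

Applied to the real predictions and yt, the result would have one error rate per lambda. Because about 5 percent of the test labels were flipped, an error rate near 0.05 is roughly the best that can be expected.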
