Fit a (Generalized Linear) Model with Lasso
Description
The lasso()
methods fit a (generalized) linear model by the
(group) lasso and include an adaptive option. The typically
recommended usage is formula method.
lassoGrpFit
is the lower level fitting function typically not
called by the user.
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33  lasso(x, ...)
## S3 method for class 'formula'
lasso(formula, data, subset, weights, na.action, offset, nonpen = ~ 1,
model="gaussian", lambda=NULL, lstep = 21, adaptive = FALSE,
cv.function = cv.lasso, penscale = sqrt,
cv=NULL, adaptcoef = NULL, adaptlambda = NULL,
contrasts = NULL, save.x = TRUE,
center=NA, standardize = TRUE,
control = lassoControl(), ...)
## Default S3 method:
lasso(x, y, index, subset, weights = rep(1, length(y)), model='gaussian',
lambda=NULL, lstep = 21,
adaptive = FALSE, cv.function = cv.lasso, cv=NULL,
adaptcoef = NULL, adaptlambda = NULL, penscale = sqrt,
center=NA, standardize = TRUE,
save.x = TRUE, control = lassoControl(), ...)
## S3 method for class 'lassogrp'
lasso(x, lambda=NULL, lstep=21,
cv = NULL, cv.function = cv.lasso,
adaptcoef = NULL, adaptlambda = NULL,
penscale = sqrt, weights = NULL,
center = NA, standardize = TRUE,
save.x = TRUE, control = lassoControl(), ...)
lassoGrpFit(x, y, index, subset, weights = rep(1, length(y)), model="gaussian",
offset = rep(0, length(y)), lambda = NULL, lstep = 21,
coef.init = rep(0, ncol(x)), penscale = sqrt,
center = NA, standardize = TRUE,
save.x = NULL, control = lassoControl(), ...)

Arguments
x 
for 
formula 
model 
data 

y 
response variable (for 
index 
a vector indicating which carriers should be penalized by
the L1 term. This is usually obtained from calling

subset, weights, na.action 
as in other model fitting functions 
model 
type of model to be fitted: Either one of

offset 
vector of offset values; needs to have the same length as the response vector. May be useful in logistic and Poisson regression. 
nonpen 

lambda 
vector of scaling factors of the lasso penalty term. 
lstep 
number of lambda values to be chosen if 
cv.function 

cv 
results of cross validation, if available. 
adaptive 
logical: should the adaptive lasso be used? If TRUE, the lasso will be called twice, first in the regular mode, then adaptive to the results of the first call. 
adaptcoef 
inverse weights for the coefficients, used for the adaptive
lasso. By default, they are obtained as the coefficients
of the result of an earlier call ( 
adaptlambda 
lambda value used to extract the coefficients
for 
contrasts 
an optional list. See the 
penscale 
rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the reference below. 
center 
logical. If true, the columns of the design matrix will be
centered (except a possible intercept column); 
standardize 
logical. I f true, the design matrix will be blockwise orthonormalized such that for each block X'X = n 1 (after possible centering). 
save.x 
logical: should the model matrix be stored in the return value? 
coef.init 
numeric; initial vector of parameter estimates
corresponding to 
control 
list of items to control the algorithm, see

... 
further arguments, potentially passed on to next methods. 
Details
The index
defines the groups of carriers and whether they are
included in the L1 penalization term.
There is an element in index
for each carrier (column of the
model.matrix). Carriers j
with positive index[j]
are
included in the penalization.
Elements sharing the same index[j]
value are a group in the
sense of the group lasso, that is, their coefficients will be
included in the L1 term as sqrt(sum(coef^2))
.
For the formula method, the grouping of the variables is derived from the type of the variables: The dummy variables of a factor will be automatically treated as a group.
The lasso optimization process starts using the largest value of
lambda
as penalty parameter λ. Once
fitted, the next largest element of lambda
is used as penalty
parameter with starting values defined as the (fitted) coefficient
vector based on the previous component of lambda
.
Value
lasso()
returns a "lassogrp"
object for which coef
,
print
, plot
and predict
methods exist.
It has (list) components
coefficients 
coefficients with respect to the original input
variables (even if 
norms.pen 
single terms of the L1 penalty term 
nloglik 
log likelihood 
fn.val 

fitted 
fitted values (response type) 
linear.predictors 
linear predictors 
lambda 
vector of lambda values where coefficients were calculated. 
index 
grouping index vector. 
... 
and further components, apply 
Author(s)
Lukas Meier and Werner Stahel, stahel@stat.math.ethz.ch
References
Lukas Meier, Sara van de Geer and Peter B\"uhlmann (2008),
The Group Lasso for Logistic Regression,
Journal of the Royal Statistical Society, 70 (1), 53–71;
see also
http://stat.ethz.ch/~meier/logisticgrouplasso.php
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65  ## Lasso for asphalt example
data(asphalt)
rr < lasso(log10(RUT) ~ log10(VISC) + ASPH+BASE+FINES+VOIDS+RUN,
data = asphalt)
rr
names(rr)
## Use the Logistic Group Lasso on the splice data set
data(splice)
## Define a list with the contrasts of the factors
contr < rep(list("contr.sum"), ncol(splice)  1)
names(contr) < names(splice)[1]
## Fit a logistic model
fit.splice <
lasso(y ~ ., data = splice, model = "binomial", lambda = 20,
contrasts = contr)
fit.splice
## a potentially large model: all twoway interactions
fit.spl.2 <
lasso(y ~ .^2, data = splice, model = "binomial", lambda = 20,
contrasts = contr)
## However, it's "the same" (since lambda is the same):
c1 < coef(fit.splice)
c2 < coef(fit.spl.2)
stopifnot(all.equal(c1, c2[rownames(c1), ,drop=FALSE]),
c2[ !(rownames(c2) %in% rownames(c1)) ,] == 0)
## less to write for the same model, using update():% we test update():
f..spl.2 < update(fit.splice, ~ .^2)
## Perform the Logistic Group Lasso on a random dataset
set.seed(100) # {a "good choice" ..}
n < 50 ## observations
p < 4 ## variables
## First variable (intercept) not penalized, two groups having 2 degrees
## of freedom each :
index < c(0, 2, 2, 3, 3)
## Create a random design matrix, including the intercept (first column)
x < cbind("Intercept" = 1,
matrix(rnorm(p * n), nrow = n,
dimnames=list(NULL, paste("X", 1:4, sep = ""))))
truec < c(0, 2.1, 1.8, 0, 0)
prob < 1 / (1 + exp(x %*% truec))
mean(pmin(prob, 1  prob)) ## Bayes risk
y < rbinom(n, size = 1, prob = prob) ## binary response vector
## Use a multiplicative grid for the penalty parameter lambda, starting
## at the maximal lambda value
l.max < lambdamax(x, y = y, index = index, penscale = sqrt, model = LogReg())
l.max # 11.15947
lambda < l.max * 0.5^(0:5)
## Fit the solution path on the lambda grid
fit < lasso(x, y = y, index = index, lambda = lambda, model = "binomial",
control = lassoControl(trace = 1))
## Plot coefficient paths
plot(fit)
