Fit a (Generalized Linear) Model with Lasso

Share:

Description

The lasso() methods fit a (generalized) linear model by the (group-) lasso and include an adaptive option. The typically recommended usage is formula method.

lassoGrpFit is the lower level fitting function typically not called by the user.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
lasso(x, ...)

## S3 method for class 'formula'
lasso(formula, data, subset, weights, na.action, offset, nonpen = ~ 1,
      model="gaussian", lambda=NULL, lstep = 21, adaptive = FALSE,
      cv.function = cv.lasso, penscale = sqrt,
      cv=NULL, adaptcoef = NULL, adaptlambda = NULL,
      contrasts = NULL, save.x = TRUE,
      center=NA, standardize = TRUE,
      control = lassoControl(), ...)

## Default S3 method:
lasso(x, y, index, subset, weights = rep(1, length(y)), model='gaussian',
      lambda=NULL, lstep = 21,
      adaptive = FALSE, cv.function = cv.lasso, cv=NULL,
      adaptcoef = NULL, adaptlambda = NULL, penscale = sqrt,
      center=NA, standardize = TRUE,
      save.x = TRUE, control = lassoControl(), ...)


## S3 method for class 'lassogrp'
lasso(x, lambda=NULL, lstep=21,
      cv = NULL, cv.function = cv.lasso,
      adaptcoef = NULL, adaptlambda = NULL,
      penscale = sqrt, weights = NULL,
      center = NA, standardize = TRUE,
      save.x = TRUE, control = lassoControl(), ...)

lassoGrpFit(x, y, index, subset, weights = rep(1, length(y)), model="gaussian",
           offset = rep(0, length(y)), lambda = NULL, lstep = 21,
           coef.init = rep(0, ncol(x)), penscale = sqrt,
           center = NA, standardize = TRUE,
           save.x = NULL, control = lassoControl(), ...)

Arguments

x

for default method: design matrix, including column for intercept
for lassogrp method: a 'lassogrp' object i.e., the result of a first call to lasso

formula

model formula for the penalized terms, (see nonpen for the non-penalized terms).

data

data.frame containing the variables in the model.

y

response variable (for default method)

index

a vector indicating which carriers should be penalized by the L1 term. This is usually obtained from calling lassogrpModelmatrix. See Details. (For default method)

subset, weights, na.action

as in other model fitting functions

model

type of model to be fitted: Either one of

"gaussian"

ordinary linear model,

"binomial"

logistic regression,

"poisson"

Poisson regression, or

a model generated by lassoModel

offset

vector of offset values; needs to have the same length as the response vector. May be useful in logistic and Poisson regression.

nonpen

formula defining the model terms that must not be included in the lasso penalty term.

lambda

vector of scaling factors of the lasso penalty term.

lstep

number of lambda values to be chosen if lambda is not provided.

cv.function

function used for cross validation.

cv

results of cross validation, if available.

adaptive

logical: should the adaptive lasso be used? If TRUE, the lasso will be called twice, first in the regular mode, then adaptive to the results of the first call.

adaptcoef

inverse weights for the coefficients, used for the adaptive lasso. By default, they are obtained as the coefficients of the result of an earlier call (object$coefficients, (for lassogrp method), or such a first call is invoked (for formula method).

adaptlambda

lambda value used to extract the coefficients for adaptcoef. Either a lambda value or a negative integer, in which case the coefficients corresponding to the abs(adaptlambda)th lambda of object. If adaptlambda is null, it will be selected by cross validation.

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

penscale

rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the reference below.

center

logical. If true, the columns of the design matrix will be centered (except a possible intercept column); NA, the default, corresponds to TRUE iff there is an intercept.

standardize

logical. I f true, the design matrix will be blockwise orthonormalized such that for each block X'X = n 1 (after possible centering).

save.x

logical: should the model matrix be stored in the return value?

coef.init

numeric; initial vector of parameter estimates corresponding to lambda[1].

control

list of items to control the algorithm, see lassoControl

...

further arguments, potentially passed on to next methods.

Details

The index defines the groups of carriers and whether they are included in the L1 penalization term. There is an element in index for each carrier (column of the model.matrix). Carriers j with positive index[j] are included in the penalization. Elements sharing the same index[j] value are a group in the sense of the group lasso, that is, their coefficients will be included in the L1 term as sqrt(sum(coef^2)).

For the formula method, the grouping of the variables is derived from the type of the variables: The dummy variables of a factor will be automatically treated as a group.

The lasso optimization process starts using the largest value of lambda as penalty parameter λ. Once fitted, the next largest element of lambda is used as penalty parameter with starting values defined as the (fitted) coefficient vector based on the previous component of lambda.

Value

lasso() returns a "lassogrp" object for which coef, print, plot and predict methods exist. It has (list) components

coefficients

coefficients with respect to the original input variables (even if standardize = TRUE is used for fitting).

norms.pen

single terms of the L1 penalty term

nloglik

log likelihood

fn.val
fitted

fitted values (response type)

linear.predictors

linear predictors

lambda

vector of lambda values where coefficients were calculated.

index

grouping index vector.

...

and further components, apply names(.) or str(.) to the object.

Author(s)

Lukas Meier and Werner Stahel, stahel@stat.math.ethz.ch

References

Lukas Meier, Sara van de Geer and Peter B\"uhlmann (2008), The Group Lasso for Logistic Regression, Journal of the Royal Statistical Society, 70 (1), 53–71;
see also http://stat.ethz.ch/~meier/logistic-grouplasso.php

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
## Lasso for asphalt example
data(asphalt)
rr <- lasso(log10(RUT) ~ log10(VISC) + ASPH+BASE+FINES+VOIDS+RUN,
            data = asphalt)
rr
names(rr)

## Use the Logistic Group Lasso on the splice data set
data(splice)

## Define a list with the contrasts of the factors
contr <- rep(list("contr.sum"), ncol(splice) - 1)
names(contr) <- names(splice)[-1]

## Fit a logistic model
fit.splice <-
  lasso(y ~ ., data = splice, model = "binomial", lambda = 20,
  contrasts = contr)
fit.splice

## a potentially large model: all two-way interactions
fit.spl.2 <-
  lasso(y ~ .^2, data = splice, model = "binomial", lambda = 20,
  contrasts = contr)
## However, it's "the same" (since lambda is the same):
c1 <- coef(fit.splice)
c2 <- coef(fit.spl.2)
stopifnot(all.equal(c1, c2[rownames(c1), ,drop=FALSE]),
          c2[ !(rownames(c2) %in% rownames(c1)) ,] == 0)

## less to write for the same model, using update():%- we test update():
f..spl.2 <- update(fit.splice, ~ .^2)


## Perform the Logistic Group Lasso on a random dataset
set.seed(100) # {a "good choice" ..}
n <- 50  ## observations
p <- 4   ## variables

## First variable (intercept) not penalized, two groups having 2 degrees
## of freedom each :
index <- c(0, 2, 2, 3, 3)

## Create a random design matrix, including the intercept (first column)
x <- cbind("Intercept" = 1,
            matrix(rnorm(p * n), nrow = n,
                   dimnames=list(NULL, paste("X", 1:4, sep = ""))))

truec <- c(0, 2.1, -1.8, 0, 0)
prob <- 1 / (1 + exp(-x %*% truec))
mean(pmin(prob, 1 - prob)) ## Bayes risk
y <- rbinom(n, size = 1, prob = prob) ## binary response vector

## Use a multiplicative grid for the penalty parameter lambda, starting
## at the maximal lambda value
l.max <- lambdamax(x, y = y, index = index, penscale = sqrt, model = LogReg())
l.max # 11.15947
lambda <- l.max * 0.5^(0:5)

## Fit the solution path on the lambda grid
fit <- lasso(x, y = y, index = index, lambda = lambda, model = "binomial",
                control = lassoControl(trace = 1))

## Plot coefficient paths
plot(fit)