lasso: Fit a (Generalized Linear) Model with (Adaptive) (Group)...
In lassogrp: Lasso Regression Including Group Lasso and Adaptive Lasso

Description Usage Arguments Details Value Author(s) References Examples

The lasso() methods fit a (generalized) linear model by the (group-) lasso and include an adaptive option. The typically recommended usage is formula method.

lassoGrpFit is the lower level fitting function typically not called by the user.

lasso(x, ...)

## S3 method for class 'formula'
lasso(formula, data, subset, weights, na.action, offset, nonpen = ~ 1,
      model="gaussian", lambda=NULL, steps = 21, adaptive = FALSE,
      cv.function = cv.lasso, penscale = sqrt,
      cv=NULL, adaptcoef = NULL, adaptlambda = NULL,
      contrasts = NULL, save.x = TRUE,
      center=NA, standardize = TRUE,
      control = lassoControl(), ...)

## Default S3 method:
lasso(x, y, index, subset, weights = rep(1, length(y)), model='gaussian',
      lambda=NULL, steps = 21,
      adaptive = FALSE, cv.function = cv.lasso, cv=NULL,
      adaptcoef = NULL, adaptlambda = NULL, penscale = sqrt,
      center=NA, standardize = TRUE,
      save.x = TRUE, control = lassoControl(), ...)


## S3 method for class 'lassogrp'
lasso(x, lambda=NULL, steps=21,
      cv = NULL, cv.function = cv.lasso,
      adaptcoef = NULL, adaptlambda = NULL,
      penscale = sqrt, weights = NULL, center = NA, 
      save.x = TRUE, control = lassoControl(), ...)

lassoGrpFit(x, y, index, subset, weights = rep(1, length(y)), model="gaussian",
           offset = rep(0, length(y)), lambda = NULL, steps = 21,
           coef.init = rep(0, ncol(x)), penscale = sqrt,
           center = NA, standardize = TRUE,
           save.x = NULL, control = lassoControl(), ...)

`x`	for `default` method: design matrix, including column for intercept for `lassogrp` method: a 'lassogrp' object i.e., the result of a first call to `lasso`
`formula`	model `formula` for the penalized terms, (see `nonpen` for the non-penalized terms).
`data`	`data.frame` containing the variables in the model.
`y`	response variable (for `default` method)
`index`	a vector indicating which carriers should be penalized by the L1 term. This is usually obtained from calling `lassogrpModelmatrix`. See Details. (For `default` method)
`subset, weights, na.action`	as in other model fitting functions
`model`	type of model to be fitted: Either one of "gaussian" ordinary linear model, "binomial" logistic regression, "poisson" Poisson regression, or a model generated by `lassoModel`
`offset`	vector of offset values; needs to have the same length as the response vector. May be useful in logistic and Poisson regression.
`nonpen`	`formula` defining the model terms that must not be included in the lasso penalty term.
`lambda`	vector of scaling factors of the lasso penalty term.
`steps`	number of lambda values to be chosen if `lambda` is not provided.
`cv.function`	`function` used for cross validation.
`cv`	results of cross validation, if available.
`adaptive`	`logical`: should the adaptive lasso be used? If `TRUE`, the lasso will be called twice, first in the regular mode, then adaptive to the results of the first call.
`adaptcoef`	inverse weights for the coefficients, used for the adaptive lasso. By default, they are obtained as the coefficients of the result of an earlier call (`object$coefficients`, (for `lassogrp` method), or such a first call is invoked (for `formula` method).
`adaptlambda`	lambda value used to extract the coefficients for `adaptcoef`. Either a lambda value or a negative integer, in which case the coefficients corresponding to the `abs(adaptlambda)`th lambda of `object`. If `adaptlambda` is `NULL` as by default, it will be selected by cross validation.
`contrasts`	an optional `list`. See the `contrasts.arg` of `model.matrix.default`.
`penscale`	rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the reference below.
`center`	logical. If true, the columns of the design matrix will be centered (except a possible intercept column); `NA`, the default, corresponds to `TRUE` iff there is an intercept.
`standardize`	logical. If true, the design matrix will be blockwise orthonormalized such that for each block X'X = n 1 (after possible centering).
`save.x`	logical: should the model matrix be stored in the return value?
`coef.init`	numeric; initial vector of parameter estimates corresponding to `lambda[1]`.
`control`	list of items to control the algorithm, see `lassoControl`

...

further arguments, potentially passed on to next methods.

The index defines the groups of carriers and whether they are included in the L1 penalization term. There is an element in index for each carrier (column of the model.matrix). Carriers j with positive index[j] are included in the penalization. Elements sharing the same index[j] value are a group in the sense of the group lasso, that is, their coefficients will be included in the L1 term as sqrt(sum(coef^2)).

For the formula method, the grouping of the variables is derived from the type of the variables: The dummy variables of a factor will be automatically treated as a group.

The lasso optimization process starts using the largest value of lambda as penalty parameter λ. Once fitted, the next largest element of lambda is used as penalty parameter with starting values defined as the (fitted) coefficient vector based on the previous component of lambda.

lasso() returns a "lassogrp" object for which coef, print, plot and predict methods exist. It has (list) components

`coefficients`	coefficients with respect to the original input variables (even if `standardize = TRUE` is used for fitting).
`norms.pen`	single terms of the L1 penalty term
`nloglik`	log likelihood
`fn.val`
`fitted`	fitted values (response type)
`linear.predictors`	linear predictors
`lambda`	vector of lambda values where coefficients were calculated.
`index`	grouping index vector.
`...`	and further components, apply `names(.)` or `str(.)` to the object.

Lukas Meier and Werner Stahel, stahel@stat.math.ethz.ch

Lukas Meier, Sara van de Geer and Peter B\"uhlmann (2008), The Group Lasso for Logistic Regression, Journal of the Royal Statistical Society, 70 (1), 53–71;

## Lasso for asphalt example
data(asphalt)
rr <- lasso(log10(RUT) ~ log10(VISC) + ASPH+BASE+FINES+VOIDS+RUN,
            data = asphalt)
rr
names(rr)
## ------------------
## Use the Logistic Group Lasso on the splice data set
data(splice)
## ------------------
## Define a list with the contrasts of the factors
contr <- rep(list("contr.sum"), ncol(splice) - 1)
names(contr) <- names(splice)[-1]
## ------------------
## Fit a logistic model
fit.splice <-
  lasso(y ~ ., data = splice, model = "binomial", lambda = 20,
  contrasts = contr)
fit.splice
## ------------------
## a potentially large model: all two-way interactions
fit.spl.2 <-
  lasso(y ~ .^2, data = splice, model = "binomial", lambda = 20,
  contrasts = contr)
## However, it's "the same" (since lambda is the same):
c1 <- coef(fit.splice)
c2 <- coef(fit.spl.2)
stopifnot(all.equal(c1, c2[rownames(c1), ,drop=FALSE]),
          c2[ !(rownames(c2) %in% rownames(c1)) ,] == 0)

## less to write for the same model, using update():%- we test update():
f..spl.2 <- update(fit.splice, ~ .^2)

## -------------------------------------------------------------------------
## Perform the Logistic Group Lasso on a random dataset
set.seed(100) # {a "good choice" ..}
n <- 50  ## observations
p <- 4   ## variables

## First variable (intercept) not penalized, two groups having 2 degrees
## of freedom each :
index <- c(0, 2, 2, 3, 3)

## Create a random design matrix, including the intercept (first column)
x <- cbind("Intercept" = 1,
            matrix(rnorm(p * n), nrow = n,
                   dimnames=list(NULL, paste("X", 1:4, sep = ""))))

truec <- c(0, 2.1, -1.8, 0, 0)
prob <- 1 / (1 + exp(-x %*% truec))
mean(pmin(prob, 1 - prob)) ## Bayes risk
y <- rbinom(n, size = 1, prob = prob) ## binary response vector

## Use a multiplicative grid for the penalty parameter lambda, starting
## at the maximal lambda value
l.max <- lambdamax(x, y = y, index = index, penscale = sqrt, model = LogReg())
l.max # 11.15947
lambda <- l.max * 0.5^(0:5)

## Fit the solution path on the lambda grid
fit <- lasso(x, y = y, index = index, lambda = lambda, model = "binomial",
                control = lassoControl(trace = 1))

## Plot coefficient paths
plot(fit)