grplasso: Function to Fit a Solution of a Group Lasso Problem
In grplasso: Fitting User-Specified Models with Group Lasso Penalty


Fits the solution of a group lasso problem for a model of type grpl.model.

grplasso(x, ...)

## S3 method for class 'formula'
grplasso(formula, nonpen = ~ 1, data, weights,
         subset, na.action, lambda, coef.init, penscale = sqrt,
         model = LogReg(), center = TRUE, standardize = TRUE,
         control = grpl.control(), contrasts = NULL, ...)

## Default S3 method:
grplasso(x, y, index, weights = rep(1, length(y)), offset = rep(0,
         length(y)), lambda, coef.init = rep(0, ncol(x)),
         penscale = sqrt, model = LogReg(), center = TRUE,
         standardize = TRUE, control = grpl.control(), ...)

`x`	design matrix (including intercept)
`y`	response vector
`formula`	`formula` of the penalized variables. The response has to be on the left hand side of `~`.
`nonpen`	`formula` of the nonpenalized variables. This will be added to the `formula` argument above and doesn't need to have the response on the left hand side.
`data`	`data.frame` containing the variables in the model.
`index`	vector which defines the grouping of the variables. Components sharing the same number build a group. Non-penalized coefficients are marked with `NA`.
`weights`	vector of observation weights.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	a function which indicates what should happen when the data contain 'NA's.
`offset`	vector of offset values; needs to have the same length as the response vector.
`lambda`	vector of penalty parameters. Optimization starts with the first component. See details below.
`coef.init`	initial vector of parameter estimates corresponding to the first component in the vector `lambda`.
`penscale`	rescaling function to adjust the value of the penalty parameter to the degrees of freedom of the parameter group. See the reference below.
`model`	an object of class `grpl.model` implementing the negative log-likelihood, gradient, hessian etc. See the documentation of `grpl.model` for more details.
`center`	logical. If true, the columns of the design matrix will be centered (except a possible intercept column).
`standardize`	logical. If true, the design matrix will be blockwise orthonormalized such that for each block X^T X = n I, where I is the identity matrix (after possible centering).
`control`	options for the fitting algorithm, see `grpl.control`.
`contrasts`	an optional list. See the 'contrasts.arg' of 'model.matrix.default'.
`...`	additional arguments to be passed to the functions defined in `model`.
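
The blockwise orthonormalization described for `standardize` can be illustrated in base R. The following is a minimal sketch for a single group of three columns, assuming a QR decomposition is an acceptable way to orthonormalize the block; the package's internal implementation may differ, and the names `X`, `Xc`, `Q` and `gram` are chosen here for illustration only.

```r
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 3), nrow = n)   # one group (block) of three columns

## Center the columns, as center = TRUE would do
Xc <- scale(X, center = TRUE, scale = FALSE)

## Orthonormalize the block via QR and rescale by sqrt(n)
## (assumption: any orthonormal basis of the block's column space will do)
Q <- qr.Q(qr(Xc)) * sqrt(n)

## The cross-product is n times the identity matrix, up to rounding error
gram <- crossprod(Q)
print(round(gram, 10))
```

After this transformation the columns of `Q` span the same space as the centered block, so the group as a whole carries the same information while satisfying X^T X = n I.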

When using grplasso.formula, the grouping of the variables is derived from the type of the variables: The dummy variables of a factor will be automatically treated as a group.
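
To see how dummy variables of a factor naturally form a group, base R's `model.matrix` records which term each design column belongs to in its `"assign"` attribute. The sketch below only shows that base R mechanism; that `grplasso.formula` derives its grouping in exactly this way is an assumption, and the data frame `d` is made up for illustration.

```r
## Toy data: one factor with three levels and one numeric covariate
d <- data.frame(
  f = factor(c("a", "b", "c", "a", "b", "c")),
  z = c(0.5, 1.2, -0.3, 2.0, 0.1, -1.4)
)

mm  <- model.matrix(~ f + z, data = d)
grp <- attr(mm, "assign")   # term number of each column, 0 = intercept

print(colnames(mm))
print(grp)

## Translate into the index convention of the default method:
## NA marks the non-penalized intercept, equal numbers form a group
index <- ifelse(grp == 0, NA, grp)
print(index)
```

The two dummy columns of `f` share the same term number and would therefore be penalized jointly, while `z` forms a group of its own.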

Optimization starts with the first component of lambda as the penalty parameter λ and with coef.init as starting values for the parameter vector. Once that fit is complete, the next component of lambda is used as the penalty parameter, with the coefficient vector fitted for the previous component of lambda as starting values (a warm start).
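
The warm-start scheme along the lambda vector can be sketched in base R. The example below is an illustration under stated assumptions: it uses squared-error loss and a plain proximal gradient solver, which is swapped in for brevity and is not meant to reproduce the package's fitting routine. The helpers `group_shrink` and `fit_one` are hypothetical names, not part of grplasso.

```r
set.seed(1)
n <- 40
X <- matrix(rnorm(n * 4), nrow = n)
y <- drop(X %*% c(2, -1.5, 0, 0) + rnorm(n))
index <- c(1, 1, 2, 2)   # two penalized groups of size two

## Group-wise soft-thresholding; sqrt(group size) mimics penscale = sqrt
group_shrink <- function(b, index, thresh) {
  for (g in unique(index)) {
    sel <- index == g
    nrm <- sqrt(sum(b[sel]^2))
    t_g <- thresh * sqrt(sum(sel))
    b[sel] <- if (nrm > t_g) b[sel] * (1 - t_g / nrm) else 0
  }
  b
}

## Hypothetical solver for a single lambda (proximal gradient, squared loss)
fit_one <- function(X, y, index, lambda, start, iter = 500) {
  step <- 1 / max(eigen(crossprod(X), symmetric = TRUE,
                        only.values = TRUE)$values)
  b <- start
  for (i in seq_len(iter)) {
    grad <- -drop(crossprod(X, y - X %*% b))
    b <- group_shrink(b - step * grad, index, step * lambda)
  }
  b
}

## Smallest lambda for which all penalized groups are zero (squared loss)
lambda_max <- max(sapply(unique(index), function(g) {
  sel <- index == g
  sqrt(sum(crossprod(X[, sel], y)^2)) / sqrt(sum(sel))
}))
lambda <- lambda_max * 0.5^(0:4)

## Walk along lambda, reusing the previous solution as starting value
path  <- matrix(NA_real_, nrow = ncol(X), ncol = length(lambda))
start <- rep(0, ncol(X))            # plays the role of coef.init
for (k in seq_along(lambda)) {
  path[, k] <- fit_one(X, y, index, lambda[k], start)
  start <- path[, k]                # warm start for the next lambda
}
print(round(path, 3))
```

Ordering lambda from large to small, as in this sketch, means each fit starts close to its solution, which is the usual motivation for warm starts on a penalty grid.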

A grplasso object is returned, for which coef, print, plot and predict methods exist.

`coefficients`	coefficients with respect to the original input variables (even if `standardize = TRUE` is used for fitting).
`lambda`	vector of lambda values where coefficients were calculated.
`index`	grouping index vector.

Lukas Meier, meier@stat.math.ethz.ch

Lukas Meier, Sara van de Geer and Peter Bühlmann (2008), The Group Lasso for Logistic Regression, Journal of the Royal Statistical Society, Series B, 70 (1), 53-71

## Use the Logistic Group Lasso on the splice data set
data(splice)

## Define a list with the contrasts of the factors
contr <- rep(list("contr.sum"), ncol(splice) - 1)
names(contr) <- names(splice)[-1]

## Fit a logistic model 
fit.splice <- grplasso(y ~ ., data = splice, model = LogReg(), lambda = 20,
                       contrasts = contr, center = TRUE, standardize = TRUE)

## Perform the Logistic Group Lasso on a random dataset
set.seed(79)

n <- 50  ## observations
p <- 4   ## variables

## First variable (intercept) not penalized, two groups having 2 degrees
## of freedom each

index <- c(NA, 2, 2, 3, 3)

## Create a random design matrix, including the intercept (first column)
x <- cbind(1, matrix(rnorm(p * n), nrow = n))
colnames(x) <- c("Intercept", paste("X", 1:4, sep = ""))

par <- c(0, 2.1, -1.8, 0, 0)
prob <- 1 / (1 + exp(-x %*% par))
mean(pmin(prob, 1 - prob)) ## Bayes risk
y <- rbinom(n, size = 1, prob = prob) ## binary response vector

## Use a multiplicative grid for the penalty parameter lambda, starting
## at the maximal lambda value
lambda <- lambdamax(x, y = y, index = index, penscale = sqrt,
                    model = LogReg()) * 0.5^(0:5)

## Fit the solution path on the lambda grid
fit <- grplasso(x, y = y, index = index, lambda = lambda, model = LogReg(),
                penscale = sqrt,
                control = grpl.control(update.hess = "lambda", trace = 0))

## Plot coefficient paths
plot(fit)

## Output of the examples above:
## Lambda: 20  nr.var: 16
## There were 50 or more warnings (use warnings() to see the first 50)
## [1] 0.1265707
## There were 50 or more warnings (use warnings() to see the first 50)