fit: Fit a multinomial sparse group lasso regularization path.


View source: R/fit.R

Description

Fit a sequence of multinomial logistic regression models using sparse group lasso, group lasso or lasso. In addition to the standard parameter grouping the algorithm supports further grouping of the features.

Usage

fit(x, classes, sampleWeights = NULL, grouping = NULL,
  groupWeights = NULL, parameterWeights = NULL, alpha = 0.5,
  standardize = TRUE, lambda, d = 100, return_indices = NULL,
  intercept = TRUE, sparse.data = is(x, "sparseMatrix"),
  algorithm.config = msgl.standard.config)

Arguments

x

design matrix, a matrix of size N × p.

classes

classes, factor of length N.

sampleWeights

sample weights, a vector of length N.

grouping

grouping of features, a vector of length p. Each element of the vector specifies the group of the corresponding feature.

groupWeights

the group weights, a vector of length m (the number of groups). If groupWeights = NULL default weights will be used. The default weight is 0 for the intercept and

√(K · number of features in the group)

for all other groups.
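As a sketch of the default weighting scheme (the variable names below are illustrative, not part of msgl's API), the non-intercept group weights can be computed from a grouping vector like this:

```r
# Illustrative computation of the default (non-intercept) group weights:
# sqrt(K * number of features in the group).
K <- 3                                          # number of classes
grouping <- factor(c("a", "a", "b", "b", "b"))  # 5 features in 2 groups

group.sizes <- as.numeric(table(grouping))      # features per group: 2 and 3
default.weights <- sqrt(K * group.sizes)
default.weights                                 # sqrt(3*2) and sqrt(3*3)
```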

parameterWeights

a matrix of size K × p. If parameterWeights = NULL default weights will be used. Default weights are 0 for the intercept weights and 1 for all other weights.

alpha

the α value: 0 gives group lasso, 1 gives lasso, and values between 0 and 1 give a sparse group lasso penalty.

standardize

if TRUE the features are standardized before fitting the model. The model parameters are returned on the original scale.

lambda

lambda.min relative to lambda.max or the lambda sequence for the regularization path.

d

length of the lambda sequence (ignored if length(lambda) > 1)
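For intuition, a relative lambda of 0.5 with d = 5 corresponds to a decreasing path from lambda.max down to 0.5 * lambda.max. The sketch below builds such a path on a log scale (the log spacing is an assumption for illustration, not a statement about msgl's internals):

```r
# Hypothetical lambda path of length d, from lambda.max down to
# lambda.min = 0.5 * lambda.max, log-spaced (spacing is an assumption).
lambda.max <- 1
lambda.min <- 0.5 * lambda.max
d <- 5
lambda.seq <- exp(seq(log(lambda.max), log(lambda.min), length.out = d))
round(lambda.seq, 3)
```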

return_indices

the indices of the lambda values for which to return the fitted parameters.

intercept

should the fitted model include intercept parameters (note that due to standardization the returned beta matrix will always have an intercept column)

sparse.data

if TRUE, x will be treated as sparse; if x is a sparse matrix it will be treated as sparse by default.

algorithm.config

the algorithm configuration to be used.

Details

Consider a classification problem with K classes and p features (covariates) divided into m groups. This function computes a sequence of minimizers (one for each lambda given in the lambda argument) of

\hat R(β) + λ \left( (1-α) ∑_{J=1}^m γ_J \|β^{(J)}\|_2 + α ∑_{i=1}^{n} ξ_i |β_i| \right)

where \hat R is the weighted empirical log-likelihood risk of the multinomial regression model. The vector β^{(J)} denotes the parameters associated with the J'th group of features (default is one covariate per group, hence the default dimension of β^{(J)} is K). The group weights γ \in [0,∞)^m and parameter weights ξ \in [0,∞)^n may be explicitly specified.
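The penalty term above can be evaluated directly for a given coefficient matrix. The sketch below assumes one group per covariate column (the default grouping) and uses illustrative names; it is not msgl's internal implementation:

```r
# Sketch: the sparse group lasso penalty for a K x p coefficient matrix,
# assuming one group per covariate column and explicit group (gamma) and
# parameter (xi) weights. Names are hypothetical, for illustration only.
sgl.penalty <- function(beta, lambda, alpha,
                        gamma = rep(1, ncol(beta)),
                        xi = matrix(1, nrow(beta), ncol(beta))) {
  group.term <- sum(gamma * sqrt(colSums(beta^2)))  # sum_J gamma_J ||beta^(J)||_2
  lasso.term <- sum(xi * abs(beta))                 # sum_i xi_i |beta_i|
  lambda * ((1 - alpha) * group.term + alpha * lasso.term)
}

beta <- matrix(c(1, -1, 0, 2), nrow = 2)  # K = 2 classes, p = 2 covariates
sgl.penalty(beta, lambda = 0.1, alpha = 0.5)
```

With alpha = 0 only the group term remains (group lasso); with alpha = 1 only the weighted lasso term remains.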

Value

beta

the fitted parameters – a list of length length(lambda), with each entry a matrix of size K × (p+1) holding the fitted parameters

loss

the values of the loss function

objective

the values of the objective function (i.e. loss + penalty)

lambda

the lambda values used

classes.true

the true classes used for estimation, this is equal to the classes argument

Author(s)

Martin Vincent

Examples

data(SimData)

# A quick look at the data
dim(x)
table(classes)
# Fit multinomial sparse group lasso regularization path
# using a lambda sequence ranging from the maximal lambda to 0.5 * maximal lambda

fit <- msgl::fit(x, classes, alpha = 0.5, lambda = 0.5)

# Print some information about the fit
fit

# Model 10, i.e. the model corresponding to lambda[10]
models(fit)[[10]]

# The nonzero features of model 10
features(fit)[[10]]

# The nonzero parameters of model 10
parameters(fit)[[10]]

# The training errors of the models.
Err(fit, x)
# Note: For high dimensional models the training errors are almost always overoptimistic;
# instead, use msgl::cv to estimate the expected errors by cross validation

msgl documentation built on Jan. 4, 2019, 5:14 p.m.