subsampling: Multinomial sparse group lasso generic subsampling procedure
In msgl: Multinomial Sparse Group Lasso

Multinomial sparse group lasso generic subsampling procedure using multiple processors.

subsampling(x, classes, sampleWeights = NULL, grouping = NULL,
  groupWeights = NULL, parameterWeights = NULL, alpha = 0.5,
  standardize = TRUE, lambda, d = 100, training, test,
  intercept = TRUE, sparse.data = is(x, "sparseMatrix"),
  collapse = FALSE, max.threads = NULL, use_parallel = FALSE,
  algorithm.config = msgl.standard.config)

`x`	design matrix, a matrix of size N × p.
`classes`	classes, factor of length N.
`sampleWeights`	sample weights, a vector of length N.
`grouping`	grouping of features (covariates), a vector of length p. Each element of the vector specifies the group of the corresponding feature.
`groupWeights`	the group weights, a vector of length m (the number of groups). If `groupWeights = NULL`, default weights will be used. Default weights are 0 for the intercept and √(K · number of features in the group) for all other groups.
`parameterWeights`	a matrix of size K × p. If `parameterWeights = NULL`, default weights will be used. Default weights are 0 for the intercept weights and 1 for all other weights.
`alpha`	the α value: 0 for group lasso, 1 for lasso; a value between 0 and 1 gives a sparse group lasso penalty.
`standardize`	if `TRUE`, the features are standardized before fitting the model. The model parameters are returned in the original scale.
`lambda`	either lambda.min relative to lambda.max (a single value), or the lambda sequence for the regularization path (a vector, or a list of vectors with one lambda sequence per subsample).
`d`	length of the lambda sequence (ignored if `length(lambda) > 1`).
`training`	a list of training samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the training samples for the corresponding subsample. The length of the list must equal the length of the `test` list.
`test`	a list of test samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the test samples for the corresponding subsample. The length of the list must equal the length of the `training` list.
`intercept`	whether the model should include intercept parameters.
`sparse.data`	if `TRUE`, `x` will be treated as sparse. If `x` is a sparse matrix, it is treated as sparse by default.
`collapse`	if `TRUE`, the results for each subsample will be collapsed into one result (this is useful if the subsamples are not overlapping).
`max.threads`	Deprecated (will be removed in 2018); instead use `use_parallel = TRUE` and register a parallel backend (see package 'doParallel'). The maximal number of threads to be used.
`use_parallel`	if `TRUE`, the `foreach` loop will use `%dopar%`. The user must register the parallel backend.
`algorithm.config`	the algorithm configuration to be used.
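
The `training` and `test` arguments are paired lists of index vectors, one pair per subsample. As a minimal sketch in base R, random subsamples could be drawn as follows. The helper `make_subsamples` and its parameters `n_sub`, `test_frac` and `seed` are hypothetical and not part of msgl.

```r
# Hypothetical helper (not part of msgl): draw `n_sub` random subsamples,
# holding out a fraction `test_frac` of the n samples as the test set each time.
make_subsamples <- function(n, n_sub = 10, test_frac = 0.2, seed = 1) {
  set.seed(seed)
  n_test <- max(1, floor(test_frac * n))
  test <- lapply(seq_len(n_sub), function(i) sort(sample.int(n, n_test)))
  # The training indices are the samples not held out for testing
  training <- lapply(test, function(s) setdiff(seq_len(n), s))
  list(training = training, test = test)
}

idx <- make_subsamples(n = 100, n_sub = 5)

# Both lists have one entry per subsample, as the arguments require
length(idx$training) == length(idx$test)
```

The two lists could then be passed as `training = idx$training, test = idx$test`. Because randomly drawn test sets may overlap, leaving `collapse = FALSE` is probably the safer choice in this setting.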

`link`	the linear predictors – a list of length `length(test)`, where each element is itself a list of length `length(lambda)` (one item per lambda value) and each item is a matrix of size K × N containing the linear predictors.
`response`	the estimated probabilities – a list of length `length(test)`, where each element is itself a list of length `length(lambda)` (one item per lambda value) and each item is a matrix of size K × N containing the probabilities.
`classes`	the estimated classes – a list of length `length(test)`, where each element is a matrix of size N × d with d = `length(lambda)`.
`features`	number of features used in the models.
`parameters`	number of parameters used in the models.
`classes.true`	a list of length `length(training)`, containing the true classes used for estimation
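
To illustrate the layout of the returned `classes` and `classes.true` elements, the sketch below builds a small mock list with made-up values and computes a misclassification rate per lambda value. The object `mock` is hypothetical, and the assumption that predicted classes are stored as character labels may not match what msgl actually returns. It is also not necessarily how `Err` computes its result.

```r
# Mock object mimicking the documented layout (all values are made up):
# 2 subsamples, 4 test samples each (rows), 3 lambda values (columns)
mock <- list(
  classes = list(
    matrix(c("a","a","a","a",  "a","b","a","a",  "a","b","a","b"), nrow = 4),
    matrix(c("b","b","b","b",  "b","b","a","b",  "b","a","a","a"), nrow = 4)
  ),
  classes.true = list(
    factor(c("a","b","a","b")),
    factor(c("b","b","a","a"))
  )
)

# For each subsample, compare every column (one per lambda) with the true
# classes; the result is a (lambda x subsample) matrix of error rates
err <- sapply(seq_along(mock$classes), function(i) {
  colMeans(mock$classes[[i]] != as.character(mock$classes.true[[i]]))
})

# Average over subsamples to get one error rate per lambda value
mean_err <- rowMeans(err)
mean_err
```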

Martin Vincent

data(SimData)

# A quick look at the data
dim(x)
table(classes)

# Define two subsamples: test on samples 1-20 and 21-40, train on the rest
test <- list(1:20, 21:40)
train <- lapply(test, function(s) seq_along(classes)[-s])

# Run subsampling
# Using a lambda sequence ranging from the maximal lambda to 0.5 * maximal lambda
fit.sub <- msgl::subsampling(x, classes, alpha = 0.5, lambda = 0.5, training = train, test = test)

# Print some information
fit.sub

# Mean misclassification error of the tests
Err(fit.sub)

# Negative log likelihood error
Err(fit.sub, type="loglike")
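
The example above runs sequentially. As a hedged sketch of the `use_parallel = TRUE` route described under Arguments, a parallel backend could be registered with doParallel before the call. The worker count of 2 and the names `cl`, `fit.par` and `ran_parallel` are arbitrary choices for illustration, and the block skips the fit when msgl or doParallel is not installed.

```r
# Sketch only: runs the fit if msgl and doParallel are installed, otherwise skips it
ran_parallel <- FALSE
if (requireNamespace("msgl", quietly = TRUE) &&
    requireNamespace("doParallel", quietly = TRUE)) {
  data(SimData, package = "msgl")   # loads x and classes, as in the example above
  test <- list(1:20, 21:40)
  train <- lapply(test, function(s) seq_along(classes)[-s])

  cl <- parallel::makeCluster(2)    # 2 workers is an arbitrary choice
  doParallel::registerDoParallel(cl)

  # Same call as the sequential example, with the foreach loop using %dopar%
  fit.par <- msgl::subsampling(x, classes, alpha = 0.5, lambda = 0.5,
                               training = train, test = test,
                               use_parallel = TRUE)

  parallel::stopCluster(cl)         # release the workers when done
  ran_parallel <- TRUE
}
```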