subsampling: Multinomial sparse group lasso generic subsampling procedure


Description

Multinomial sparse group lasso generic subsampling procedure using multiple processors

Usage

subsampling(x, classes, sampleWeights = NULL, grouping = NULL,
  groupWeights = NULL, parameterWeights = NULL, alpha = 0.5,
  standardize = TRUE, lambda, d = 100, training, test,
  intercept = TRUE, sparse.data = is(x, "sparseMatrix"),
  collapse = FALSE, max.threads = NULL, use_parallel = FALSE,
  algorithm.config = msgl.standard.config)

Arguments

x

design matrix, matrix of size N \times p.

classes

classes, factor of length N.

sampleWeights

sample weights, a vector of length N.

grouping

grouping of features (covariates), a vector of length p. Each element of the vector specifying the group of the feature.

groupWeights

the group weights, a vector of length m (the number of groups). If groupWeights = NULL, default weights will be used. Default weights are 0 for the intercept and \sqrt{K \cdot \textrm{number of features in the group}} for all other weights.
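
The default group weights described above can be reproduced in base R. This is a minimal sketch, not the package's internal code: the `grouping` vector and the number of classes `K` are made-up illustration values, and the intercept (whose default weight is 0) is left out.

```r
# Hypothetical illustration values: 6 features in 3 groups, K = 4 classes
grouping <- factor(c("a", "a", "a", "b", "b", "c"))
K <- 4

# Number of features in each group
group.sizes <- table(grouping)

# Default weight per group: sqrt(K * number of features in the group)
default.group.weights <- sqrt(K * as.numeric(group.sizes))
names(default.group.weights) <- names(group.sizes)

default.group.weights
```

Larger groups get larger weights, so under this default a group is not favoured merely because it contains more features.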

parameterWeights

the parameter weights, a matrix of size K \times p. If parameterWeights = NULL, default weights will be used. Default weights are 0 for the intercept weights and 1 for all other weights.

alpha

the α value: 0 for group lasso, 1 for lasso; a value between 0 and 1 gives a sparse group lasso penalty.
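
To make the role of alpha concrete, the sketch below evaluates a penalty of the form (1 - alpha) * (weighted sum of group norms) + alpha * (weighted sum of absolute parameter values). This form is an assumption based on the usual sparse group lasso formulation and on the weight descriptions above; `sgl_penalty` is a hypothetical helper and not an msgl function.

```r
# Hypothetical helper, not part of msgl.
# beta: K x p parameter matrix; grouping: length-p vector of group labels;
# groupWeights: one weight per group, in the sorted order of the group labels;
# parameterWeights: K x p matrix of weights.
sgl_penalty <- function(beta, grouping, groupWeights, parameterWeights, alpha) {
  groups <- split(seq_len(ncol(beta)), grouping)
  # Euclidean norm of each group's K x (group size) block of parameters
  group.norms <- sapply(groups, function(j) sqrt(sum(beta[, j, drop = FALSE]^2)))
  group.part <- sum(groupWeights * group.norms)
  lasso.part <- sum(parameterWeights * abs(beta))
  (1 - alpha) * group.part + alpha * lasso.part
}

# Made-up example: K = 2 classes, p = 3 features in 2 groups, unit weights
beta <- matrix(c(1, 0, 0, 2, 0, 0), nrow = 2)
grouping <- c(1, 1, 2)
gw <- c(1, 1)
pw <- matrix(1, nrow = 2, ncol = 3)

sgl_penalty(beta, grouping, gw, pw, alpha = 0)    # group lasso part only
sgl_penalty(beta, grouping, gw, pw, alpha = 1)    # lasso part only
sgl_penalty(beta, grouping, gw, pw, alpha = 0.5)  # equal mix of the two
```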

standardize

if TRUE the features are standardized before fitting the model. The model parameters are returned in the original scale.

lambda

lambda.min relative to lambda.max, or the lambda sequence for the regularization path (that is, a vector or a list of vectors with the lambda sequence for the subsamples).

d

length of lambda sequence (ignored if length(lambda) > 1)

training

a list of training samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the training samples for the corresponding subsample. The length of the list must equal the length of the test list.

test

a list of test samples, each item of the list corresponding to a subsample. Each item in the list must be a vector with the indices of the test samples for the corresponding subsample. The length of the list must equal the length of the training list.
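
One way to build matching training and test lists is sketched below in base R. The sample count `N`, the number of subsamples and the random assignment are illustrative assumptions; any scheme works as long as both lists have the same length and hold index vectors.

```r
set.seed(1)        # for a reproducible split
N <- 100           # hypothetical number of samples
n.subsamples <- 5  # hypothetical number of subsamples

# Randomly assign each sample to exactly one test set
fold <- sample(rep(seq_len(n.subsamples), length.out = N))

# Test indices per subsample, and the complementary training indices
test <- lapply(seq_len(n.subsamples), function(i) which(fold == i))
training <- lapply(test, function(s) setdiff(seq_len(N), s))

# Both lists must have one item per subsample
length(training) == length(test)
```

Because these test sets do not overlap, this is the situation in which collapse = TRUE is described as useful.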

intercept

if TRUE the model includes intercept parameters.

sparse.data

if TRUE x will be treated as sparse; if x is a sparse matrix it will be treated as sparse by default.

collapse

if TRUE the results for each subsample will be collapsed into one result (this is useful if the subsamples are not overlapping).

max.threads

Deprecated (will be removed in 2018); instead use use_parallel = TRUE and register a parallel backend (see package 'doParallel'). The maximal number of threads to be used.

use_parallel

If TRUE the foreach loop will use %dopar%. The user must register the parallel backend.
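
Registering a backend might look like the sketch below. It assumes the 'doParallel' package is installed (the code is skipped otherwise), and the worker count of 2 is arbitrary. The subsampling call is shown only as a comment because it needs data and training/test lists.

```r
# Assumption: 'doParallel' (and with it 'foreach') is installed
if (requireNamespace("doParallel", quietly = TRUE)) {
  cl <- parallel::makeCluster(2)          # start 2 worker processes
  doParallel::registerDoParallel(cl)      # register them as the %dopar% backend
  workers <- foreach::getDoParWorkers()   # number of registered workers

  # With a backend registered, subsampling can be asked to use it:
  # fit.sub <- msgl::subsampling(x, classes, alpha = 0.5, lambda = 0.5,
  #                              training = train, test = test,
  #                              use_parallel = TRUE)

  parallel::stopCluster(cl)               # release the workers
  foreach::registerDoSEQ()                # fall back to sequential execution
}
```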

algorithm.config

the algorithm configuration to be used.

Value

link

the linear predictors – a list of length length(test), each element of which is a list of length length(lambda) with one item per lambda value; each item is a matrix of size K \times N containing the linear predictors.

response

the estimated probabilities – a list of length length(test), each element of which is a list of length length(lambda) with one item per lambda value; each item is a matrix of size K \times N containing the probabilities.

classes

the estimated classes – a list of length length(test), each element of which is a matrix of size N \times d with d = length(lambda).

features

number of features used in the models.

parameters

number of parameters used in the models.

classes.true

a list of length length(training), containing the true classes used for estimation.
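
As an illustration of how such a result could be consumed, the sketch below computes a misclassification rate per lambda value from a predicted-class matrix of size N \times d and a vector of true classes. The objects `pred` and `truth` are small made-up stand-ins for one subsample's predicted and true classes, not output from a real fit.

```r
# Made-up stand-ins for one subsample: 4 test samples, d = 3 lambda values
truth <- c("a", "b", "a", "b")
pred <- cbind(c("a", "a", "a", "a"),   # lambda 1: two errors
              c("a", "b", "a", "a"),   # lambda 2: one error
              c("a", "b", "a", "b"))   # lambda 3: no errors

# Share of misclassified samples for each lambda value (one per column)
misclass <- colMeans(pred != truth)
misclass
```

For an actual fit the Err function reports such errors directly, so this is only meant to show how the list elements are laid out.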

Author(s)

Martin Vincent

Examples

data(SimData)

# A quick look at the data
dim(x)
table(classes)

test <- list(1:20, 21:40)
train <- lapply(test, function(s) seq_along(classes)[-s])

# Run subsampling
# Using a lambda sequence ranging from the maximal lambda to 0.5 * maximal lambda
fit.sub <- msgl::subsampling(x, classes, alpha = 0.5, lambda = 0.5, training = train, test = test)

# Print some information
fit.sub

# Mean misclassification error of the tests
Err(fit.sub)

# Negative log likelihood error
Err(fit.sub, type="loglike")

msgl documentation built on May 8, 2019, 9:03 a.m.