# cv: Cross Validation In nielsrhansen/msgl: Multinomial Sparse Group Lasso

## Description

Multinomial sparse group lasso cross validation, with or without parallel backend.

## Usage

    cv(x, classes, sampleWeights = NULL, grouping = NULL, groupWeights = NULL,
       parameterWeights = NULL, alpha = 0.5, standardize = TRUE, lambda, d = 100,
       fold = 10L, cv.indices = list(), intercept = TRUE,
       sparse.data = is(x, "sparseMatrix"), max.threads = NULL,
       use_parallel = FALSE, algorithm.config = msgl.standard.config)

## Arguments

| Argument | Description |
|---|---|
| `x` | Design matrix of size $N \times p$. |
| `classes` | Classes, a factor of length $N$. |
| `sampleWeights` | Sample weights, a vector of length $N$. |
| `grouping` | Grouping of features (covariates), a vector of length $p$. Each element specifies the group of the corresponding feature. |
| `groupWeights` | Group weights, a vector of length $m$ (the number of groups). If `groupWeights = NULL`, default weights are used: 0 for the intercept and $\sqrt{K \cdot \text{number of features in the group}}$ for all other weights. |
| `parameterWeights` | Parameter weights, a matrix of size $K \times p$. If `parameterWeights = NULL`, default weights are used: 0 for the intercept weights and 1 for all other weights. |
| `alpha` | The $\alpha$ value: 0 gives group lasso, 1 gives lasso, and a value between 0 and 1 gives a sparse group lasso penalty. |
| `standardize` | If `TRUE`, the features are standardized before fitting the model. The model parameters are returned on the original scale. |
| `lambda` | Either `lambda.min` relative to `lambda.max`, or the lambda sequence for the regularization path. |
| `d` | Length of the lambda sequence (ignored if `length(lambda) > 1`). |
| `fold` | The fold of the cross validation, an integer larger than 1 and less than $N + 1$. Ignored if `cv.indices != NULL`. If `fold` ≤ `max(table(classes))`, the data are split into `fold` disjoint subsets keeping the ratio of classes approximately equal. Otherwise the data are split into `fold` disjoint subsets without keeping the ratio fixed. |
| `cv.indices` | A list of indices of a cross validation splitting. If `cv.indices = NULL`, a random splitting is generated using the `fold` argument. |
| `intercept` | Should the model include intercept parameters? |
| `sparse.data` | If `TRUE`, `x` is treated as sparse. If `x` is a sparse matrix, it is treated as sparse by default. |
| `max.threads` | Deprecated (will be removed in 2018); instead use `use_parallel = TRUE` and register a parallel backend (see package 'doParallel'). The maximal number of threads to be used. |
| `use_parallel` | If `TRUE`, the `foreach` loop will use `%dopar%`. The user must register the parallel backend. |
| `algorithm.config` | The algorithm configuration to be used. |
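
The class-balanced splitting described for `fold` can be illustrated in base R. This is a minimal sketch, not msgl's internal routine: `stratified_folds` is a hypothetical helper, the toy `classes` factor is made up, and the layout of the result (one integer vector of sample indices per fold) is an assumption about what `cv.indices` accepts.

```r
# Hypothetical helper (not part of msgl): split sample indices into `fold`
# disjoint subsets while keeping the class ratio approximately equal.
stratified_folds <- function(classes, fold) {
  classes <- as.factor(classes)
  fold_id <- integer(length(classes))
  for (lev in levels(classes)) {
    idx <- which(classes == lev)
    # Shuffle the samples of this class, then deal them out round-robin
    idx <- idx[sample.int(length(idx))]
    fold_id[idx] <- rep_len(seq_len(fold), length(idx))
  }
  # One integer vector of sample indices per fold
  unname(split(seq_along(classes), factor(fold_id, levels = seq_len(fold))))
}

set.seed(1)
# Toy data: 30, 20 and 10 samples in three classes
classes <- factor(rep(c("a", "b", "c"), times = c(30, 20, 10)))
cv.indices <- stratified_folds(classes, fold = 5L)

# Class counts per fold (rows are classes, columns are folds)
sapply(cv.indices, function(i) table(classes[i]))
```

If the assumed layout matches, a list like this could be passed as `cv.indices` to reuse the same splitting across several runs, for example when comparing different `alpha` values.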

## Value

| Value | Description |
|---|---|
| `link` | The linear predictors: a list of length `length(lambda)`, one item for each lambda value, with each item a matrix of size $K \times N$ containing the linear predictors. |
| `response` | The estimated probabilities: a list of length `length(lambda)`, one item for each lambda value, with each item a matrix of size $K \times N$ containing the probabilities. |
| `classes` | The estimated classes: a matrix of size $N \times d$ with `d = length(lambda)`. |
| `cv.indices` | The cross validation splitting used. |
| `features` | Number of features used in the models. |
| `parameters` | Number of parameters used in the models. |
| `classes.true` | The true classes used for estimation; this is equal to the `classes` argument. |
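
To make the shape of the `classes` and `classes.true` components concrete, the sketch below computes a misclassification rate per lambda value by hand. It uses made-up stand-in data rather than a real fit, and it assumes the predicted classes are stored as labels comparable to the true classes; the package's own `Err` function is the supported way to obtain this quantity and may differ in detail.

```r
# Toy stand-ins for fit.cv$classes.true (length N) and fit.cv$classes (N x d)
classes.true <- factor(c("a", "a", "b", "b", "c", "c"))
classes.hat <- cbind(
  c("a", "a", "a", "a", "a", "a"),
  c("a", "a", "b", "b", "a", "c"),
  c("a", "b", "b", "b", "c", "c")
)

# Fraction of samples whose predicted class differs from the true class,
# computed column by column (one column per lambda value)
misclassification <- unname(colMeans(classes.hat != as.character(classes.true)))
misclassification
```

Each entry of `misclassification` corresponds to one column of the `classes` matrix, that is, to one value in the lambda sequence.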

## Author(s)

Martin Vincent

## Examples

    library(msgl)
    library(doParallel)  # provides registerDoParallel(); also attaches 'parallel'

    data(SimData)

    # A quick look at the data
    dim(x)
    table(classes)

    # Set up a cluster with 2 workers and register it as the parallel backend
    cl <- makeCluster(2)
    registerDoParallel(cl)

    # Run cross validation on the 2 workers,
    # using a lambda sequence ranging from the maximal lambda to 0.7 * maximal lambda
    fit.cv <- msgl::cv(x, classes, alpha = 0.5, lambda = 0.7, use_parallel = TRUE)

    # Stop the cluster
    stopCluster(cl)

    # Print some information
    fit.cv

    # Cross validation errors (estimated expected generalization error)
    # Misclassification rate
    Err(fit.cv)

    # Negative log likelihood error
    Err(fit.cv, type = "loglike")

nielsrhansen/msgl documentation built on May 28, 2019, 11:05 a.m.