Simultaneous feature selection and parameter estimation for classification.
Suitable for high-dimensional multiclass classification with many classes.
The algorithm computes the sparse group lasso penalized maximum likelihood estimate.
Use of parallel computing for cross validation and subsampling is supported through the foreach and doParallel packages.
The development version is on GitHub. Please report package issues there.
Consider a classification problem with K classes and p features (covariates) divided into m groups.
The multinomial logistic regression estimator with sparse group lasso penalty is a sequence of minimizers (one for each lambda given in the lambda argument) of
\hat R(β) + λ \left( (1-α) ∑_{J=1}^m γ_J \|β^{(J)}\|_2 + α ∑_{i=1}^{n} ξ_i |β_i| \right)
where \hat R is the weighted empirical log-likelihood risk of the multinomial regression model. The vector β^{(J)} denotes the parameters associated with the J'th group of features (default is one covariate per group, hence the default dimension of β^{(J)} is K). The group weights γ \in [0,∞)^m and parameter weights ξ \in [0,∞)^n may be explicitly specified.
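To make the penalty term concrete, the following is a minimal sketch in base R that evaluates only the penalty part of the objective for a given parameter matrix. The function `sgl_penalty` and its argument names are hypothetical and not part of the package, and the unit weights used as defaults are an assumption made for illustration, not the package defaults.

```r
# Hypothetical helper (not part of msgl): evaluates
#   lambda * ((1 - alpha) * sum_J gamma_J * ||beta^(J)||_2 + alpha * sum_i xi_i * |beta_i|)
# beta:   K x p parameter matrix (one row per class, one column per feature)
# groups: length-p integer vector assigning each feature to a group 1..m
# gamma:  group weights (length m); unit weights assumed for illustration
# xi:     parameter weights (K x p); unit weights assumed for illustration
sgl_penalty <- function(beta, groups, lambda, alpha,
                        gamma = rep(1, max(groups)),
                        xi = matrix(1, nrow(beta), ncol(beta))) {
  # Euclidean norm of all parameters belonging to each group
  group_norms <- vapply(
    seq_len(max(groups)),
    function(J) sqrt(sum(beta[, groups == J, drop = FALSE]^2)),
    numeric(1)
  )
  group_term <- sum(gamma * group_norms)
  l1_term <- sum(xi * abs(beta))
  lambda * ((1 - alpha) * group_term + alpha * l1_term)
}

# Two classes, three features, one feature per group
beta <- matrix(c(3, 4, 0, 0, 1, -2), nrow = 2)
penalty <- sgl_penalty(beta, groups = 1:3, lambda = 0.1, alpha = 0.5)
```

With alpha = 0 only the group term contributes (group lasso), and with alpha = 1 only the weighted absolute values contribute (lasso).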
Author(s): Martin Vincent
# Load the package and the parallel backend used below
library(msgl)
library(doParallel)
# Load some data
data(PrimaryCancers)
# A quick look at the data
dim(x)
table(classes)
# A smaller subset with three classes
small <- which(classes %in% c("CCA", "CRC", "Pancreas"))
classes <- classes[small, drop = TRUE]
x <- x[small, ]
# Do cross validation using 2 parallel workers
cl <- makeCluster(2)
registerDoParallel(cl)
# Do 4-fold cross validation on a lambda sequence of length 100.
# The sequence is decreasing from the data derived lambda.max to 0.2*lambda.max
fit.cv <- msgl::cv(x, classes, fold = 4, lambda = 0.2, use_parallel = TRUE)
stopCluster(cl)
# Print information about models
# and cross validation errors (estimated expected generalization error)
fit.cv