View source: R/SosDiscRobust.R
Robust and sparse multigroup classification by the optimal scoring approach. The method is robust against outliers, provides a low-dimensional and sparse representation of the predictors, and is also applicable if the number of variables exceeds the number of observations.
formula 
A formula of the form 
data 
An optional data frame (or similar: see

subset 
An optional vector used to select rows (observations) of the
data matrix 
na.action 
A function which indicates what should happen
when the data contain 
x 
A matrix or data frame containing the explanatory variables (training set); colnames of x have to be provided. 
grouping 
Grouping variable: a factor specifying the class for each observation. 
prior 
Prior probabilities, a vector of positive numbers that sum up to 1; defaults to the class proportions in the training set. 
lambda 
A nonnegative tuning parameter for L1 norm penalty introducing sparsity on the
optimal scoring coefficients \boldsymbol{β}_h (see Details).
If the number of variables exceeds the number of observations 
Q 
Number of optimal scoring coefficient vectors; 
alpha 
Robustness parameter used in sparseLTS (for initial estimation, see Details). Default 
maxit 
Number of iterations for the estimation of optimal scoring coefficients and case weights. Default 
tol 
Tolerance for convergence of the normed weighted change in the residual sum of squares
for the estimation of optimal scoring coefficients. Default is 
trace 
Whether to print intermediate results. Default is 
... 
Arguments passed to or from other methods. 
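The sparse optimal scoring problem (Clemmensen et al., 2011): for h = 1, ..., Q
\min_{\beta_h, \theta_h} \frac{1}{n} \| Y \theta_h - X \beta_h \|_2^2 + \lambda \| \beta_h \|_1
subject to
\frac{1}{n} \theta_h^T Y^T Y \theta_h = 1, \quad \theta_h^T Y^T Y \theta_l = 0 \quad \text{for all } l < h,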
where X denotes the robustly centered and scaled input matrix x (or, alternatively, the predictors from formula) and Y is a dummy matrix coding the class memberships from grouping.
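To make the notation concrete, the following base R sketch builds a dummy matrix Y from a grouping factor, centers and scales a toy input matrix, and evaluates the penalized objective for one candidate pair (β_h, θ_h). It is an illustration only, not the package's internal code: the toy data, the helper name sos_objective, and the choice of median and MAD for the robust centering and scaling are all assumptions made for this example.

```r
# Illustrative sketch only; not the internal code of SosDiscRobust.
set.seed(1)
n <- 12
p <- 5
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
grouping <- factor(rep(c("a", "b", "c"), each = 4))

# Robust centering and scaling; median and MAD are an assumed choice here.
X <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

# Dummy (indicator) matrix: Y[i, k] = 1 if observation i belongs to class k.
Y <- model.matrix(~ grouping - 1)

# Hypothetical helper: penalized objective for one pair (beta_h, theta_h).
sos_objective <- function(X, Y, beta, theta, lambda) {
  sum((Y %*% theta - X %*% beta)^2) / nrow(X) + lambda * sum(abs(beta))
}

# Rescale a score vector so that (1/n) * theta' Y'Y theta = 1.
theta <- c(1, 0, -1)
theta <- theta / sqrt(drop(t(theta) %*% crossprod(Y) %*% theta) / n)

# An arbitrary sparse coefficient vector, used only to evaluate the objective.
beta <- c(0.5, 0, 0, -0.2, 0)
obj <- sos_objective(X, Y, beta, theta, lambda = 0.3)
```

Larger values of lambda increase the cost of each nonzero entry of beta, which is what drives the coefficient vectors towards sparsity.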
For each h, this problem can be solved iteratively for β_h and θ_h. To obtain robust estimates, β_h is estimated with reweighted sparse least trimmed squares regression (Alfons et al., 2013) and θ_h with least absolute deviation regression in the first two iterations. To speed up the subsequent iterations, an iterative downweighting of observations with large residuals is combined with the iterative estimation of the optimal scoring coefficients by their classical estimates.
The classification model is estimated on the low-dimensional sparse subspace X[β_1,...,β_Q] with robust LDA (Linda).
An S4 object of class SosDiscRobust-class, which is a subclass of the virtual class SosDisc-class.
Irene Hoffmann and Valentin Todorov
Clemmensen L, Hastie T, Witten D & Ersboll B (2011), Sparse discriminant analysis. Technometrics, 53(4), 406–413.
Alfons A, Croux C & Gelper S (2013), Sparse least trimmed squares regression for analysing high-dimensional large data sets. The Annals of Applied Statistics, 7(1), 226–248.
Hoffmann I, Filzmoser P & Croux C (2016), Robust and sparse multigroup classification by the optimal scoring approach. Submitted for publication.
## EXAMPLE 1 ######################################
data(olitos)
## column index of the grouping variable
grind <- which(colnames(olitos)=="grp")
set.seed(5008642)
mod <- SosDiscRobust(grp~., data=olitos, lambda=0.3, maxit=30, Q=3, tol=1e-2)
## predict on the predictors only (grouping column dropped)
pred <- predict(mod, newdata=olitos[,-grind])
summary(mod)
plot(mod, ind=c(1:3))
## EXAMPLE 2 ######################################
##
## Not run:
library(sparseLDA)
data(penicilliumYES)
## for demonstration only:
set.seed(5008642)
X <- penicilliumYES$X[, sample(1:ncol(penicilliumYES$X), 100)]
## takes a subsample of the variables
## to have quicker computation time
colnames(X) <- paste0("V",1:ncol(X))
y <- as.factor(c(rep(1,12), rep(2,12), rep(3,12)))
set.seed(5008642)
mod <- SosDiscRobust(X, y, lambda=1, maxit=5, Q=2, tol=1e-2)
summary(mod)
plot(mod)
## End(Not run)
