Robust and sparse multigroup classification by the optimal scoring approach.
The class `SosDisc` serves as a base class for deriving all other classes representing the results of robust and sparse multigroup classification by the optimal scoring approach.

The sparse optimal scoring problem (Clemmensen et al, 2011): for *h = 1, ..., Q*

*min_{β_h, θ_h} 1/n ||Y θ_h - X β_h||_2^2 + λ ||β_h||_1*

subject to

*1/n θ_h^T Y^T Y θ_h = 1, θ_h^T Y^T Y θ_l = 0 for all l < h,*

where *X* denotes the robustly centered and scaled input matrix `x` (or alternatively the predictors from `formula`) and *Y* is a dummy matrix coding the class memberships from `grouping`.
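To make the dummy coding explicit: *Y* has one column per class, with a 1 in the row of each observation belonging to that class. The following hypothetical Python helper (an illustration only, not part of the package) sketches this construction:

```python
import numpy as np

def dummy_matrix(grouping):
    """Dummy matrix Y coding class memberships: one column per class,
    Y[i, k] = 1 if observation i belongs to class k (illustrative helper)."""
    classes, idx = np.unique(grouping, return_inverse=True)
    Y = np.zeros((len(grouping), len(classes)))
    Y[np.arange(len(grouping)), idx] = 1.0
    return Y

# e.g. dummy_matrix(["a", "b", "a"]) yields rows (1,0), (0,1), (1,0)
```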

For each *h* this problem can be solved iteratively for *β_h* and *θ_h*. In order to obtain robust estimates, *β_h* is estimated with reweighted sparse least trimmed squares (LTS) regression (Alfons et al, 2013) and *θ_h* with least absolute deviation (LAD) regression in the first two iterations. To speed up the subsequent iterations, an iterative down-weighting of observations with large residuals is combined with estimating the optimal scoring coefficients by their classical (non-robust) estimates.
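The alternating scheme can be sketched in its classical (non-robust) form as follows. This is an illustrative Python reimplementation under simplifying assumptions, not the package's code: the robust variant replaces the lasso step with reweighted sparse LTS and the first *θ* updates with LAD regression, and *λ* is assumed small enough that *β_h* does not vanish. All helper names (`lasso_cd`, `orthonormalize`, ...) are hypothetical.

```python
import numpy as np

def soft_threshold(c, lam):
    """Soft-thresholding operator used by coordinate-descent lasso."""
    return np.sign(c) * max(abs(c) - lam, 0.0)

def lasso_cd(X, y, lam, beta, n_sweeps=50):
    """Coordinate descent for min_b (1/n)||y - X b||_2^2 + lam * ||b||_1."""
    n = X.shape[0]
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            r = y - X @ beta + X[:, j] * beta[j]      # partial residual
            beta[j] = soft_threshold(X[:, j] @ r / n, lam / 2) / col_sq[j]
    return beta

def orthonormalize(theta, prev, D):
    """Enforce theta^T D theta = 1 and D-orthogonality to earlier scores."""
    for t in prev:                                    # prev are D-normalized
        theta = theta - (theta @ D @ t) * t
    return theta / np.sqrt(theta @ D @ theta)

def sparse_optimal_scoring(X, Y, lam, Q, n_iter=10):
    """Alternating minimization for the sparse optimal scoring problem
    (classical, non-robust sketch): fix theta, solve a lasso for beta;
    fix beta, update theta subject to the normalization constraints."""
    n, p = X.shape
    D = Y.T @ Y / n                  # diagonal matrix of class proportions
    D_inv = np.linalg.inv(D)
    thetas = [np.ones(Y.shape[1])]   # trivial score, excluded via orthogonality
    betas = []
    rng = np.random.default_rng(1)
    for _ in range(Q):
        theta = orthonormalize(rng.standard_normal(Y.shape[1]), thetas, D)
        beta = np.zeros(p)
        for _ in range(n_iter):
            beta = lasso_cd(X, Y @ theta, lam, beta)              # beta step
            theta = orthonormalize(D_inv @ Y.T @ X @ beta / n, thetas, D)
        thetas.append(theta)
        betas.append(beta)
    return np.column_stack(betas), np.column_stack(thetas[1:])
```

By construction every returned *θ_h* satisfies *1/n θ_h^T Y^T Y θ_h = 1* and is orthogonal (in that inner product) to the trivial constant score and to all earlier *θ_l*.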

The classification model is estimated on the low-dimensional sparse subspace *X[β_1,...,β_Q]* with robust LDA (`Linda`).

A virtual class: no objects may be created from it.

`call`: The (matched) function call.

`prior`: Prior probabilities; same as the input parameter.

`counts`: Number of observations in each class.

`beta`: Object of class `"matrix"`: Q coefficient vectors of the predictor matrix from optimal scoring (see Details); rows correspond to the variables listed in `varnames`.

`theta`: Object of class `"matrix"`: Q coefficient vectors of the dummy matrix for class coding from optimal scoring (see Details).

`lambda`: Non-negative tuning parameter of the L1 norm penalty; same as the input parameter.

`varnames`: Character vector: names of the included predictor variables (variables for which at least one beta coefficient is non-zero).

`center`: Centering vector of the input predictors (coordinate-wise median).

`scale`: Scaling vector of the input predictors (MAD).

`fit`: Object of class `"Linda"`: Linda model (robust LDA model) estimated in the low-dimensional subspace *X[β_1,...,β_Q]* (see Details).

`mahadist2`: Squared robust Mahalanobis distance (calculated with estimates from Linda, assuming a common covariance structure for all groups) of each observation to its group center in the low-dimensional subspace *X[β_1,...,β_Q]* (see Details). (This slot will later be moved to the Linda object.)

`wlinda`: 0-1 weights derived from `mahadist2`; observations whose squared robust Mahalanobis distance is larger than the 0.975 quantile of the chi-square distribution with Q degrees of freedom receive weight zero. (This slot will later be moved to the Linda object.)

`X`: The training data set (same as the input parameter `x` of the constructor function).

`grp`: Grouping variable: a factor specifying the class of each observation (same as the input parameter `grouping`).
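The cutoff rule behind `wlinda` can be illustrated with a small hypothetical Python helper (the package computes this internally in R):

```python
import numpy as np
from scipy.stats import chi2

def linda_weights(mahadist2, Q):
    """0-1 weights from squared robust Mahalanobis distances: weight 0 when
    the distance exceeds the 0.975 chi-square quantile with Q degrees of
    freedom (sketch of the rule described for the `wlinda` slot)."""
    cutoff = chi2.ppf(0.975, df=Q)
    return (np.asarray(mahadist2) <= cutoff).astype(int)
```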

- predict `signature(object = "SosDisc")`: calculates predictions using the results in `object`. An optional data frame or matrix in which to look for variables with which to predict can be supplied. If omitted, the training data set is used. If the original fit used a formula or a data frame or a matrix with column names, `newdata` must contain columns with the same names.
- show `signature(object = "SosDisc")`: prints the results.
- summary `signature(object = "SosDisc")`: prints summary information.

Irene Hoffmann irene.hoffmann@tuwien.ac.at and Valentin Todorov valentin.todorov@chello.at

Clemmensen L, Hastie T, Witten D & Ersboll B (2011),
Sparse discriminant analysis.
*Technometrics*, **53**(4), 406–413.

Hoffmann I, Filzmoser P & Croux C (2016), Robust and sparse multigroup classification by the optimal scoring approach. Submitted for publication.

```
showClass("SosDisc")
```
