# OutlierSign2: Outlier identification in high dimensions using the SIGN2... In rrcovHD: Robust Multivariate Methods for High Dimensional Data

## Description

Fast algorithm for identifying multivariate outliers in high-dimensional and/or large datasets using spatial signs; see Filzmoser, Maronna, and Werner (CSDA, 2008). The distances are computed from principal components.

## Usage

```r
OutlierSign2(x, ...)

## Default S3 method:
OutlierSign2(x, grouping, qcrit = 0.975, explvar = 0.99, trace = FALSE, ...)

## S3 method for class 'formula'
OutlierSign2(formula, data, ..., subset, na.action)
```

## Arguments

| Argument | Description |
|---|---|
| `formula` | a formula with no response variable, referring only to numeric variables. |
| `data` | an optional data frame (or similar: see `model.frame`) containing the variables in the formula `formula`. |
| `subset` | an optional vector used to select rows (observations) of the data matrix `x`. |
| `na.action` | a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` setting of `options`, and is `na.fail` if that is unset. The default is `na.omit`. |
| `...` | arguments passed to or from other methods. |
| `x` | a matrix or data frame. |
| `grouping` | grouping variable: a factor specifying the class for each observation. |
| `explvar` | a numeric value between 0 and 1 indicating how much variance should be covered by the robust PCs. Default is 0.99. |
| `qcrit` | a numeric value between 0 and 1 indicating the quantile to be used as critical value for outlier detection. Default is 0.975. |
| `trace` | whether to print intermediate results. Default is `trace = FALSE`. |

## Details

Robust principal components are computed from the robustly sphered and normed data; they are needed to determine a distance for each observation. The distances are transformed to approximate a chi-square distribution, and a critical value is then used as the outlier cutoff.
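The steps described above can be sketched in base R. This is only an illustration of the general idea, not the code used by `OutlierSign2`: the function name `sign2_sketch`, the use of the median and MAD for sphering, the squared MAD as the robust variance of each component, and the rescaling of the distances by the chi-square median are all assumptions made for this sketch.

```r
# Hypothetical helper, NOT part of rrcovHD: a base-R illustration of the idea.
sign2_sketch <- function(x, qcrit = 0.975, explvar = 0.99) {
  x <- as.matrix(x)

  # 1. Robust sphering (assumed here: column-wise median and MAD).
  x_sc <- scale(x, center = apply(x, 2, median), scale = apply(x, 2, mad))

  # 2. Spatial signs: project every observation onto the unit sphere.
  x_sign <- x_sc / sqrt(rowSums(x_sc^2))

  # 3. Principal directions of the sign-transformed data.
  evec <- svd(x_sign)$v

  # 4. Robust variance (squared MAD) of the sphered data along each direction;
  #    keep the leading components that cover the fraction 'explvar'.
  pc_var <- apply(x_sc %*% evec, 2, mad)^2
  ord <- order(pc_var, decreasing = TRUE)
  k <- which(cumsum(pc_var[ord]) / sum(pc_var) >= explvar)[1]
  if (is.na(k)) k <- length(ord)

  # 5. Scores on the retained components, robustly rescaled.
  scores <- x_sc %*% evec[, ord[seq_len(k)], drop = FALSE]
  scores <- scale(scores, center = apply(scores, 2, median),
                  scale = apply(scores, 2, mad))

  # 6. Distances, rescaled so that their median matches the median of the
  #    square root of a chi-square variable with k degrees of freedom.
  d <- sqrt(rowSums(scores^2))
  d <- d * sqrt(qchisq(0.5, df = k)) / median(d)

  # 7. Cutoff from the chi-square quantile; in this sketch 0 marks an outlier.
  cutoff <- sqrt(qchisq(qcrit, df = k))
  list(distance = d, cutoff = cutoff, flag = as.integer(d <= cutoff), k = k)
}

# Simulated data with five shifted observations planted in the first rows.
set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
x[1:5, ] <- x[1:5, ] + 10
res <- sign2_sketch(x)
which(res$flag == 0)
```

The simulated data and the shift of 10 are arbitrary choices; the planted rows lie far beyond the cutoff, so they should appear among the flagged indices.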

## Value

An S4 object of class `OutlierSign2` which is a subclass of the virtual class `Outlier`.

## Author(s)

Valentin Todorov

## References

P. Filzmoser, R. Maronna and M. Werner (2008), Outlier identification in high dimensions, Computational Statistics & Data Analysis, Vol. 52, 1694–1711.

P. Filzmoser & V. Todorov (2012), Robust tools for the imperfect world, To appear.

## See Also

`OutlierSign2`, `OutlierSign1`, `Outlier`

## Examples

```r
library(rrcovHD)

data(hemophilia)
obj <- OutlierSign2(gr ~ ., data = hemophilia)
obj

getDistance(obj)         # returns an array of distances
getClassLabels(obj, 1)   # returns an array of indices for a given class
getCutoff(obj)           # returns an array of cutoff values (for each class, usually equal)
getFlag(obj)             # returns a 0/1 array of flags
plot(obj, class = 2)     # standard plot function
```
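The default method takes the data and the grouping as separate arguments. The following is a minimal sketch, assuming that `rrcovHD` is installed and that `hemophilia` holds the factor `gr` alongside numeric columns (as the formula `gr ~ .` in the example above suggests). The value `qcrit = 0.99` and the object name `obj2` are chosen only for illustration.

```r
# Run only when the package is available, so the snippet is safe to source.
if (requireNamespace("rrcovHD", quietly = TRUE)) {
  library(rrcovHD)
  data(hemophilia)

  # Split the data frame into the numeric variables and the class factor.
  x  <- hemophilia[, names(hemophilia) != "gr", drop = FALSE]
  gr <- hemophilia$gr

  # Default method with a stricter cutoff quantile (illustrative value).
  obj2 <- OutlierSign2(x, grouping = gr, qcrit = 0.99)

  # Tabulate the 0/1 flags, one per observation.
  print(table(getFlag(obj2)))
}
```

A larger `qcrit` raises the cutoff, so fewer observations are expected to be flagged than with the default of 0.975.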