PCOutlierDetection: Principal Component Outlier Detection (Intersection of all the...

Description

Takes a dataset and finds its outliers in the principal component space, using a combination of different methods.

Usage

PCOutlierDetection(x, k = 0.05 * nrow(x), cutoff = 0.95,
  Method = "euclidean", rnames = FALSE, depth = FALSE,
  dense = FALSE, distance = FALSE, dispersion = FALSE,
  infocut = 0.9)

Arguments

x

Dataset for which outliers are to be found

k

Number of nearest neighbours to be used for outlier detection using bootstrapping, default value is 0.05*nrow(x)

cutoff

Percentile threshold used for distance, default value is 0.95

Method

Distance method, default is "euclidean"

rnames

Logical value indicating whether the dataset has rownames, default value is FALSE

depth

Logical value indicating whether the depth-based method should be used, default is FALSE

dense

Logical value indicating whether the density-based method should be used, default is FALSE

distance

Logical value indicating whether distance-based methods should be used, default is FALSE

dispersion

Logical value indicating whether dispersion-based methods should be used, default is FALSE

infocut

Proportion of variation used to decide the number of principal components to be retained in the analysis, default is 0.9
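
As a rough illustration of how a threshold such as infocut might translate into a number of retained components, the sketch below uses base R only. It assumes the common "cumulative proportion of variance" rule, which may differ from what the package does internally; the iris data and all variable names are chosen purely for illustration.

```r
# Hypothetical illustration; NOT the package's internal code.
x <- iris[, 1:4]          # numeric columns of a built-in dataset (assumption)
infocut <- 0.9            # proportion of variance to retain

pca <- princomp(x)        # PCA from the base 'stats' package

# Proportion of total variance carried by each component, then accumulated
var_explained <- unname(pca$sdev^2 / sum(pca$sdev^2))
cum_var <- cumsum(var_explained)

# Smallest number of components whose cumulative variance reaches infocut
n_comp <- which(cum_var >= infocut)[1]

# Scores restricted to the retained components (kept as a matrix)
scores <- pca$scores[, seq_len(n_comp), drop = FALSE]
```

Under this rule, raising infocut can only keep the same number of components or more, so it acts as a simple dial between dimensionality reduction and information retained.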

Details

PCOutlierDetection finds outlier observations in the principal component space using several methods. An observation is labelled as an outlier only if every method considered flags it (the intersection of all the methods). For bivariate data, it also shows a scatterplot of the data with the outliers labelled.
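
The intersection rule described above can be sketched in base R as follows. This is only a minimal illustration under stated assumptions: the two detectors (a Euclidean distance from the centre and a Mahalanobis distance) are stand-ins chosen for simplicity and are not the methods implemented in the package, and the iris data, the choice of two components, and the variable names are hypothetical.

```r
# Hypothetical sketch of "flag by several methods, then intersect".
# The detectors below are simple stand-ins, NOT the package's methods.
scores <- princomp(iris[, 1:4])$scores[, 1:2]   # first two PC scores (assumption)
cutoff <- 0.95                                  # percentile threshold

# Detector 1: Euclidean distance of each observation from the centre
d_centre <- sqrt(rowSums(scale(scores, scale = FALSE)^2))
out_distance <- unname(which(d_centre > quantile(d_centre, cutoff)))

# Detector 2: Mahalanobis distance, which accounts for dispersion
d_mahal <- mahalanobis(scores, colMeans(scores), cov(scores))
out_dispersion <- unname(which(d_mahal > quantile(d_mahal, cutoff)))

# An observation is an outlier only if every detector flags it
outliers <- Reduce(intersect, list(out_distance, out_dispersion))

# Result shaped like the documented return value
result <- list(
  "Outlier Observations" = scores[outliers, , drop = FALSE],
  "Location of Outlier"  = outliers
)
```

Because the final set is an intersection, adding more detectors can only shrink it or leave it unchanged, which makes the combined rule conservative.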

Value

Outlier Observations: A matrix of outlier observations

Location of Outlier: Vector of serial numbers (row positions) of the outliers

Author(s)

Vinay Tiwari, Akanksha Kashikar

Examples
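
A minimal call might look like the sketch below. It assumes the OutlierDetection package is installed and uses the numeric columns of the built-in iris data purely as a stand-in; this is not necessarily the dataset behind any output shown on this page. The argument values simply repeat the documented defaults.

```r
# Hypothetical usage sketch; the dataset choice is an assumption.
x <- iris[, 1:4]   # numeric columns only

res <- NULL
if (requireNamespace("OutlierDetection", quietly = TRUE)) {
  # Arguments mirror the defaults listed under Usage
  res <- OutlierDetection::PCOutlierDetection(
    x, cutoff = 0.95, Method = "euclidean", infocut = 0.9
  )
  # Components documented under Value
  print(res[["Outlier Observations"]])
  print(res[["Location of Outlier"]])
}
```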

Example output

Warning messages:
1: In rgl.init(initValue, onlyNULL) : RGL: unable to open X11 display
2: 'rgl.init' failed, running with 'rgl.useNULL = TRUE'. 
$`Outlier Observations`
        Comp.1       Comp.2     Comp.3
16  -2.3860390  1.338062330 -0.2777769
101  2.5311927 -0.009849109 -0.7601654
107  0.5212322 -1.192758727 -0.5456593
118  3.4870554  1.175739330 -0.1338949
123  3.4999200  0.460674099  0.5731822
132  3.2306737  1.374165087  0.1145482
137  2.1442433  0.140064201 -0.7348789

$`Location of Outlier`
[1]  16 101 107 118 123 132 137

$`3Dplot`

Warning message:
`arrange_()` is deprecated as of dplyr 0.7.0.
Please use `arrange()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 

OutlierDetection documentation built on June 16, 2019, 1:03 a.m.