View source: R/pguOutlierDetection.R
Outlier detection using kth Nearest Neighbour Distance method
Takes a dataset and finds its outliers using a distance-based method.
x: Dataset for which outliers are to be found.
k: Number of nearest neighbours to use; default is 0.05*nrow(x).
cutoff: Percentile threshold applied to the distances; default is 0.95.
Method: Distance method; default is Euclidean.
rnames: Logical value indicating whether the dataset has row names; default is FALSE.
boottimes: Number of bootstrap samples used to find the cutoff; default is 100.
nnk computes the kth nearest neighbour distance of each observation and labels an observation as an outlier when that distance exceeds the bootstrapped cutoff. The outlierliness of each labelled outlier is also reported; it is the bootstrap estimate of the probability that the observation is an outlier. For bivariate data, nnk also shows a scatterplot of the data with the outliers labelled.
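The procedure described above can be sketched in base R. This is an illustration only, not the package's source: the name knn_outliers_sketch and the returned list names are hypothetical, k is rounded up with ceiling() so it can be used as an index, only Euclidean distance is used, and taking the mean of the bootstrapped percentiles as the final cutoff is an assumption about how the bootstrap results are combined.

```r
# Illustrative sketch only; knn_outliers_sketch is a hypothetical name,
# not the package's nnk().
knn_outliers_sketch <- function(x, k = ceiling(0.05 * nrow(x)),
                                cutoff = 0.95, boottimes = 100) {
  x <- as.matrix(x)
  d <- as.matrix(dist(x, method = "euclidean"))

  # kth nearest-neighbour distance per observation. After sorting a row,
  # position 1 is the zero self-distance, so the kth neighbour is at k + 1.
  dk <- apply(d, 1, function(row) sort(row)[k + 1])

  # Bootstrap the cutoff: resample the distances with replacement and
  # record the requested percentile of each resample.
  boot_cut <- replicate(
    boottimes,
    quantile(sample(dk, replace = TRUE), probs = cutoff, names = FALSE)
  )

  # Assumption: flag observations above the mean bootstrapped cutoff.
  loc <- which(dk > mean(boot_cut))

  list(
    observations = x[loc, , drop = FALSE],
    location     = unname(loc),
    # Share of bootstrap cutoffs that each flagged distance exceeds.
    probability  = vapply(dk[loc], function(v) mean(v > boot_cut), numeric(1))
  )
}

# Toy data: 50 standard-normal points plus one distant point in row 51.
set.seed(1)
toy <- rbind(matrix(rnorm(100), ncol = 2), c(10, 10))
res <- knn_outliers_sketch(toy, k = 3)
res$location
```

The three list elements mirror the returned values documented for nnk: the outlying rows, their positions, and the proportion of bootstrap cutoffs each one exceeds.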
Outlier Observations: A matrix of outlier observations
Location of Outlier: Vector of serial numbers (row positions) of the outliers
Outlier probability: Vector of the proportion of times each outlier exceeds the local bootstrap cutoff
Vinay Tiwari, Akanksha Kashikar
Hautamaki, V., Karkkainen, I., and Franti, P. 2004. Outlier detection using k-nearest neighbour graph. In Proc. IEEE Int. Conf. on Pattern Recognition (ICPR), Cambridge, UK.