Takes a dataset and finds its outliers using the Robust Kernel-based Outlier Factor (RKOF) algorithm.
x | Dataset for which outliers are to be found
k | Number of nearest neighbours to use; default is 0.05*nrow(x)
C | Multiplication parameter for the k-distance of neighboring observations. Acts as a bandwidth multiplier. Default is 1, so that the k-distance itself is used for the Gaussian kernel
alpha | Sensitivity parameter for the k-distance/bandwidth. A small alpha gives a small variance in RKOF, and vice versa. Default is 1
sigma2 | Variance parameter for weighting of neighboring observations
cutoff | Percentile threshold used for distance; default is 0.95
rnames | Logical value indicating whether the dataset has rownames; default is FALSE
boottimes | Number of bootstrap samples used to find the cutoff; default is 100
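To make the roles of k, C and alpha more concrete, here is a minimal base-R sketch of a k-distance and a bandwidth derived from it. The helpers `k_distance` and `bandwidth` are hypothetical and not part of any package. The bandwidth form `C * kdist^alpha` and the rounding of the default k are assumptions made for illustration, so the package itself may handle both differently.

```r
# Hypothetical helper (not from any package): distance from each row to its
# k-th nearest neighbour, i.e. the "k-distance", using base R only.
k_distance <- function(x, k) {
  d <- as.matrix(dist(x))          # pairwise Euclidean distances
  diag(d) <- Inf                   # a point is not its own neighbour
  apply(d, 1, function(row) sort(row)[k])
}

x <- as.matrix(iris[, 1:4])
k_default <- 0.05 * nrow(x)        # documented default; 7.5 for 150 rows
k <- ceiling(k_default)            # assumption: round up to a whole neighbour count

kd <- k_distance(x, k)

# Assumed bandwidth form: C * kdist^alpha.
# With C = 1 and alpha = 1 the bandwidth is the k-distance itself.
bandwidth <- function(kd, C = 1, alpha = 1) C * kd^alpha

h_default <- bandwidth(kd)         # equals kd
h_wide    <- bandwidth(kd, C = 2)  # a larger C widens every bandwidth
```

Under this assumed form, raising C scales all bandwidths uniformly, while alpha changes how strongly differences in k-distance carry over into the bandwidth.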
dens computes an outlier score for each observation using the DDoutlier package (based on the RKOF algorithm) and labels an observation as an outlier based on the bootstrapped cutoff. The outlierliness of each observation labelled 'Outlier' is also reported; it is the bootstrap estimate of the probability that the observation is an outlier. For bivariate data, it also shows a scatterplot of the data with the outliers labelled.
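The bootstrapped cutoff and the outlier probability can be illustrated with a short base-R sketch. It is not the package's implementation: the score below is a simple stand-in (mean distance to the k nearest neighbours) rather than RKOF, `knn_score` is a hypothetical helper, and averaging the bootstrap percentiles into a single threshold is an assumption about how the cutoff is formed.

```r
set.seed(1)                              # make the resampling reproducible

# Stand-in outlier score (NOT RKOF): mean distance to the k nearest neighbours.
knn_score <- function(x, k) {
  d <- as.matrix(dist(x))
  diag(d) <- Inf                         # exclude each point from its own neighbours
  apply(d, 1, function(row) mean(sort(row)[seq_len(k)]))
}

x <- as.matrix(iris[, 1:4])
score <- knn_score(x, k = 8)

cutoff    <- 0.95                        # percentile threshold
boottimes <- 100                         # number of bootstrap samples

# One cutoff per bootstrap sample: the chosen percentile of the resampled scores.
boot_cut <- replicate(
  boottimes,
  quantile(sample(score, replace = TRUE), cutoff, names = FALSE)
)

threshold <- mean(boot_cut)              # assumption: average the bootstrap cutoffs
loc <- which(score > threshold)          # row numbers labelled as outliers

# Proportion of bootstrap cutoffs that each labelled score exceeds.
prob <- vapply(score[loc], function(s) mean(s > boot_cut), numeric(1))
```

In this sketch a probability of 1 means the observation's score exceeded the cutoff in every bootstrap sample, while lower values mark borderline cases.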
Outlier Observations: A matrix of the outlier observations
Location of Outlier: Vector of row numbers (serial numbers) of the outliers
Outlier Probability: Vector giving, for each outlier, the proportion of times it exceeds the local bootstrap cutoff
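Because the element names contain spaces, they need backticks or `[[ ]]` when accessed. The sketch below does not call dens; it builds a stand-in list `res` from the values printed in the example output on this page, and the names `res` and `summary_tbl` are purely illustrative.

```r
# Stand-in for the returned list, filled with the values shown in the
# example output on this page (the plot element is left out).
res <- list(
  `Outlier Observations` = as.matrix(iris[c(23, 42, 63, 107, 110), 1:4]),
  `Location of Outlier`  = c(23, 42, 63, 107, 110),
  `Outlier Probability`  = c(1.00, 1.00, 0.98, 0.96, 1.00)
)

# Names with spaces need backticks with $, or a string inside [[ ]].
loc  <- res$`Location of Outlier`
prob <- res[["Outlier Probability"]]

# Pair each flagged row with its bootstrap probability, most certain first.
summary_tbl <- data.frame(row = loc, probability = prob)
summary_tbl <- summary_tbl[order(-summary_tbl$probability), ]
print(summary_tbl)
```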
Vinay Tiwari, Akanksha Kashikar
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise. In Proc. Int. Conf. on Knowledge Discovery and Data Mining (KDD), Portland, OR.
$`Outlier Observations`
    Sepal.Length Sepal.Width Petal.Length Petal.Width
23           4.6         3.6          1.0         0.2
42           4.5         2.3          1.3         0.3
63           6.0         2.2          4.0         1.0
107          4.9         2.5          4.5         1.7
110          7.2         3.6          6.1         2.5
$`Location of Outlier`
[1] 23 42 63 107 110
$`Outlier Probability`
[1] 1.00 1.00 0.98 0.96 1.00
$`3Dplot`