
View source: R/find_HDoutliers.R

Detect anomalies in high-dimensional data. This is a modification of `HDoutliers`.

```
find_HDoutliers(
  data,
  alpha = 0.01,
  k = 10,
  knnsearchtype = "brute",
  normalize = "unitize"
)
```

| Argument | Description |
|---|---|
| `data` | A vector, matrix, or data frame consisting of numerical variables. |
| `alpha` | Threshold for determining the cutoff for outliers. Observations are considered outliers if they fall in the |
| `k` | Number of neighbours considered. |
| `knnsearchtype` | A character vector indicating the search type for k-nearest-neighbors. |
| `normalize` | Method used to normalize the columns of the data. This prevents variables with large variances from having a disproportionate influence on Euclidean distances. Two options are available: "standardize" or "unitize". The default is "unitize". |
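To illustrate why column normalization matters for Euclidean distances, here is a minimal base-R sketch. The helpers `unitize_col` and `standardize_col` are hypothetical names written for this illustration; they assume that "unitize" means min-max rescaling to [0, 1] and that "standardize" means centering and scaling by the standard deviation. The package's own definitions may differ.

```r
# Hypothetical helpers for illustration only; not the package's own code.
unitize_col <- function(z) {
  rng <- range(z)
  width <- diff(rng)
  if (width == 0) return(rep(0, length(z)))  # constant column: avoid 0 / 0
  (z - rng[1]) / width                       # rescale to [0, 1]
}

standardize_col <- function(z) {
  s <- stats::sd(z)
  if (s == 0) return(rep(0, length(z)))      # constant column: avoid 0 / 0
  (z - mean(z)) / s                          # mean 0, standard deviation 1
}

set.seed(1)
# Two columns on very different scales
x <- cbind(small = rnorm(100, sd = 1), large = rnorm(100, sd = 1000))

unitized <- apply(x, 2, unitize_col)
standardized <- apply(x, 2, standardize_col)

# Before rescaling, the "large" column dominates any Euclidean distance.
apply(x, 2, sd)

# After rescaling, both columns are on comparable scales.
apply(unitized, 2, range)
apply(standardized, 2, sd)
```

Either rescaling puts the two columns on a comparable footing, so neither one dominates the nearest-neighbour distances purely because of its units.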

The indexes of the observations determined to be outliers.
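Because the return value is a set of observation indexes, it can be used directly to subset the input. The sketch below uses a hand-written index vector as a stand-in for the function's result, so it runs without the package; it assumes the indexes are integer row positions.

```r
set.seed(1234)
# Ten typical points plus one obvious outlier in row 11
data <- rbind(matrix(rnorm(20), ncol = 2), c(10, 10))

# Hypothetical result, standing in for find_HDoutliers(data)
outliers <- c(11L)

# Rows flagged as outliers
flagged <- data[outliers, , drop = FALSE]

# Remaining rows. setdiff() is used instead of data[-outliers, ] because
# negative indexing with an empty vector would return zero rows.
keep <- setdiff(seq_len(nrow(data)), outliers)
clean <- data[keep, , drop = FALSE]
```

The `setdiff()` form stays correct when no outliers are found, which is the case where `data[-outliers, ]` silently drops every row.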

Wilkinson, L. (2018), 'Visualizing big data outliers through distributed aggregation', IEEE transactions on visualization and computer graphics 24(1), 256-266.

```
require(ggplot2)
library(stray)  # assumption: find_HDoutliers() is provided by the stray package

# Example 1: univariate data, two clusters with a single point (0) between them
set.seed(1234)
data <- c(rnorm(1000, mean = -6), 0, rnorm(1000, mean = 6))
outliers <- find_HDoutliers(data, knnsearchtype = "kd_tree")

# Example 2: bivariate data with 10 scattered points placed in the first rows
set.seed(1234)
n <- 1000 # number of typical observations
nout <- 10 # number of outliers
typical_data <- matrix(rnorm(2 * n), ncol = 2, byrow = TRUE)
out <- matrix(5 * runif(2 * nout, min = -5, max = 5), ncol = 2, byrow = TRUE)
data <- rbind(out, typical_data)
outliers <- find_HDoutliers(data, knnsearchtype = "brute")
```
