View source: R/TestPointIsAnomaly.R
Assuming the training set was generated by a process that follows a certain distribution, i.e. "normal" or "t", this function checks whether to reject the null hypothesis that the test point was generated by the same process. If the probability of obtaining a result equal to or more extreme than test is lower than p, then this function returns TRUE, meaning the null hypothesis is rejected and the test point is likely to be an anomaly. If the argument exclude is specified, elements at the designated positions are removed from the training set.
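The decision rule described above can be illustrated with a minimal sketch. It assumes a normal distribution fitted by sample mean and standard deviation and a two-sided test; the helper name `is_anomaly_sketch` is hypothetical, and the package's own fitting procedure may differ.

```r
# Hypothetical helper, not the package's implementation: a tail-probability
# test against a normal distribution fitted to the training set.
is_anomaly_sketch <- function(training, test, exclude = NULL, p = 0.01) {
  # Drop the training points flagged by the logical vector `exclude`.
  if (!is.null(exclude)) {
    training <- training[!exclude]
  }
  mu <- mean(training)
  sigma <- sd(training)
  # Two-sided probability of a value at least as far from the mean as `test`.
  tail_prob <- 2 * pnorm(-abs(test - mu) / sigma)
  # Reject the null hypothesis when that probability falls below `p`.
  tail_prob < p
}

set.seed(1)
training <- rnorm(1000, mean = 0, sd = 10)
is_anomaly_sketch(training, test = 4, p = 0.1)   # a point inside the bulk
is_anomaly_sketch(training, test = 60, p = 0.1)  # a point far in the tail
```

A smaller p makes the rule more conservative, since only more extreme test points fall below the threshold.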
TestPointIsAnomaly(training, test, dist = "t", exclude = NULL, p = 0.01,
  direction = direction, ...)
training |
A numeric vector containing the samples used to fit the distribution |
test |
A numeric value to be tested |
dist |
A string specifying the distribution to fit. Options are
t-distribution(default: |
exclude |
A logical vector with length equal to
|
p |
p-value threshold with values in [0, 1]. |
direction |
Directionality of the anomalies to be detected. Options are: 'pos', 'neg' and 'both'. Defaults to 'both'. |
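One way the direction options could map onto tail probabilities is sketched here. It assumes a standardized score compared against a standard normal distribution; the helper `tail_prob_sketch` is hypothetical and is not taken from the package source.

```r
# Hypothetical helper: tail probability of a standardized score `z`
# under a standard normal distribution, for each direction option.
tail_prob_sketch <- function(z, direction = c("both", "pos", "neg")) {
  direction <- match.arg(direction)
  switch(direction,
    pos  = pnorm(z, lower.tail = FALSE),  # unusually large values only
    neg  = pnorm(z),                      # unusually small values only
    both = 2 * pnorm(-abs(z))             # either tail
  )
}

tail_prob_sketch(2.5, "pos")
tail_prob_sketch(2.5, "neg")
tail_prob_sketch(2.5, "both")
```

Under this reading, a large positive score is flagged with 'pos' or 'both' but not with 'neg', because its lower-tail probability is close to 1.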
Returns TRUE if the test point is likely to be an anomaly and FALSE otherwise. For debugging purposes, this function also returns the metadata median, mad, score and hist.
median
sample median of the training set
mad
sample mean absolute deviation calculated from the training set
score
defined as (test - median) / mad
hist
histogram of training points overlaid with the stats function used to fit
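The score definition can be reproduced by hand as a rough check. This sketch assumes the deviation is the mean absolute deviation around the sample median, as the description states; the package may instead use `stats::mad()`, which is the scaled median absolute deviation, so the value is illustrative only. The name `mad_value` is chosen here to avoid masking `stats::mad()`.

```r
set.seed(1)
training <- rnorm(1000, mean = 0, sd = 10)
test <- 4

# Sample median of the training set.
med <- median(training)

# Assumption: mean absolute deviation taken around the sample median.
mad_value <- mean(abs(training - med))

# Score as defined above: distance from the median in units of deviation.
score <- (test - med) / mad_value
score
```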
set.seed(1)
# Training set drawn from a t-distribution
training <- rt(1000, df = 10)
test <- 4
# Randomly flag about 0.5% of the training points for exclusion
exclude <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.005, 0.995))
r <- TestPointIsAnomaly(training, test, dist = "t", exclude = exclude, p = 0.1, df = 10)
attr(r, "hist")
# Training set drawn from a normal distribution
training <- rnorm(1000, mean = 0, sd = 10)
r <- TestPointIsAnomaly(training, test, dist = "normal", p = 0.1)
attr(r, "hist")