TestPointIsAnomaly_TDist: Check Whether Test Point is Anomaly


View source: R/TestPointIsAnomaly_TDist.R

Description

Assuming the training set was generated by a process that follows a t-distribution with degF degrees of freedom, this function checks whether to reject the null hypothesis that the test point was generated by the same process. If the probability of obtaining a result equal to or more extreme than test is lower than p, the function returns TRUE, meaning the null hypothesis is rejected and the test point is likely to be an anomaly. If the argument exclude is specified, the elements at the designated positions are removed from the training set.
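The check described above can be sketched roughly as follows. This is an illustrative re-implementation, not the package's source: the name sketch_is_anomaly, the two-sided tail probability, the convention that TRUE entries in exclude mark positions to drop, and the list return shape are all assumptions.

```r
# Hypothetical sketch of the test; the real function may differ in detail.
sketch_is_anomaly <- function(training, test, exclude = NULL,
                              p = 0.01, degF = 10) {
  # Assumption: TRUE entries in `exclude` mark positions to drop.
  if (!is.null(exclude)) {
    stopifnot(length(exclude) == length(training))
    training <- training[!exclude]
  }
  stdev  <- sd(training)                       # sample standard deviation
  tscore <- (test - mean(training)) / stdev    # standardized distance of test
  # Assumption: two-sided tail probability under t with degF degrees of freedom.
  pvalue <- 2 * pt(-abs(tscore), df = degF)
  list(is_anomaly = pvalue < p, stdev = stdev, tscore = tscore)
}

# Deterministic training data centered at 0 with a small spread.
training <- seq(-1, 1, length.out = 101)
near <- sketch_is_anomaly(training, test = 0.1)  # close to the training mean
far  <- sketch_is_anomaly(training, test = 5)    # many standard deviations away
print(c(near = near$is_anomaly, far = far$is_anomaly))
```

Under these assumptions, a point close to the training mean is not flagged, while a point many standard deviations away is.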

Usage

TestPointIsAnomaly_TDist(training, test, exclude = NULL, p = 0.01,
  degF = 10)

Arguments

training

A numeric vector containing the samples used to fit the t-distribution

test

A numeric value to be tested

exclude

A logical vector with length equal to length(training). It is used to exclude elements at designated positions when fitting the t-distribution. By default, exclude = NULL, which means no element is excluded.

p

p-value threshold, a number in [0, 1].

degF

Degrees of freedom (> 0, may be non-integer)
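To see how p and degF interact, the threshold can be translated into a critical t-score. The snippet below is only a sketch and assumes a two-sided test, which this page does not confirm; the variable names are illustrative.

```r
# Critical |t-score| implied by the defaults, assuming a two-sided test.
p    <- 0.01
degF <- 10
critical <- qt(1 - p / 2, df = degF)
print(round(critical, 3))  # about 3.169

# A looser threshold (p = 0.1) lowers the bar for flagging a point.
critical_loose <- qt(1 - 0.1 / 2, df = degF)
print(round(critical_loose, 3))  # about 1.812
```

Under that assumption, a test point would be flagged when its absolute t-score exceeds the critical value.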

Value

Returns TRUE if the test point is likely to be an anomaly and FALSE otherwise. For debugging purposes, this function also returns the metadata stdev and tscore, which are the sample standard deviation calculated from the training set and the t-score of the test point, respectively.

Examples

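set.seed(1)
# 1000 training samples drawn uniformly from [0, 1]
training <- runif(1000)
test <- 0.95
# Logical mask in which about 0.5% of the entries are TRUE
exclude <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.005, 0.995))
# Default threshold, p = 0.01
TestPointIsAnomaly_TDist(training, test, exclude)
# Looser threshold, p = 0.1
TestPointIsAnomaly_TDist(training, test, exclude, 0.1)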

jingjin1018/anetimeseries documentation built on May 19, 2019, 10:35 a.m.