#' @title Check Whether Test Point is Anomaly
#'
#' @description Assuming the \code{training} set was generated by a process
#' that follows a t-distribution with \code{degF} degrees of freedom, this
#' function tests the null hypothesis that the \code{test} point was
#' generated by the same process. The test is one-sided (upper tail): if the
#' probability of obtaining a value equal to or greater than \code{test} is
#' lower than \code{p}, the function returns \code{TRUE}, meaning the null
#' hypothesis is rejected and the \code{test} point is likely to be an
#' anomaly. If argument \code{exclude} is specified, elements at the
#' designated positions are removed from the training set.
#'
#' @param training A numeric vector containing the samples used to fit the
#' t-distribution
#' @param test A numeric value to be tested
#' @param exclude A logical vector whose length equals
#' \code{length(training)}. Elements of \code{training} at positions where
#' \code{exclude} is \code{TRUE} are removed before fitting the
#' t-distribution. By default, \code{exclude = NULL}, which means no element
#' is excluded.
#' @param p p-value threshold, in \emph{[0, 1]}. Defaults to \code{0.01}.
#' @param degF Degrees of freedom (> 0, may be non-integer). Defaults to
#' \code{10}.
#' @return \code{TRUE} if the test point is likely to be an anomaly and
#' \code{FALSE} otherwise. For debugging purposes, the returned value also
#' carries two attributes, \code{stdev} and \code{tscore}, which hold the
#' sample standard deviation of the \code{training} set and the t-score of
#' the \code{test} point, respectively.
#' @examples
#' set.seed(1)
#' training <- runif(1000)
#' test <- 0.95
#' exclude <- sample(c(TRUE, FALSE), 1000, replace = TRUE, prob = c(0.005, 0.995))
#' TestPointIsAnomaly_TDist(training, test, exclude)
#' TestPointIsAnomaly_TDist(training, test, exclude, 0.1)
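#' # A minimal sketch, added for illustration: read the debugging metadata
#' # attached to the result and repeat the decision on the p-value scale.
#' # It assumes the default degF = 10 and reuses the objects defined above.
#' res <- TestPointIsAnomaly_TDist(training, test, exclude, 0.1)
#' attr(res, "stdev")   # sample standard deviation of the retained training set
#' attr(res, "tscore")  # standardized distance of the test point from the mean
#' # Upper-tail probability of the t-score; comparing it with p is equivalent
#' # to comparing the t-score with the quantile qt(1 - p, degF).
#' stats::pt(attr(res, "tscore"), df = 10, lower.tail = FALSE) < 0.1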
#' @importFrom stats qt sd
#' @export
#'
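TestPointIsAnomaly_TDist <- function(training, test, exclude = NULL, p = 0.01, degF = 10) {
  # Drop the excluded positions before estimating the mean and spread
  if (!is.null(exclude)) {
    training <- training[!exclude]
  }
  stdev <- sd(training)
  tscore <- (test - mean(training)) / stdev
  # Upper-tail critical value of the t-distribution at level p
  tT <- qt(1 - p, degF)
  isAnomaly <- (tscore > tT)
  # Attach metadata for debugging
  attr(isAnomaly, "stdev") <- stdev
  attr(isAnomaly, "tscore") <- tscore
  isAnomaly
}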