MLE_window_lth: Scan Statistics Empirical Window Length

View source: R/MLE_window_lth.R

Description

This function returns an optimized scan statistics window length obtained by empirical estimation.

Usage

MLE_window_lth(x, dist_null, ..., unit = 1)

Arguments

x

a numeric vector of data values.

dist_null

a character string giving the underlying distribution under the null hypothesis. Distribution options are listed in Details.

...

Further arguments for distribution parameters.

unit

a number giving the bin width used when counting the excess.

Details

Before applying scan statistics, the window length needs to be set; it is an important factor in the performance of the hypothesis test. In practice, the window length should be close to the cluster size. A window that is too short leads to more false positives, while a window that is too long lowers the power of the test.

This function efficiently selects an appropriate window length. The data are split into groups of width unit, and in each group the excess is defined as the number of observations minus the expected number of observations. The maximum excess among those groups is the returned value.
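The binning procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name max_bin_excess and its cdf_null argument are hypothetical, the expected count in each bin is assumed to be the sample size times the null probability of that bin, and the bins are assumed to start at a multiple of unit.

```r
# Hypothetical helper (not part of AnomDetct): maximum excess over bins of width `unit`.
# `cdf_null` is assumed to be a CDF such as pnorm or punif; `...` holds its parameters.
max_bin_excess <- function(x, cdf_null, ..., unit = 1) {
  # Bin edges: start at a multiple of `unit` at or below min(x), end above max(x).
  lo <- floor(min(x) / unit) * unit
  n_bins <- floor((max(x) - lo) / unit) + 1
  breaks <- lo + unit * (0:n_bins)

  # Observed count per bin; indices are clamped to guard against rounding at the edges.
  idx <- pmax(1, pmin(findInterval(x, breaks), n_bins))
  observed <- tabulate(idx, nbins = n_bins)

  # Expected count per bin under the assumed null distribution.
  expected <- length(x) * diff(cdf_null(breaks, ...))

  # Excess = observed minus expected; return the largest one.
  max(observed - expected)
}

# Four points, null distribution Uniform(0, 2), two bins of width 1:
# observed counts are 3 and 1, expected counts are 2 and 2.
max_bin_excess(c(0.1, 0.2, 0.3, 1.5), punif, min = 0, max = 2, unit = 1)
```

With a small unit, each bin's expected count shrinks, so the result becomes more sensitive to tight clusters such as repeated values.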

The dist_null argument indicates the class of the underlying distribution. The options follow R's usual distribution abbreviations: for example, norm is the normal distribution, unif is the uniform distribution, and gpd is the generalized Pareto distribution. See Distributions for more distribution options.
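As an illustration of the abbreviation convention, base R names the distribution function of a family by prefixing its abbreviation with p (pnorm, punif). The sketch below shows one way such a string could be resolved to a function; null_cdf is a hypothetical name, and whether MLE_window_lth resolves dist_null this way is an assumption.

```r
# Hypothetical lookup (not part of AnomDetct): resolve an abbreviation such as
# "norm" or "unif" to its CDF using R's d/p/q/r naming convention.
null_cdf <- function(dist_null) {
  match.fun(paste0("p", dist_null))
}

cdf <- null_cdf("norm")
cdf(0, mean = 0, sd = 1)  # standard normal CDF evaluated at 0
```

Parameters passed through ... (for example mean and sd) would then be forwarded to the resolved function.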

Value

The empirical scan statistics window length is returned.

Note

To use gpd, the POT package (https://cran.r-project.org/package=POT) must be installed first.

Author(s)

Zhicong Zhao

Examples

x <- c(rnorm(1000), rep(1, 20)) ## sampled data
wd_lth <- MLE_window_lth(x, dist_null = "norm", mean = 0, sd = 1, unit = 0.01)

zhicongz/AnomDetct documentation built on Dec. 12, 2019, 9:16 a.m.