threshold_finder: Estimating the threshold for tpca changepoint detection


Description

Usage

threshold_finder(x, mon_type, n, alpha, axes = NULL, p0 = NULL,
  w = 200, rel_tol = c(0.2, 0.1, 0.05, 0.025), thresh_alpha = 0.05,
  init_thresh = NULL, learning_coef = NULL, file_id = NULL)

Arguments

x

The d x m training data matrix, where d is the dimension of the data and m the number of training samples

mon_type

Character string signifying which type of monitoring statistic to find the threshold for. Available options: "tpca" and "mixture".

n

The length of the segment to monitor for false alarms. See details.

alpha

Probability of a type I error (false alarm) within a monitored segment of length n. See details.

axes

Indices of the principal axes to be used in simulations. MUST be supplied if mon_type == 'tpca'.

p0

The mixture probability of a dimension being affected by a change. MUST be supplied if mon_type == 'mixture'.

w

The window size (an integer).

rel_tol

A vector with the sequence of relative error tolerances allowed at each step towards convergence in the algorithm. See details.

thresh_alpha

1 - thresh_alpha is the confidence level. See details.

init_thresh

An optional custom initial value for the threshold.

learning_coef

The learning rate is defined as rel_tol / learning_coef. The default learning_coef is 3, which was found to work well in extensive experiments.

file_id

A string that identifies the files associated with the particular threshold beyond the data dimension d, axes/p0, training size m or probability of false alarm alpha.
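
As a rough sketch of how the arguments fit together, the snippet below builds a toy training matrix in the d x m layout described above and derives the learning rates implied by the default rel_tol and the documented learning_coef of 3. The simulated data and the values of n, alpha and axes are illustrative assumptions, not recommendations. The call to threshold_finder is left commented out because it requires the package to be installed and may run for a long time.

```r
# Toy training data: d = 5 dimensions (rows), m = 200 samples (columns).
set.seed(1)
d <- 5
m <- 200
x <- matrix(rnorm(d * m), nrow = d, ncol = m)

# Default tolerance schedule from the Usage section.
rel_tol <- c(0.2, 0.1, 0.05, 0.025)
learning_coef <- 3  # documented default

# Learning rate at each step, as defined under learning_coef.
learning_rate <- rel_tol / learning_coef
print(round(learning_rate, 4))

# Hypothetical call (n, alpha and axes are illustrative values only):
# threshold_finder(x, mon_type = "tpca", n = 100, alpha = 0.01,
#                  axes = c(4, 5))
```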

Details

n and alpha govern the false alarm rate through the relation P(T < n | H_0) <= alpha, where T is the time of the first alarm. The corresponding average run length (arl) is approximately n / alpha.
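
To make the relation between n, alpha and the average run length concrete, here is a small worked example. The values of n and alpha are arbitrary, and the check assumes that the run length T is geometrically distributed with mean arl under H_0. That assumption is made here only for illustration and is not stated in the package documentation.

```r
n <- 100       # length of the monitored segment (illustrative)
alpha <- 0.01  # allowed false alarm probability within the segment

# Approximate average run length implied by n and alpha.
arl <- n / alpha
print(arl)

# Sanity check under the geometric assumption: if T has mean arl,
# the per-step alarm probability is 1 / arl and
# P(T < n) = 1 - (1 - 1 / arl)^(n - 1).
p_false_alarm <- 1 - (1 - 1 / arl)^(n - 1)
print(p_false_alarm)
print(p_false_alarm <= alpha)
```

Under this assumption the false alarm probability comes out slightly below alpha, which is why n / alpha is described as an approximation rather than an exact identity.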

rel_tol and thresh_alpha govern the number of simulations used in each step of the algorithm, so that the estimate becomes more certain from step to step. At each step, the number of simulations is chosen so that [(1 - rel_tol)arl, (1 + rel_tol)arl] approximately covers the true average run length at confidence level 1 - thresh_alpha. For example, if the last rel_tol is 0.025, the final estimated threshold corresponds to an average run length of approximately arl +- 0.025 * arl at confidence level 1 - thresh_alpha. For the quickest convergence, the algorithm should start with a large relative error tolerance and then narrow it down.
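
The link between the tolerance, the confidence level and the number of simulations can be sketched with a normal approximation. This is not necessarily the rule the package uses. It assumes that simulated run lengths have a standard deviation roughly equal to their mean (as for a geometric or exponential distribution), so that the relative standard error of the mean of N simulations is about 1 / sqrt(N). The helper name n_sim_needed is hypothetical.

```r
# Hypothetical helper: simulations needed so that the sample mean run
# length lies within rel_tol * arl of the true arl with confidence
# 1 - thresh_alpha, assuming sd(run length) is about equal to arl.
n_sim_needed <- function(rel_tol, thresh_alpha = 0.05) {
  z <- qnorm(1 - thresh_alpha / 2)  # two-sided normal quantile
  ceiling((z / rel_tol)^2)
}

rel_tol <- c(0.2, 0.1, 0.05, 0.025)  # default schedule from Usage
n_sim <- n_sim_needed(rel_tol, thresh_alpha = 0.05)
print(data.frame(rel_tol = rel_tol, n_sim = n_sim))
```

Under these assumptions, halving rel_tol roughly quadruples the number of simulations, which illustrates why starting with a loose tolerance and tightening it gradually is cheaper than using the tightest tolerance from the start.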

Value

A list with the following components:

threshold

The final estimate of the threshold.

tpca_threshold also creates a .txt file in which each line summarizes one step of the estimation procedure. The last line corresponds to the final estimate.


Tveten/tpcaMonitoring documentation built on June 4, 2019, 4:04 p.m.