seqOpenEndCpDist: Open-end Nonparametric Sequential Change-Point Detection Test...

seqOpenEndCpDist    R Documentation

Open-end Nonparametric Sequential Change-Point Detection Test for (Possibly) Multivariate Time Series Sensitive to Changes in the Distribution Function

Description

Open-end nonparametric sequential test for change-point detection based on a retrospective CUSUM statistic constructed from differences of empirical distribution functions. The observations can be univariate or multivariate (low-dimensional), and may be serially dependent. Carrying out the test requires two steps. The first step consists of computing a detector function. The second step consists of comparing the detector function to a suitable constant threshold function. Each of these steps corresponds to one of the functions in the usage section below. The current implementation is preliminary and not optimized for real-time monitoring, although it can still be used for that purpose. Details can be found in the first reference.

Usage

detOpenEndCpDist(x.learn, x, pts = NULL, r = NULL, sigma = NULL, kappa = 1.5, ...)

monOpenEndCpDist(det, alpha = 0.05, plot = TRUE)

Arguments

x.learn

a numeric matrix representing the learning sample.

x

a numeric matrix representing the observations collected after the beginning of the monitoring.

pts

a numeric matrix whose rows represent the evaluation points; if not provided by the user, the points are chosen automatically from the learning sample using the parameter r.

r

integer greater than or equal to 2 representing the number of evaluation points per dimension to be chosen from the learning sample; used only if pts = NULL.

sigma

a numeric matrix representing the covariance matrix to be used; if NULL, estimated by sandwich::lrvar().

kappa

constant involved in the point selection procedure; used only in the multivariate case; should be larger than 1.

...

optional arguments passed to sandwich::lrvar().

det

an object of class det.OpenEndCpDist representing a detector function computed using detOpenEndCpDist().

alpha

the value of the desired significance level for the sequential test.

plot

logical indicating whether the monitoring should be plotted.

Details

The testing procedure is described in detail in the first reference.
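
The basic ingredient mentioned in the description, differences of empirical distribution functions at a few evaluation points, can be illustrated in base R. The sketch below is not the detector computed by detOpenEndCpDist(); in particular, taking the evaluation points as equally spaced empirical quantiles of the learning sample is an assumption made here for illustration, and the names x.mon and edf.diff are hypothetical.

```r
set.seed(1)
x.learn <- rnorm(200)            # learning sample
x.mon   <- rnorm(100, mean = 1)  # monitoring observations with a shifted mean

r <- 5                           # number of evaluation points
## Assumption: evaluation points are empirical quantiles of the learning sample
pts <- quantile(x.learn, probs = (1:r) / (r + 1), names = FALSE)

F.learn <- ecdf(x.learn)         # empirical d.f. of the learning sample
F.mon   <- ecdf(x.mon)           # empirical d.f. of the monitoring observations

## Differences of the two empirical d.f.s at the evaluation points
edf.diff <- F.learn(pts) - F.mon(pts)
print(round(edf.diff, 3))
```

Differences far from zero at some evaluation points suggest that the two samples do not share the same distribution function; the procedure in the first reference turns this idea into a sequential test with a controlled significance level.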

Value

Both functions return lists whose components have explicit names. The function monOpenEndCpDist() in particular returns a list whose components are

alarm

a logical indicating whether the detector function has exceeded the threshold function.

time.alarm

an integer corresponding to the time at which the detector function has exceeded the threshold function or NA.

times.max

a vector of times at which the successive detectors have reached their maximum; this sequence of times can be used to estimate the time of change from the time of alarm.

time.change

an integer giving the estimated time of change if alarm is TRUE; it is simply the value in times.max that corresponds to time.alarm.

statistic

the value of statistic in the call of the function.

eta

the value of eta in the call of the function.

p

number of evaluation points of the empirical distribution functions.

pts

evaluation points of the empirical distribution functions.

alpha

the value of alpha in the call of the function.

sigma

the value of sigma in the call of the function.

detector

the successive values of the detector.

threshold

the value of the constant threshold for the detector.
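
The relationship between detector, threshold, alarm, time.alarm, times.max and time.change described above can be sketched with toy values in base R. All numbers below are hypothetical and are not output of the package; the sketch also assumes that times are counted as positions in the monitoring sequence, which may differ from the convention used by monOpenEndCpDist().

```r
## Hypothetical detector values, one per monitoring step
detector  <- c(0.4, 0.9, 1.3, 1.1, 2.6, 3.0)
## Hypothetical times at which each successive detector reached its maximum
times.max <- c(1, 1, 2, 2, 3, 3)
## Hypothetical constant threshold
threshold <- 2.5

exceed <- detector > threshold
alarm  <- any(exceed)                                   # has the threshold been exceeded?
time.alarm  <- if (alarm) which(exceed)[1] else NA_integer_
time.change <- if (alarm) times.max[time.alarm] else NA_integer_

print(list(alarm = alarm, time.alarm = time.alarm, time.change = time.change))
```

With these toy values the first exceedance occurs at position 5, and the estimated time of change is the entry of times.max at that position.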

References

M. Holmes, I. Kojadinovic and A. Verhoijsen (2022), Multi-purpose open-end monitoring procedures for multivariate observations based on the empirical distribution function, 45 pages, https://arxiv.org/abs/2201.10311.

M. Holmes and I. Kojadinovic (2021), Open-end nonparametric sequential change-point detection based on the retrospective CUSUM statistic, Electronic Journal of Statistics 15:1, pages 2288-2335, doi:10.1214/21-EJS1840.

See Also

See detOpenEndCpMean() for the corresponding test sensitive to changes in the mean, selectPoints() for the underlying point selection procedure used in the multivariate case and lrvar() for information on the estimation of the underlying long-run covariance matrix.

Examples

## Not run: 
## Example of open-end monitoring
m <- 800 # size of the learning sample
nm <- 5000 # number of collected observations after the start
n <- nm + m # total number of observations

set.seed(456)

## Univariate, no change in distribution
r <- 5 # number of evaluation points
x <- rnorm(n)
## Step 1: Compute the detector
det <- detOpenEndCpDist(x.learn = matrix(x[1:m]),
                        x = matrix(x[(m + 1):n]), r = r)
## Step 2: Monitoring
mon <- monOpenEndCpDist(det = det, alpha = 0.05, plot = TRUE)

## Univariate, change in distribution
k <- 2000 # m + k + 1 is the time of change
x[(m + k + 1):n] <- rt(nm - k, df = 3)
det <- detOpenEndCpDist(x.learn = matrix(x[1:m]),
                        x = matrix(x[(m + 1):n]), r = r)
mon <- monOpenEndCpDist(det = det, alpha = 0.05, plot = TRUE)

## Bivariate, no change
d <- 2
r <- 4 # number of evaluation points per dimension
x <- matrix(rnorm(n * d), nrow = n, ncol = d)
det <- detOpenEndCpDist(x.learn = x[1:m, ], x = x[(m + 1):n, ], r = r)
mon <- monOpenEndCpDist(det = det, alpha = 0.05, plot = TRUE)

## Bivariate, change in the mean of the first margin
x[(m + k + 1):n, 1] <- x[(m + k + 1):n, 1] + 0.3
det <- detOpenEndCpDist(x.learn = x[1:m, ], x = x[(m + 1):n, ], r = r)
mon <- monOpenEndCpDist(det = det, alpha = 0.05, plot = TRUE)

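## Bivariate, change in the dependence structure
## x2 is built so that cor(x1, x2) is 0.2 before the change and 0.7 after,
## while both margins remain standard normal
x1 <- rnorm(n)
x2 <- c(rnorm(m + k, 0.2 * x1[1:(m + k)], sqrt(1 - 0.2^2)),
        rnorm(nm - k, 0.7 * x1[(m + k + 1):n], sqrt(1 - 0.7^2)))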
x <- cbind(x1, x2)
det <- detOpenEndCpDist(x.learn = x[1:m, ], x = x[(m + 1):n, ], r = r)
mon <- monOpenEndCpDist(det = det, alpha = 0.05, plot = TRUE)

## End(Not run)
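
As a complement to the examples above, the following sketch shows one way to read the documented components alarm, time.alarm and time.change of the list returned by monOpenEndCpDist(). It assumes that the npcp package is installed; the reduced sample sizes and the mean shift are illustrative choices, not recommendations.

```r
## Runs only if the npcp package is available
if (requireNamespace("npcp", quietly = TRUE)) {
    set.seed(123)
    m  <- 200  # size of the learning sample (illustrative)
    nm <- 500  # number of monitoring observations (illustrative)
    ## Mean shift of 1 halfway through the monitoring period
    x <- c(rnorm(m + 250), rnorm(nm - 250, mean = 1))

    ## Step 1: compute the detector
    det <- npcp::detOpenEndCpDist(x.learn = matrix(x[1:m]),
                                  x = matrix(x[(m + 1):(m + nm)]), r = 5)
    ## Step 2: compare it to the threshold, without plotting
    mon <- npcp::monOpenEndCpDist(det = det, alpha = 0.05, plot = FALSE)

    if (mon$alarm) {
        cat("Alarm at time", mon$time.alarm,
            "; estimated time of change:", mon$time.change, "\n")
    } else {
        cat("No alarm raised during the monitoring period\n")
    }
}
```

Setting plot = FALSE keeps the call free of graphical side effects, which can be convenient when the monitoring step is run inside a script.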

npcp documentation built on Oct. 18, 2024, 9:06 a.m.
