changepoint: Detection of a sustained change-point in univariate and...

View source: R/glr.R

changepoint    R Documentation

Detection of a sustained change-point in univariate and multivariate data

Description

changepoint (univariate data) and mchangepoint (multivariate data) test for the presence of a sustained location and/or dispersion shift. Both functions can be applied to individual and subgrouped observations.

changepoint.normal.limits and mchangepoint.normal.limits precompute the corresponding control limits when the in-control distribution is normal.

Usage

changepoint(x, subset, score = c("Identity", "Ranks"), only.mean = FALSE,
  plot = TRUE, FAP = 0.05, seed = 11642257, L = 10000, limits = NA)

mchangepoint(x, subset, score = c("Identity", "Signed Ranks", "Spatial Signs",
  "Spatial Ranks", "Marginal Ranks"), only.mean = FALSE,
  plot = TRUE, FAP = 0.05, seed = 11642257, L = 10000, limits = NA) 

changepoint.normal.limits(n, m, score = c("Identity", "Ranks"),
  only.mean = FALSE, FAP = 0.05, seed = 11642257, L = 100000)

mchangepoint.normal.limits(p, n, m, score = c("Identity", "Signed Ranks", "Spatial Signs",
  "Spatial Ranks", "Marginal Ranks"), only.mean = FALSE,
  FAP = 0.05, seed = 11642257, L = 100000)

Arguments

x

changepoint: an n x m numeric matrix or a numeric vector of length m.

mchangepoint: a p x n x m numeric array or a p x m numeric matrix.

See below for the meaning of p, n and m.

p

integer: number of monitored variables.

n

integer: size of each subgroup (number of observations gathered at each time point).

m

integer: number of subgroups (time points).

subset

an optional vector specifying a subset of subgroups/time points to be used.

score

character: the transformation to use; see mshewhart.

only.mean

logical; if TRUE, only a location change-point is searched for.

plot

logical; if TRUE, the control statistic is displayed.

FAP

numeric (between 0 and 1): the desired false alarm probability.

seed

positive integer; if not NA, the RNG's state is reset using seed. The current .Random.seed will be preserved. Unused by changepoint and mchangepoint when limits is not NA.

L

positive integer: the number of Monte Carlo replications used to compute the control limits. Unused by changepoint and mchangepoint when limits is not NA.

limits

numeric: a precomputed vector of length m containing the control limits.
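The data layouts described for x can be illustrated with simulated values. This is a minimal base R sketch; the object names (x3, xu, xi, v) and the simulated data are assumptions made only for illustration.

```r
set.seed(1)
p <- 2    # number of monitored variables
n <- 5    # subgroup size
m <- 20   # number of subgroups (time points)

# p x n x m array: the subgrouped multivariate layout described for mchangepoint
x3 <- array(rnorm(p * n * m), dim = c(p, n, m))

# n x m matrix: one variable, subgrouped (layout described for changepoint)
xu <- x3[1, , ]

# p x m matrix: all variables, one observation per time point
xi <- x3[, 1, ]

# vector of length m: one variable, individual observations
v <- x3[1, 1, ]

dim(x3); dim(xu); dim(xi); length(v)
```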

Details

After an optional rank transformation (argument score), changepoint and mchangepoint compute, for τ = 2, ..., m, the normal likelihood ratio test statistics for verifying whether the mean and dispersion (or only the mean when only.mean=TRUE) are the same before and after τ. See Sullivan and Woodall (1996, 2000) and Qiu (2013), Chapter 6 and Section 7.5.
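To make the idea concrete, the following is a minimal sketch of a normal likelihood ratio statistic for a mean shift only, for univariate individual observations with a common unknown variance. It is not the package's implementation: the function name lr_mean_shift and the toy data are hypothetical, and the statistic is written as m * log(SS0 / SS1), where SS0 is the sum of squares around the overall mean and SS1 is the pooled sum of squares around the two segment means.

```r
# Hypothetical helper, for illustration only (not part of dfphase1).
# Returns the likelihood ratio statistic for each candidate tau = 2, ..., m,
# where observations 1, ..., tau - 1 precede the change and tau, ..., m follow it.
lr_mean_shift <- function(x) {
  m <- length(x)
  ss0 <- sum((x - mean(x))^2)              # no-change model
  taus <- 2:m
  stat <- sapply(taus, function(tau) {
    before <- x[1:(tau - 1)]
    after <- x[tau:m]
    ss1 <- sum((before - mean(before))^2) +
      sum((after - mean(after))^2)         # change-at-tau model
    m * log(ss0 / ss1)
  })
  names(stat) <- taus
  stat
}

# Toy series with a sustained mean shift starting at time 11
x <- c(0.1 * sin(1:10), 5 + 0.1 * sin(11:20))
stat <- lr_mean_shift(x)
tau_hat <- as.integer(names(stat)[which.max(stat)])
tau_hat
```

The candidate with the largest statistic is the natural estimate of the change-point location; whether it signals depends on the control limits.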

Note that the control statistic is equivalent to that proposed by Lung-Yut-Fong et al. (2011) when score="Marginal Ranks" and only.mean=TRUE.

As suggested by Sullivan and Woodall (1996, 2000), control limits proportional to the in-control mean of the likelihood ratio test statistics are used. Further, when plot=TRUE, the control statistics divided by the time-varying control limits are plotted with a “pseudo-limit” equal to one.

When only.mean=FALSE, the decomposition of the likelihood ratio test statistic suggested by Sullivan and Woodall (1996, 2000) for diagnostic purposes is also computed, and optionally plotted.

Value

changepoint and mchangepoint return an invisible list with elements

glr

control statistics.

mean, dispersion

decomposition of the control statistics into two parts, due to changes in the mean and in the dispersion, respectively. These elements are present only when only.mean=FALSE.

limits

control limits.

score, only.mean, FAP, L, seed

input arguments.

changepoint.normal.limits and mchangepoint.normal.limits return a numeric vector containing the control limits.
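As an illustration of how the returned objects might be used together, the sketch below precomputes normal-based limits and passes them to changepoint through the limits argument. It assumes the dfphase1 package is installed and that gravel[1, , ] is an n x m matrix, as in the Examples; the object names x, lim and res are hypothetical, and L is reduced from its default only to shorten the run.

```r
library(dfphase1)
data(gravel)

x <- gravel[1, , ]   # first monitored variable, assumed n x m

# Precompute the limits once, using the documented signature
lim <- changepoint.normal.limits(n = nrow(x), m = ncol(x),
                                 score = "Ranks", L = 10000)

# Reuse them; the result is returned invisibly, so assign it
res <- changepoint(x, score = "Ranks", limits = lim, plot = FALSE)

names(res)      # elements of the returned list
str(res$glr)    # control statistics
```

Precomputing is mainly useful when the same n, m and score are analysed repeatedly, since the Monte Carlo step is then run only once.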

Note

  1. When limits is NA, changepoint and mchangepoint compute the control limits by permutation. The resulting control charts are distribution-free.

  2. Pre-computed limits, like those computed using changepoint.normal.limits and mchangepoint.normal.limits, are recommended only for univariate data when score="Ranks". In all other cases, the resulting control chart is not distribution-free.

  3. However, when score is "Signed Ranks", "Spatial Signs" or "Spatial Ranks", the normal-based control limits are distribution-free within the class of multivariate elliptical distributions.
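The permutation approach of Note 1, combined with limits proportional to the in-control mean of the statistic, can be sketched in base R as follows. This is an illustration under simplifying assumptions (univariate individual observations, mean shift only), not the algorithm used by the package; lr_mean_shift and perm_limits are hypothetical names.

```r
# Hypothetical helper: likelihood ratio statistic for a mean shift at tau = 2, ..., m
lr_mean_shift <- function(x) {
  m <- length(x)
  ss0 <- sum((x - mean(x))^2)
  sapply(2:m, function(tau) {
    a <- x[1:(tau - 1)]
    b <- x[tau:m]
    m * log(ss0 / (sum((a - mean(a))^2) + sum((b - mean(b))^2)))
  })
}

# Hypothetical helper: limits proportional to the permutation mean of the statistic
perm_limits <- function(x, FAP = 0.05, L = 1000) {
  # Each column holds the statistics computed on one random permutation of x
  sims <- replicate(L, lr_mean_shift(sample(x)))
  center <- rowMeans(sims)                 # estimated in-control mean per tau
  # Constant k such that max(stat / center) exceeds k with probability about FAP
  k <- quantile(apply(sims / center, 2, max), 1 - FAP, names = FALSE)
  k * center
}

set.seed(11642257)
x <- c(0.1 * sin(1:10), 5 + 0.1 * sin(11:20))   # toy data, shift at time 11
stat <- lr_mean_shift(x)
limits <- perm_limits(x)

ratio <- stat / limits          # compared with a pseudo-limit equal to one
signal <- max(ratio) > 1
tau_hat <- which.max(ratio) + 1 # index 1 corresponds to tau = 2
```

Because permuting the observations destroys any sustained shift while preserving the empirical distribution, limits obtained this way do not rely on a normality assumption.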

Author(s)

Giovanna Capizzi and Guido Masarotto.

References

A. Lung-Yut-Fong, C. Lévy-Leduc, O. Cappé (2011) “Homogeneity and change-point detection tests for multivariate data using rank statistics”. arXiv:1107.1971, https://arxiv.org/abs/1107.1971.

P. Qiu (2013) Introduction to Statistical Process Control. Chapman & Hall/CRC Press.

J. H. Sullivan, W. H. Woodall (1996) “A control chart for preliminary analysis of individual observations”. Journal of Quality Technology, 28, pp. 265–278, doi:10.1080/00224065.1996.11979677.

J. H. Sullivan, W. H. Woodall (2000) “Change-point detection of mean vector or covariance matrix shifts using multivariate individual observations”. IIE Transactions, 32, pp. 537–549, doi:10.1080/07408170008963929.

Examples

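library(dfphase1)
data(gravel)
# Univariate analysis of the first monitored variable
changepoint(gravel[1,,])
# Multivariate analysis with the default score
mchangepoint(gravel)
# Multivariate analysis based on signed ranks
mchangepoint(gravel,score="Signed Ranks")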

dfphase1 documentation built on July 9, 2023, 7:29 p.m.