maxEnergyCPMv: Nonparametric Multivariate Control Chart based on Energy Test

Description Usage Arguments Details Value Author(s) References Examples

View source: R/maxEnergyCPMv.r

Description

This R function centers on nonparametric Phase II multiple change points detection for high dimensional time series. Three highlights are included in the function. Firstly, the new model is nonparametric which does not require any distributional pre-knowledge about the process. The test is based on the maximum energy statistic (see [1] Gabor J. Szekely and Maria L. Rizzo 2004, Testing for Equal Distributions in High Dimension) and permutation samples. Secondly, the model is a Phase II change point model (see [2]) which is used for online detection of stream data not for batch data. Phase II set-up has practical meaning in time series change detection. Thirdly, it is concentrated on high dimensional data, i.e. multivariate context. An important remark is that the data used in this function must be independent, i.e. every row in the N*d matrix must be an independent observation. If your data set contains not-independent observations then you need to handle the data using some filter functions, e.g. multivariate time series model to filter out the residuals which are theoretically independent.

Usage

1
maxEnergyCPMv(data1, wNr, permNr, alpha)

Arguments

data1

an N*d matrix, N is the number of observations and d the dimensions.

wNr

a scalar of warm-up.

permNr

a scalar of times of permutation.

alpha

a scalar of significant level

Details

The function returns ONLY ONE vector containing even number components, where the first half stands for detection time vector and the rest half stands for the vector of change time locations.

Value

result

a vector of locations of detection time in the first half, locations of change time in the second half.

Author(s)

Yafei Xu <yafei.xu@hotmail.de>

References

[1] Szekely, G. J. & Rizzo, M. L. (2004). Testing for equal distributions in high dimension, InterStat.

[2] Hawkins, D. M., Qiu, P. & Kang, C. W. (2003). The changepoint model for statistical process control, Journal of Quality Technology.

[3] Xu, Y. F. (2017). Reference manual: An R package "EnergyOnlineCPM".

see URL: https://sites.google.com/site/EnergyOnlineCPM/

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# simulate 300 length time series
simNr=300

# simulate 300 length 5 dimensonal standard Gaussian series
Sigma2 <- matrix(c(1,0,0,0,0,  0,1,0,0,0,   0,0,1,0,0,  0,0,0,1,0,  0,0,0,0,1),5,5)
Mean2=rep(1,5)
sim2=(mvrnorm(n = simNr, Mean2, Sigma2)) 

# simulate 300 length 5 dimensonal standard Gaussian series
Sigma3 <- matrix(c(1,0,0,0,0,  0,1,0,0,0,   0,0,1,0,0,  0,0,0,1,0,  0,0,0,0,1),5,5)
Mean3=rep(0,5)
sim3=(mvrnorm(n = simNr, Mean3, Sigma3))

# construct a data set of length equal to 35.
# first 20 points are from standard Gaussian. 
# second 15 points from a Gaussian with a mean shift with 555.
data1=sim6=rbind(sim2[1:20,],(sim3+555)[1:15,])

# set warm-up number as 20, permutation 200 times, significant level 0.005
wNr=20
permNr=200
alpha=1/200
maxEnergyCPMv(data1,wNr,permNr,alpha)

EnergyOnlineCPM documentation built on May 1, 2019, 8:45 p.m.