cplm_th: Multiple change-point detection in a continuous,... In IDetect: Isolate-Detect Methodology for Multiple Change-Point Detection

Description

This function performs the Isolate-Detect methodology with the thresholding-based stopping rule in order to detect multiple change-points in a continuous, piecewise-linear noisy data sequence with Gaussian noise. See Details for a brief explanation of the Isolate-Detect methodology (with the relevant reference) and of the thresholding-based stopping rule.

Usage

cplm_th(x, sigma = stats::mad(diff(diff(x)))/sqrt(6), thr_const = 1.4,
  thr_fin = sigma * thr_const * sqrt(2 * log(length(x))), s = 1,
  e = length(x), points = 3, k_l = 1, k_r = 1)

Arguments

x  A numeric vector containing the data in which you would like to find change-points.

sigma  A positive real number. It is the estimate of the standard deviation of the noise in x. The default value is mad(diff(diff(x)))/sqrt(6), where mad(x) denotes the median absolute deviation of x computed under the assumption that the noise is independent and identically distributed from the Gaussian distribution.

thr_const  A positive real number with default value equal to 1.4. It is used to define the threshold; see thr_fin.

thr_fin  With T the length of the data sequence, this is a positive real number with default value equal to sigma * thr_const * sqrt(2 * log(T)). It is the threshold used in the detection process.

s, e  Positive integers with s less than e, which indicate that you want to check for change-points in the data sequence with subscripts in [s,e]. The default values are s equal to 1 and e equal to T, with T the length of the data sequence.

points  A positive integer with default value equal to 3. It defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively; see Details for more information.

k_l, k_r  Positive integers that get updated whenever the function calls itself during the detection process. They are not essential for the function to work, and we include them only to reduce the computational time.
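The defaults for sigma and thr_fin can be reproduced by hand. The sketch below uses base R only; the simulated series and the seed are illustrative assumptions and are not part of the package. The sqrt(6) factor arises because the second difference x[t+2] - 2 * x[t+1] + x[t] of independent noise with variance sigma^2 has variance (1 + 4 + 1) * sigma^2, while the second difference of a linear signal is zero away from change-points.

```r
# Illustrative data (an assumption): one linear trend plus N(0, 1) noise.
set.seed(1)
x <- seq(0, 499, 1) + rnorm(500)

# Default noise estimate: MAD of the second differences, rescaled by sqrt(6).
sigma_hat <- stats::mad(diff(diff(x))) / sqrt(6)

# Default threshold, with T = length(x) the length of the data sequence.
thr_const <- 1.4
thr_fin <- sigma_hat * thr_const * sqrt(2 * log(length(x)))

sigma_hat  # expected to be near the true noise level of 1
thr_fin
```

Passing your own sigma or thr_fin to cplm_th replaces these defaults, which may be useful when the noise level is known in advance.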

Details

The change-point detection algorithm used in cplm_th is the Isolate-Detect (ID) methodology described in “Detecting multiple generalized change-points by isolating single ones”, Anastasiou and Fryzlewicz (2018), preprint. The concept is simple and is split into two stages: first, the isolation of each true change-point within a subinterval of the data domain, and second, its detection.

ID first creates two ordered sets of K = ⌈T/points⌉ right- and left-expanding intervals as follows. The j-th right-expanding interval is R_j = [1, j × points], while the j-th left-expanding interval is L_j = [T − j × points + 1, T]. We collect these intervals in the ordered set S_RL = {R_1, L_1, R_2, L_2, ..., R_K, L_K}. For a suitably chosen contrast function, ID identifies the point with the maximum contrast value in R_1. If this value exceeds the threshold thr_fin, the point is taken as a change-point. If not, the process moves to the next interval in S_RL and repeats the same test. Upon detection, the algorithm makes a new start from the estimated location.
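To make the interval construction concrete, the following base-R sketch lists the initial ordered set S_RL for a sequence of length n (written n rather than T to avoid masking R's shorthand for TRUE). The helper expanding_intervals is hypothetical and is not a function of the IDetect package. Capping the last right end-point at n and flooring the last left start-point at 1 are assumptions made here so that the intervals stay inside the data domain when n is not a multiple of points. The sketch covers only the intervals on [1, n], not the contrast function or the restart after a detection.

```r
# Hypothetical helper (not part of IDetect): list the expanding intervals
# R_1, L_1, R_2, L_2, ..., R_K, L_K for a sequence of length n.
expanding_intervals <- function(n, points = 3) {
  K <- ceiling(n / points)
  j <- seq_len(K)
  # Right-expanding R_j = [1, j * points]; end capped at n (assumption).
  right <- cbind(start = 1, end = pmin(j * points, n))
  # Left-expanding L_j = [n - j * points + 1, n]; start floored at 1 (assumption).
  left <- cbind(start = pmax(n - j * points + 1, 1), end = n)
  # Interleave the two sets: sort by j first, then right before left.
  idx <- order(rep(j, 2), rep(1:2, each = K))
  out <- rbind(right, left)[idx, , drop = FALSE]
  data.frame(type = rep(c("R", "L"), times = K), out)
}

intervals <- expanding_intervals(n = 10, points = 3)
intervals
```

With n = 10 and points = 3 this gives K = 4, starting with R_1 = [1, 3] and L_1 = [8, 10]. Smaller values of points give more intervals to test, and larger values give fewer.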

Value

A numeric vector with the detected change-points.

Author(s)

Andreas Anastasiou, a.anastasiou@lse.ac.uk

See Also

win_cplm_th, ID_cplm, and ID, which employ this function. In addition, see pcm_th for the case of detecting changes in a piecewise-constant signal via thresholding.

Examples

# A single change-point in the slope
single.cpt <- c(seq(0, 999, 1), seq(998.5, 499, -0.5))
single.cpt.noise <- single.cpt + rnorm(2000)
cpt.single.th <- cplm_th(single.cpt.noise)

# Three change-points
three.cpt <- c(seq(0, 499, 1), seq(498.5, 249, -0.5), seq(251, 1249, 2), seq(1248, 749, -1))
three.cpt.noise <- three.cpt + rnorm(2000)
cpt.three.th <- cplm_th(three.cpt.noise)

# Many change-points
multi.cpt <- rep(c(seq(0, 49, 1), seq(48, 0, -1)), 20)
multi.cpt.noise <- multi.cpt + rnorm(1980)
cpt.multi.th <- cplm_th(multi.cpt.noise)