ID: Multiple change-point detection in the mean or the slope of a...

View source: R/Finalised_coding.R

ID R Documentation

Multiple change-point detection in the mean or the slope of a vector using the Isolate-Detect methodology

Description

This is the main, general function of the package. It employs more specialised functions in order to estimate the number and locations of multiple change-points in either the piecewise-constant or the piecewise-linear mean of a noisy input vector xd. The noise may be Gaussian or heavy-tailed (see the ht argument). In addition to the estimated change-points, ID returns the estimated signal, as well as the solution path. For more information and the relevant literature reference, see Details.

Usage

ID(
  xd,
  th.cons = 1,
  th.cons_lin = 1.4,
  th.ic = 0.9,
  th.ic.lin = 1.25,
  lam = 3,
  lam.ic = 10,
  contrast = c("mean", "slope"),
  ht = FALSE,
  scale = 3
)

Arguments

xd

A numeric vector containing the data in which you would like to find change-points.

th.cons

A positive real number with default value equal to 1. It is used to define the threshold (if the thresholding approach is to be followed) in the scenario of piecewise-constant mean signals. In this case, the change-points are estimated by thresholding with threshold equal to sigma * th.cons * sqrt(2 * log(l)), where l is the length of the data sequence xd and sigma is equal to mad(diff(xd)/sqrt(2)).

th.cons_lin

A positive real number with default value equal to 1.4. It is used to define the threshold (if the thresholding approach is to be followed) in the scenario of piecewise-linear mean signals. In this case, the change-points are estimated by thresholding with threshold equal to sigma * th.cons_lin * sqrt(2 * log(l)), where l is the length of the data sequence xd and sigma is equal to mad(diff(diff(xd)))/sqrt(6).

th.ic

A positive real number with default value equal to 0.9. It is useful only if the model selection based Isolate-Detect method is to be followed for the scenario of piecewise-constant mean signals. It is used to define the threshold value that will be used at the first step (change-point overestimation) of the model selection approach.

th.ic.lin

A positive real number with default value equal to 1.25. It is useful only if the model selection based Isolate-Detect method is to be followed for the scenario of piecewise-linear mean signals. It is used to define the threshold value that will be used at the first step (change-point overestimation) of the model selection approach.

lam

A positive integer with default value equal to 3. It is used only when the threshold based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

lam.ic

A positive integer with default value equal to 10. It is used only when the information criterion based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

contrast

A character string, which defines the type of the contrast function to be used in the Isolate-Detect algorithm. If contrast = "mean", then the algorithm looks for changes in the mean of a piecewise-constant signal. If contrast = "slope", then the algorithm looks for changes in the slope of a piecewise-linear and continuous signal.

ht

A logical variable with default value equal to FALSE. If FALSE, the noise is assumed to follow the Gaussian distribution. If TRUE, then the noise is assumed to follow a distribution that has tails heavier than those of the Gaussian distribution.

scale

A positive integer with default value equal to 3. It controls how the given data sequence is pre-averaged and is used only if ht = TRUE.
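The threshold formulas quoted in the th.cons and th.cons_lin entries can be reproduced by hand, which may help when choosing non-default constants. The following is a minimal sketch in base R; the helper id_threshold is a hypothetical name introduced here for illustration and is not part of the package.

```r
# Hypothetical helper (not part of IDetect): evaluates the threshold
# formulas given in the th.cons and th.cons_lin argument descriptions.
id_threshold <- function(xd, contrast = c("mean", "slope"),
                         th.cons = 1, th.cons_lin = 1.4) {
  contrast <- match.arg(contrast)
  l <- length(xd)
  if (contrast == "mean") {
    # Noise level from first differences, as documented for th.cons
    sigma <- stats::mad(diff(xd) / sqrt(2))
    const <- th.cons
  } else {
    # Noise level from second differences, as documented for th.cons_lin
    sigma <- stats::mad(diff(diff(xd))) / sqrt(6)
    const <- th.cons_lin
  }
  sigma * const * sqrt(2 * log(l))
}

set.seed(1)
x_flat <- c(rep(4, 300), rep(0, 300)) + rnorm(600)
thr_mean <- id_threshold(x_flat, contrast = "mean")
thr_slope <- id_threshold(x_flat, contrast = "slope")
```

The threshold scales linearly with the chosen constant, so doubling th.cons doubles the threshold for a given data sequence.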

Details

The data points provided in xd are assumed to follow

X_t = f_t + \sigma\epsilon_t; t = 1,2,...,T,

where T is the total length of the data sequence, X_t are the observed data, f_t is a one-dimensional, deterministic signal with abrupt structural changes at certain points, and \epsilon_t are independent and identically distributed random variables with mean zero and variance equal to one. In this function, the following scenarios for f_t and the noise are implemented.

  • Piecewise-constant signal with Gaussian noise.

    Use contrast = "mean" and ht = FALSE here.

  • Piecewise-constant signal with heavy-tailed noise.

    Use contrast = "mean" and ht = TRUE here.

  • Piecewise-linear and continuous signal with Gaussian noise.

    Use contrast = "slope" and ht = FALSE here.

  • Piecewise-linear and continuous signal with heavy-tailed noise.

    Use contrast = "slope" and ht = TRUE here.
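To make the model and the four scenarios above concrete, the sketch below simulates data from X_t = f_t + \sigma\epsilon_t in base R and then calls ID with the matching arguments. The helper simulate_signal and its arguments are hypothetical names introduced for illustration; the Student-t noise is rescaled to unit variance so that it agrees with the assumption on \epsilon_t, and the calls to ID run only if the IDetect package is installed.

```r
# Hypothetical helper (not part of IDetect): draws X_t = f_t + sigma * eps_t
simulate_signal <- function(f, sigma = 1, heavy_tailed = FALSE, df = 5) {
  n <- length(f)
  if (heavy_tailed) {
    # Student-t noise rescaled to unit variance (requires df > 2)
    eps <- rt(n, df = df) / sqrt(df / (df - 2))
  } else {
    eps <- rnorm(n)
  }
  f + sigma * eps
}

set.seed(42)
f_const <- c(rep(4, 500), rep(0, 500))        # one change in the mean
f_lin <- c(seq(0, 499, 1), seq(498, -1, -1))  # one change in the slope, continuous

x_const_gauss <- simulate_signal(f_const)
x_lin_heavy <- simulate_signal(f_lin, heavy_tailed = TRUE)

# Calls matching two of the scenarios listed above; skipped without IDetect.
if (requireNamespace("IDetect", quietly = TRUE)) {
  res_const <- IDetect::ID(x_const_gauss, contrast = "mean", ht = FALSE)
  res_lin <- IDetect::ID(x_lin_heavy, contrast = "slope", ht = TRUE)
}
```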

Value

A list with the following components:

cpt A vector with the detected change-points.

no_cpt The number of change-points detected.

fit A numeric vector with the estimated mean signal (piecewise-constant if contrast = "mean", piecewise-linear if contrast = "slope").

solution_path A vector containing the solution path.
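To illustrate how cpt and fit relate in the piecewise-constant case, the sketch below rebuilds a fitted signal from a set of change-point locations by averaging the data within each segment. This is not the package's own code: fit_from_cpt is a hypothetical helper, the toy data are made up, and the sketch assumes that each entry of cpt is the last index of a segment.

```r
# Hypothetical helper (not part of IDetect): rebuilds a piecewise-constant
# fit from change-point locations. ASSUMPTION: each entry of cpt is the last
# index of a segment, so the next segment starts at cpt + 1.
fit_from_cpt <- function(xd, cpt) {
  n <- length(xd)
  cpt <- sort(unique(cpt[cpt >= 1 & cpt < n]))
  starts <- c(1, cpt + 1)
  ends <- c(cpt, n)
  fit <- numeric(n)
  for (i in seq_along(starts)) {
    idx <- starts[i]:ends[i]
    fit[idx] <- mean(xd[idx])  # segment mean is the constant fit
  }
  fit
}

# Mock result using the documented component names (toy values, not ID output)
xd <- c(4.1, 3.9, 4.0, 0.2, -0.1, -0.1)
res <- list(cpt = 3, no_cpt = 1)
res$fit <- fit_from_cpt(xd, res$cpt)
res$fit
```

With no change-points, the helper returns the overall mean repeated along the whole sequence.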

Author(s)

Andreas Anastasiou, anastasiou.andreas@ucy.ac.cy

See Also

ID_pcm, ID_plm, ht_ID_pcm, and ht_ID_plm, which are the functions that are employed in ID, depending on which scenario is imposed by the input arguments.

Examples

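# Piecewise-constant signal with a single change-point in the mean
single.cpt.mean <- c(rep(4,3000),rep(0,3000))
# Add Gaussian noise and heavy-tailed (Student-t, 5 degrees of freedom) noise
single.cpt.mean.normal <- single.cpt.mean + rnorm(6000)
single.cpt.mean.student <- single.cpt.mean + rt(6000, df = 5)
# contrast defaults to "mean"; set ht = TRUE for the heavy-tailed case
cpt.single.mean.normal <- ID(single.cpt.mean.normal)
cpt.single.mean.student <- ID(single.cpt.mean.student, ht = TRUE)

# Continuous piecewise-linear signal with a single change-point in the slope
single.cpt.slope <- c(seq(0, 1999, 1), seq(1998, -1, -1))
single.cpt.slope.normal <- single.cpt.slope + rnorm(4000)
single.cpt.slope.student <- single.cpt.slope + rt(4000, df = 5)
# Use contrast = "slope" to look for changes in the slope
cpt.single.slope.normal <- ID(single.cpt.slope.normal, contrast = "slope")
cpt.single.slope.student <- ID(single.cpt.slope.student, contrast = "slope", ht = TRUE)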

IDetect documentation built on May 7, 2026, 5:09 p.m.