getChangepoints: Find the optimal set of changepoints based on a significance...

View source: R/Empirical_P_Values.R

getChangepoints R Documentation

Find the optimal set of changepoints based on a significance level

Description

Evaluates the results of running CROPS on PELT to determine which set of changepoints is best, using an empirical p-value process. It is strongly recommended that the resulting changepoints then be trimmed for linear trends and/or seasonality.
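As a rough illustration of what an empirical p-value is, the sketch below simulates a statistic under a null model many times and reports the share of simulated values at least as extreme as the observed one. This is a generic Monte Carlo sketch, not necessarily the procedure this package implements. The function name empiricalPValue, the toy statistic, the observed value, and the add-one correction are all assumptions made for illustration.

```r
# Generic Monte Carlo p-value; NOT taken from the package source.
empiricalPValue <- function(observed, simulated) {
  # Add-one correction keeps the estimate strictly above zero.
  (1 + sum(simulated >= observed)) / (1 + length(simulated))
}

set.seed(1)
numTrials <- 10000
# Toy null distribution: largest absolute value among 50 standard normals.
simulated <- replicate(numTrials, max(abs(rnorm(50))))
observed <- 4.5  # made-up observed statistic
pValue <- empiricalPValue(observed, simulated)

# With this estimator the smallest attainable p-value is 1 / (numTrials + 1),
# so numTrials limits how small a significance level can be resolved.
smallestP <- 1 / (numTrials + 1)
```

Under these assumptions, a small numTrials cannot distinguish p-values below roughly 1/numTrials, which is one reason a simulation count in the thousands is a sensible floor for alpha=0.01.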

Usage

getChangepoints(series, alpha=0.01, numTrials=10000, serial=T, numCores=NA,  minPenalty=0, maxPenalty=10e12, verbose=T)

Arguments

series

A vector of observations on which to run changepoint analysis. This vector must not contain any missing values.

alpha

The significance level to use. Typical values include 0.01, 0.05, and 0.10.

numTrials

The number of simulations to use in order to obtain an empirical p-value. Recommended to be at least several thousand.

serial

Boolean indicating whether to run this function in serial or in parallel. Running in parallel with several cores will generally offer a substantial decrease in run-time.

numCores

The number of cores to use if serial=F. If not specified, then it will be taken as max(1, detectCores()-1).

minPenalty

The minimum penalty to be used when running CROPS on PELT. This corresponds to the first of two values fed to the pen.value argument in cpt.mean. It is suggested to keep this set to 0.

maxPenalty

The maximum penalty to be used when running CROPS on PELT. This corresponds to the second of two values fed to the pen.value argument in cpt.mean. This can be made arbitrarily large with relatively little impact on runtime. It is important that this value is large enough that no changepoints are identified under such a penalty in PELT. In rare cases, it might be necessary to increase this beyond the default value in order to get valid results.

verbose

If TRUE, prints out messages indicating progress.
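The minPenalty and maxPenalty arguments are described above as the two values passed to pen.value in cpt.mean. A minimal sketch of what such a CROPS-on-PELT call looks like with the changepoint package follows. It assumes the changepoint package is installed. The simulated series, the seed, and the variable names are illustrative, and the exact call made inside getChangepoints may differ.

```r
library(changepoint)

set.seed(42)
# Toy series with a single mean shift after observation 100.
toySeries <- c(rnorm(100, mean = 0), rnorm(100, mean = 3))

# CROPS explores the PELT segmentations for penalties within pen.value.
# The range mirrors the documented defaults for minPenalty and maxPenalty.
cropsFit <- cpt.mean(toySeries, method = "PELT", penalty = "CROPS",
                     pen.value = c(0, 10e12))

# Each row of the matrix is one candidate set of changepoints (NA-padded).
candidateSets <- cpts.full(cropsFit)
# Penalty values associated with the candidate segmentations.
penalties <- pen.value.full(cropsFit)
```

The candidate sets produced this way are the kind of input that a selection step, such as the empirical p-value process, would then choose among.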

Details

This function does not consider the possibility that false positives could arise due to trends or seasonality. Therefore, it is strongly recommended that the results from this function are trimmed for false positives. It is recommended to use the trimChangepoints function.

It is important to note that the series cannot contain any missing values. It is not appropriate to run this changepoint analysis on segments with missing data ("gaps").
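One way to respect this restriction is to check for gaps up front and, if any exist, work with each gap-free stretch separately. The helper below is a hypothetical illustration in base R and is not part of the package. Whether analyzing the stretches separately suits a given data set is a judgment call.

```r
# Hypothetical helper: split a series into its maximal gap-free runs.
splitAtGaps <- function(series) {
  runs <- rle(!is.na(series))
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep <- runs$values  # TRUE for runs of observed (non-missing) values
  Map(function(s, e) series[s:e], starts[keep], ends[keep])
}

# Made-up series with a two-observation gap.
gappySeries <- c(1.2, 0.8, NA, NA, 3.1, 2.9, 3.3)
hasGaps <- anyNA(gappySeries)  # if TRUE, the vector cannot be passed as-is
segments <- splitAtGaps(gappySeries)
```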

We generally recommend running in parallel (i.e. serial=F) on roughly 2-4 cores (e.g. numCores=2 or numCores=4). This often provides a substantial speedup over serial computation.
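The default core count documented for numCores, max(1, detectCores()-1), can be reproduced with the parallel package that ships with R. The sketch below shows that rule and then caps the result at 4, in line with the recommendation above. The NA guard and the variable names are added assumptions; detectCores() can return NA on some platforms.

```r
library(parallel)

# Mirror the documented default: leave one core free, but never go below 1.
available <- detectCores()
defaultCores <- if (is.na(available)) 1L else max(1L, available - 1L)

# Cap at 4 cores, the upper end of the suggested 2-4 range.
numCores <- min(defaultCores, 4L)
```

The resulting value could then be supplied as numCores together with serial=F.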

Value

Returns a vector of changepoints. A changepoint is located immediately prior to a shift in the series.
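To make the indexing convention concrete: a returned value k means that observations up to k belong to one segment and the shift first shows at observation k + 1. The sketch below uses made-up data and a hand-picked changepoint vector rather than real output from getChangepoints.

```r
# Made-up series: the mean shifts between observations 5 and 6.
toySeries <- c(0.1, -0.2, 0.0, 0.3, -0.1, 3.2, 2.9, 3.1, 3.0, 2.8)
changepoints <- c(5)  # hand-picked for illustration, not computed

# Label each observation with its segment; segment 1 ends at index 5.
segmentId <- findInterval(seq_along(toySeries), changepoints + 1) + 1
# Mean of each segment, named by segment label.
segmentMeans <- tapply(toySeries, segmentId, mean)
```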

Author(s)

Matthew Quinn

See Also

trimChangepoints

Examples

#Load the package.
library(changepointSelect)

#Obtain changepoints for the simulated data. 10000 simulation trials may take some time.
simChangepoints <- getChangepoints(series=simSeries, alpha=0.01, numTrials=10000, serial=FALSE, numCores=2)

#The changepoints should then be trimmed (see trimChangepoints).

matthewquinn1/changepointSelect documentation built on July 25, 2022, 7:12 p.m.