asmop: Approximate Subset Multivariate Optimal Partitioning

Description Usage Arguments Details Value See Also Examples

View source: R/asmop.R

Description

A method which implements the Approximate Subset Multivariate Optimal Partitioning (A-SMOP) algorithm. This algorithm is capable of detecting the presence of changepoints in a multivariate time series, and identifies which of the variables are affected for each detected change.

Usage

asmop(data, alpha = 2 * log(nrow(data)), beta = 2 * log(ncol(data)) *
  log(nrow(data)), min.dist = 2, cost.func = "norm.meanvar.seglen",
  window.size, hard.restrict = TRUE, class = TRUE, verbose = FALSE)

Arguments

data

An n x p matrix representing a length n time series containing observations of p variables.

alpha

The variable-specific penalty, used to penalise the addition of a given changepoint into a given variable. A non-negative numeric value.

beta

The multivariate penalty, used to penalise the addition of a changepoint into the model. A non-negative numeric value.

min.dist

The minimum distance allowed between any two changepoints. Required to have an integer value of at least 2.

cost.func

The name of the (global) cost function used by the method, given as a string. See details for possible values.

window.size

The size of the window considered to the left and right of a given changepoint when performing subset restriction. A non-negative integer.

hard.restrict

Logical. If TRUE then hard subset restriction is used. If FALSE then soft subset restriction is used.

class

Logical. If TRUE then an object of class cptmv is returned.

verbose

Logical. If TRUE then information regarding the changepoint vector check-list is printed during the algorithm.

Details

This method implements the Approximate Subset Multivariate Optimal Partitioning (A-SMOP) algorithm of B. Pickering [2016]. This algorithm obtains the changepoint locations within a multivariate time series, and identifies the subsets of variables which are affected by each corresponding change. This is done via the minimisation of a penalised cost function using dynamic programming.

A range of different cost functions and penalty values can be used. Hard restriction provides a faster but more approximate solution, whereas soft restriction is more accurate but requires more computation. Note that soft restriction can become very slow even for moderate p; for practical purposes we recommend using hard restriction.

Values currently supported for the cost function cost.func include:

"norm.mean" Used for detecting changes in mean in Normally-distributed data. Assumes fixed variance parameters (= 1). The mean parameters are set to their maximum likelihood estimates.
"norm.var" Used for detecting changes in variance in Normally-distributed data. Assumes fixed mean parameters (= 0). The variance parameters are set to their maximum likelihood estimates.
"norm.meanvar" Used for detecting changes in both mean and variance in Normally-distributed data. The mean and variance parameters are set to their maximum likelihood estimates.
"norm.mean.seglen", "norm.var.seglen", "norm.meanvar.seglen" Identical to "norm.mean", "norm.var" and "norm.meanvar", respectively, except these contain an additional log(segment length) penalty term in the likelihood for each variable. Designed for use when using the modified BIC penalty (Zhang and Siegmund, 2007) to penalise changes.

Value

If class=TRUE then an object of S4 class cptmv is returned; its cpts slot contains the detected changepoints. Otherwise, if class=FALSE, a list is returned containing the following elements:

data.set

The data set being analysed for changepoints.

cost.func

The name of the function used to calculate the cost.

cpt.type

The type of changes which are being detected, e.g. mean, mean and variance.

alpha

The value of the alpha penalty used.

beta

The value of the beta penalty used.

num.cpt.vecs

The number of changepoint vectors within the search-space considered.

cpt.vecs

A matrix containing the optimal changepoint vectors for the series.

like

The value of the likelihood for the optimal set of changepoint vectors.

cpts

The optimal changepoint locations in the series.

subsets

A logical matrix containing the optimal affected variable subsets for each of the detected changepoints.

runtime

The running time of the algorithm, in seconds.
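Assuming the list form described above (class=FALSE), the returned elements might be inspected as follows. This is an untested sketch that reuses the data, alpha and beta objects constructed in the first example below; the element names are taken from the documentation.

```r
# Sketch: inspecting the list returned when class=FALSE.
res <- asmop(data = data, alpha = alpha, beta = beta,
             cost.func = "norm.mean.seglen", window.size = 2,
             hard.restrict = TRUE, class = FALSE)
res$cpts     # optimal changepoint locations in the series
res$subsets  # logical matrix: affected variables for each changepoint
res$like     # likelihood of the optimal set of changepoint vectors
res$runtime  # running time of the algorithm, in seconds
```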

See Also

smop

Examples

# Smaller example: Normal data, single change in mean at mid-point in 2/3 variables
n <- 20; p <- 3
set.seed(100)
data <- matrix(NA, n, p)
data[, 1] <- c(rnorm(n/2, 0, 1), rnorm(n/2, 10, 1))
data[, 2] <- c(rnorm(n/2, 0, 1), rnorm(n/2, 10, 1))
data[, 3] <- rnorm(n, 0, 1)
alpha <- 2 * log(n)
beta <- 2 * log(p) * log(n)
cost.func <- "norm.mean.seglen"
window.size <- 2
hard.restrict <- TRUE
asmop.results <- asmop(data = data, alpha = alpha, beta = beta,
                       cost.func = cost.func, window.size = window.size,
                       hard.restrict = hard.restrict)

# Larger example: Normal data, multiple changes in variance
data("var.change.ex")
# plot.ts(var.change.ex, nc = 1)
n <- nrow(var.change.ex) # 500
p <- ncol(var.change.ex) # 6
asmop.results.hard <- asmop(data = var.change.ex, alpha = 2 * log(n),
                            beta = 2 * log(p) * log(n),
                            cost.func = "norm.var.seglen",
                            window.size = 10, hard.restrict = TRUE)
# WARNING: Using soft restriction (below) can take a few minutes.
# It provides a better segmentation compared to hard restriction.
# asmop.results.soft <- asmop(data = var.change.ex, alpha = 2 * log(n),
#                             beta = 2 * log(p) * log(n),
#                             cost.func = "norm.var.seglen",
#                             window.size = 10, hard.restrict = FALSE)

benpickering/smop documentation built on Sept. 4, 2020, 1:45 a.m.