asmop: Approximate Subset Multivariate Optimal Partitioning

Description Usage Arguments Details Value See Also Examples

View source: R/asmop.R

Description

A method which implements the Approximate Subset Multivariate Optimal Partitioning (A-SMOP) algorithm. This algorithm is capable of detecting the presence of changepoints in a multivariate time series, and identifies which of the variables are affected for each detected change.

Usage

asmop(data, alpha = 2 * log(nrow(data)), beta = 2 * log(ncol(data)) *
  log(nrow(data)), min.dist = 2, cost.func = "norm.meanvar.seglen",
  window.size, hard.restrict = TRUE, class = TRUE, verbose = FALSE)

Arguments

data

An n x p matrix representing a length n time series containing observations of p variables.

alpha

The variable-specific penalty, used to penalise the addition of a given changepoint into a given variable. A non-negative numeric value.

beta

The multivariate penalty, used to penalise the addition of a changepoint into the model. A non-negative numeric value.

min.dist

The minimum distance allowed between any two changepoints. Required to have an integer value of at least 2.

cost.func

The name of the (global) cost function used by the method, given as a string. See details for possible values.

window.size

The size of the window considered to the left and right of a given changepoint when performing subset restriction. A non-negative integer.

hard.restrict

Logical. If TRUE then hard subset restriction is used. If FALSE then soft subset restriction is used.

class

Logical. If TRUE then an object of class cptmv is returned.

verbose

Logical. If TRUE then information regarding the changepoint vector check-list is printed during the algorithm.

Details

This method implements the Approximate Subset Multivariate Optimal Partitioning (A-SMOP) algorithm of B. Pickering [2016]. This algorithm obtains the changepoint locations within a multivariate time series, and identifies the subsets of variables which are affected by each corresponding change. This is done via the minimisation of a penalised cost function using dynamic programming.

A range of different cost functions and penalty values can be used. Hard restriction provides a faster but more approximate solution, whereas soft restriction is more accurate but requires more computation. Note that soft restriction can become very slow even for moderate p; for practical purposes we recommend using hard restriction.

Values currently supported for the cost function cost.func include:

"norm.mean" Used for detecting changes in mean in Normally-distributed data. Assumes fixed variance parameters (= 1). The mean parameters are set to their maximum likelihood estimates.
"norm.var" Used for detecting changes in variance in Normally-distributed data. Assumes fixed mean parameters (= 0). The variance parameters are set to their maximum likelihood estimates.
"norm.meanvar" Used for detecting changes in both mean and variance in Normally-distributed data. The mean and variance parameters are set to their maximum likelihood estimates.
"norm.mean.seglen", "norm.var.seglen", "norm.meanvar.seglen" Identical to "norm.mean", "norm.var" and "norm.meanvar", respectively, except these contain an additional log(segment length) penalty term in the likelihood for each variable. Designed for use when using the modified BIC penalty (Zhang and Siegmund, 2007) to penalise changes.

Value

If class=TRUE then an object of S4 class cptmv is returned; its cpts slot contains the detected changepoints. Otherwise, if class=FALSE, a list is returned containing the following elements:

data.set

The data set being analysed for changepoints.

cost.func

The name of the function used to calculate the cost.

cpt.type

The type of changes which are being detected, e.g. mean, mean and variance.

alpha

The value of the alpha penalty used.

beta

The value of the beta penalty used.

num.cpt.vecs

The number of changepoint vectors within the search-space considered.

cpt.vecs

A matrix containing the optimal changepoint vectors for the series.

like

The value of the likelihood for the optimal set of changepoint vectors.

cpts

The optimal changepoint locations in the series.

subsets

A logical matrix containing the optimal affected variable subsets for each of the detected changepoints.

runtime

The running time of the algorithm, in seconds.
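Assuming the list form described above (class=FALSE), the returned elements might be inspected as follows. This is an untested sketch that reuses the data, alpha and beta objects constructed in the first example below; the element names are taken from the documentation.

```r
# Sketch: inspecting the list returned when class=FALSE.
res <- asmop(data = data, alpha = alpha, beta = beta,
             cost.func = "norm.mean.seglen", window.size = 2,
             hard.restrict = TRUE, class = FALSE)
res$cpts     # optimal changepoint locations in the series
res$subsets  # logical matrix: affected variables for each changepoint
res$like     # likelihood of the optimal set of changepoint vectors
res$runtime  # running time of the algorithm, in seconds
```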

See Also

smop

Examples

# Smaller example: Normal data, single change in mean at mid-point in 2/3 variables
n <- 20; p <- 3
set.seed(100)
data <- matrix(NA, n, p)
data[, 1] <- c(rnorm(n/2, 0, 1), rnorm(n/2, 10, 1))
data[, 2] <- c(rnorm(n/2, 0, 1), rnorm(n/2, 10, 1))
data[, 3] <- rnorm(n, 0, 1)
alpha <- 2 * log(n)
beta <- 2 * log(p) * log(n)
cost.func <- "norm.mean.seglen"
window.size <- 2
hard.restrict <- TRUE
asmop.results <- asmop(data = data, alpha = alpha, beta = beta,
                       cost.func = cost.func, window.size = window.size,
                       hard.restrict = hard.restrict)

# Larger example: Normal data, multiple changes in variance
data("var.change.ex")
# plot.ts(var.change.ex, nc = 1)
n <- nrow(var.change.ex) # 500
p <- ncol(var.change.ex) # 6
asmop.results.hard <- asmop(data = var.change.ex, alpha = 2 * log(n),
                            beta = 2 * log(p) * log(n),
                            cost.func = "norm.var.seglen",
                            window.size = 10, hard.restrict = TRUE)
# WARNING: Using soft restriction (below) can take a few minutes.
# It provides a better segmentation compared to hard restriction.
# asmop.results.soft <- asmop(data = var.change.ex, alpha = 2 * log(n),
#                             beta = 2 * log(p) * log(n),
#                             cost.func = "norm.var.seglen",
#                             window.size = 10, hard.restrict = FALSE)

benpickering/smop documentation built on Sept. 4, 2020, 1:45 a.m.