cptSubspace: Detecting changes in subspace
In grundy95/changepoint.cov: Implementation of Covariance Changepoint Methods


Implements the \insertCite{Grundy2020;textual}{changepoint.cov} method for detecting changes in subspace in multivariate time series data. This method is aimed at time series that lie in a low-dimensional latent subspace.

cptSubspace(
  X,
  subspaceDim,
  threshold = "PermTest",
  numCpts = "AMOC",
  thresholdValue = 0.05,
  msl = ncol(X),
  nperm = 200,
  Class = TRUE
)

`X`	Data matrix of dimension n by p.
`subspaceDim`	Dimension of the latent subspace.
`threshold`	Threshold choice for determining significance of changepoints. Choices include: "PermTest" - Permutation test is performed using the number of permutations and significance level contained in the nperm and thresholdValue parameters respectively. "Manual" - A user chosen threshold is used, which is contained in the thresholdValue argument. If numCpts is numeric then the threshold is not used as the number of changepoints is known.
`numCpts`	Number of changepoints in the data. Choices include: "AMOC" - At Most One Changepoint; tests whether the data contains a single changepoint or not. "BinSeg" - Binary segmentation is performed to detect multiple changepoints. Numeric - User specified number of changepoints.
`thresholdValue`	Either the significance level of the permutation test when using threshold="PermTest" or the user defined threshold when using threshold="Manual".
`msl`	Minimum segment length allowed between the changepoints. Note that this should be greater than or equal to p, the dimension of the time series (the default is ncol(X)).
`nperm`	Only required for threshold="PermTest". Number of permutations to use in the permutation test.
`Class`	Logical. If TRUE then an S4 class is returned. If FALSE then only the estimated changepoints are returned.
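To make the roles of `nperm` and `thresholdValue` concrete when threshold="PermTest", the following is a minimal base R sketch of how a permutation threshold can be computed in general. It is not the code used by cptSubspace: `toyStat` and `permThreshold` are hypothetical helpers, and the statistic (the largest Frobenius-norm difference between the sample covariance matrices of two segments) is only a stand-in for the subspace test statistic.

```r
# Hypothetical stand-in statistic (an assumption, NOT the cptSubspace statistic).
# For each candidate split tau that respects the minimum segment length msl,
# compare the sample covariance matrices of the two segments.
toyStat <- function(X, msl) {
  n <- nrow(X)
  taus <- msl:(n - msl)
  vals <- vapply(taus, function(tau) {
    covBefore <- cov(X[1:tau, , drop = FALSE])
    covAfter <- cov(X[(tau + 1):n, , drop = FALSE])
    norm(covBefore - covAfter, type = "F")
  }, numeric(1))
  list(stat = max(vals), tau = taus[which.max(vals)])
}

# Permutation threshold: shuffle the time order nperm times, recompute the
# statistic each time, and take the (1 - alpha) empirical quantile.
permThreshold <- function(X, msl, nperm = 200, alpha = 0.05) {
  permStats <- replicate(
    nperm,
    toyStat(X[sample(nrow(X)), , drop = FALSE], msl)$stat
  )
  unname(quantile(permStats, probs = 1 - alpha))
}

set.seed(1)
n <- 200
p <- 3
X <- matrix(rnorm(n * p), n, p)
X[101:200, ] <- 3 * X[101:200, ]  # simulated variance change after time 100

obs <- toyStat(X, msl = 30)
thr <- permThreshold(X, msl = 30, nperm = 100, alpha = 0.05)

# A change is declared when the observed statistic exceeds the threshold.
detected <- obs$stat > thr
```

Because the threshold is an empirical quantile of randomly permuted data, it changes from run to run unless the seed is fixed, and its cost grows with both the series length and `nperm`.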

Subspace changepoint detection is aimed at time series where the data are assumed to lie in a low-dimensional subspace, meaning that there are q dominating eigenvalues in the covariance matrix, where q is the assumed subspace dimension. This function calculates the test statistic, $T$, described in \insertCite{Grundy2020;textual}{changepoint.cov}. We recommend using the permutation test to determine the significance of changepoints. Note that this is a data-driven threshold and will therefore vary between calculations. Calculating the threshold via the permutation test can also be computationally expensive, especially for long time series (n>1000) or a large number of permutations (nperm>1000). The number of permutations should be chosen to reflect the length of the time series: the longer the time series, the more permutations may be necessary. If multiple changepoints are possible then Binary Segmentation is used; however, only one segment is tested at each iteration in order to control the Type I error. This method is only recommended if the data are assumed to lie in a low-dimensional subspace and the dimension of this subspace is known. If this is not the case, we recommend using the cptCov function and one of the methods it contains.
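The assumption of q dominating eigenvalues can be illustrated with a short base R simulation. This is a sketch under assumptions of our own: the data-generating model below (a random orthonormal basis, a standard normal latent signal and isotropic noise with standard deviation 0.1) is chosen for illustration only and is not necessarily the model used by subspaceDataGeneration.

```r
set.seed(1)
n <- 200  # length of the time series
p <- 20   # dimension of the time series
q <- 5    # dimension of the latent subspace

# Orthonormal basis (p x q) for a random q-dimensional subspace of R^p.
basis <- qr.Q(qr(matrix(rnorm(p * q), p, q)))

# Latent q-dimensional signal mapped into R^p, plus small isotropic noise.
latent <- matrix(rnorm(n * q), n, q)
X <- latent %*% t(basis) + matrix(rnorm(n * p, sd = 0.1), n, p)

# Eigenvalues of the sample covariance matrix, largest first.
eigVals <- eigen(cov(X), symmetric = TRUE, only.values = TRUE)$values
round(eigVals[1:(q + 2)], 3)

# Share of the total variance captured by the q leading eigenvalues.
leadingShare <- sum(eigVals[1:q]) / sum(eigVals)
```

In data of this kind there is a clear gap between the q-th and (q+1)-th eigenvalues. When no such gap is visible in real data, the low-dimensional subspace assumption is questionable.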

An object of S4 class cptCovariance-class is returned. If Class=FALSE, the vector of changepoints is returned.

\insertRef{Grundy2020}{changepoint.cov}

cptCov, cptCovariance, subspaceDataGeneration, permutationTest, subspaceTestStat

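library(changepoint.cov)

# Simulate data with a single change (tau = 50) and with three changes.
set.seed(1)
dataAMOC <- subspaceDataGeneration(n = 100, p = 20, subspaceDim = 5, tau = 50,
                                   changeSize = 0.5 * sqrt(5))$data
dataMultipleCpts <- subspaceDataGeneration(n = 200, p = 20, subspaceDim = 5,
                                           tau = c(50, 100, 150),
                                           changeSize = 0.4 * sqrt(5))$data

# At most one changepoint with the default permutation test threshold.
# The seed is fixed because the permutation threshold is random.
set.seed(1)
ansSubspace <- cptSubspace(X = dataAMOC, subspaceDim = 5, nperm = 100)
summary(ansSubspace)
subspaceEst(ansSubspace)

# Binary segmentation with a manually chosen threshold.
ansSubspace2 <- cptSubspace(X = dataMultipleCpts, subspaceDim = 5,
                            threshold = "Manual", numCpts = "BinSeg",
                            thresholdValue = 10, msl = 30)
summary(ansSubspace2)
cptsSig(ansSubspace2)

# Known number of changepoints, so no threshold is used.
ansSubspace3 <- cptSubspace(X = dataMultipleCpts, subspaceDim = 5,
                            numCpts = 3, msl = 30)
summary(ansSubspace3)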