This vignette documents postCP 2.0. It first explains the theoretical basis and then describes how to use the package.
The postCP package is based on the change-point model $$ \boldsymbol y[I_k] \sim \boldsymbol X[I_k] \times \boldsymbol \beta^k $$ where $\boldsymbol y \in \mathbb{R}^{n\times d}$ is the response variable and $\boldsymbol X \in \mathbb{R}^{n\times p}$ is the covariate matrix. The response is generated in $K>1$ segments according to a Generalized Linear Model (e.g., Gaussian or binomial regression), with segment-specific coefficients $\boldsymbol \beta^k$.
$I_k = [cp_{k-1} + 1, cp_k]$ is the index set of the individuals in the $k$th segment, with the conventions $cp_0 = 0$ and $cp_K = n$.
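The index sets can be made concrete with a short base-R sketch. It assumes the longley dataset (16 rows) and the change points 7 and 10 used later in this vignette; the names cp, segments, and betas are illustrative and are not part of postCP. Fitting a separate glm per segment is only a way to visualize the segment-specific coefficients, not a description of what postcp does internally.

```r
# Illustrative only: build the index sets I_k from a vector of change points.
n  <- nrow(longley)        # number of observations (16 for longley)
bp <- c(7, 10)             # change points cp_1, ..., cp_{K-1}
cp <- c(0, bp, n)          # add the conventions cp_0 = 0 and cp_K = n
K  <- length(cp) - 1       # number of segments

# I_k = [cp_{k-1} + 1, cp_k]; cp[k] holds cp_{k-1} because R is 1-indexed
segments <- lapply(seq_len(K), function(k) (cp[k] + 1):cp[k + 1])

# One GLM per segment gives one coefficient vector beta^k per segment
betas <- sapply(segments, function(I_k) {
  coef(glm(Employed ~ GNP, family = gaussian(), data = longley[I_k, ]))
})

print(segments)
print(betas)               # one column per segment
```

Here the three segments are the rows 1 to 7, 8 to 10, and 11 to 16, and betas has one column of coefficients per segment.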
For a given parameter $\boldsymbol \theta$, our objective is to compute the posterior change-point distribution: $$ \mathbb{P}(\textbf{cp} | \boldsymbol y,\boldsymbol X ; \boldsymbol \theta) $$ or, equivalently, the posterior segmentation distribution: $$ \mathbb{P}(\boldsymbol S | \boldsymbol y,\boldsymbol X ; \boldsymbol \theta) $$
The posterior distributions are computed with a constrained HMM (Hidden Markov Model) implemented in C++ for computational efficiency.
Let's look at an example that uses the package on the longley dataset:
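To give an intuition for the constrained HMM, here is a minimal base-R sketch of a forward-backward recursion in which the hidden state (the segment label) can only stay the same or move to the next segment. It is not the package's C++ implementation. It assumes a Gaussian change-in-mean model with known segment means mu and known sigma, and it gives equal weight to staying and moving, which corresponds to a uniform prior over segmentations. All variable names (e, Fw, Bw, post_cp) are hypothetical, and no rescaling is applied, so it is only suitable for short series.

```r
# Sketch only: posterior change-point distribution via a left-to-right HMM.
set.seed(42)
n     <- 60
mu    <- c(1, 4)           # assumed known segment means
sigma <- 1                 # assumed known standard deviation
K     <- length(mu)
y     <- rnorm(n, sd = sigma) + rep(mu, each = n / K)

# Emission densities: e[i, k] is the density of y[i] under segment k
e <- sapply(mu, function(m) dnorm(y, mean = m, sd = sigma))

# Forward pass: the path must start in segment 1
Fw <- matrix(0, n, K)
Fw[1, 1] <- e[1, 1]
for (i in 2:n) {
  stay <- Fw[i - 1, ]              # remain in segment k
  move <- c(0, Fw[i - 1, -K])      # arrive from segment k - 1
  Fw[i, ] <- e[i, ] * (stay + move)
}

# Backward pass: the path must end in segment K
Bw <- matrix(0, n, K)
Bw[n, K] <- 1
for (i in (n - 1):1) {
  stay <- e[i + 1, ] * Bw[i + 1, ] # remain in segment k
  move <- c(stay[-1], 0)           # jump to segment k + 1
  Bw[i, ] <- stay + move
}

# post_cp[i, k] = P(cp_k = i | y): segment k ends exactly at position i
evidence <- Fw[n, K]
post_cp <- vapply(seq_len(K - 1), function(k) {
  Fw[-n, k] * e[-1, k + 1] * Bw[-1, k + 1] / evidence
}, numeric(n - 1))

plot(post_cp[, 1], type = "h", xlab = "position", ylab = "P(cp = position | y)")
```

Each column of post_cp sums to one, because every admissible path crosses each segment boundary exactly once.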
    require(postCP)

    # Prepare data
    sigma = 1.3
    # Change-point estimates
    bp = c(7, 10)
    # Obtain data from the longley dataset
    data = longley

    # Apply the postcp function
    res = postcp(Employed ~ GNP + Armed.Forces, family=gaussian(), data=data, bp=bp, sigma=sigma)
    # Plot the results
    plot.postcp(res, main="Posterior Change Point Probability Distribution")

    # Apply the postcp function with maxFB=TRUE to obtain the marginal distribution
    res = postcp(Employed ~ GNP + Armed.Forces, family=gaussian(), data=data, bp=bp, sigma=sigma, maxFB=TRUE)
    # Plot the results
    plot.postcp(res, main="Posterior Change Point Probability Distribution")
The following example shows a more noticeable change.
    require(postCP)

    data = longley
    plot(data$Armed.Forces)
    res = postcp(Armed.Forces ~ 1, family=gaussian(), data=data, bp=c(10), sigma=1)
    plot.postcp(res, main="Posterior Change Point Probability Distribution")
Some simulated examples are presented below for verification.
    ## change in the mean
    data = data.frame(signal=rnorm(150) + rep(c(1, 4), each=75), position=1:150)
    plot(data$signal)
    res = postcp(signal ~ 1, family=gaussian(), data=data, bp=c(40), sigma=1)
    plot.postcp(res, main="Posterior Change Point Probability Distribution")

    ## change in the slope
    position <- 1:150
    data = data.frame(signal = rep(c(1, 2), c(50, 100))*position + rnorm(150), position=position)
    plot(data$signal)
    res = postcp(signal ~ position, family=gaussian(), data=data, bp=c(30), sigma=1)
    plot.postcp(res, main="Posterior Change Point Probability Distribution")
The package was revived as a Google Summer of Code project in 2016 by Malith Jayaweera. Please see the link below for project details.
postCP Package Improvement