segregmixEM | R Documentation |
Returns ECM algorithm output for mixtures of multiple regressions with changepoints and arbitrarily many components.
segregmixEM(y, x, lambda = NULL, beta = NULL, sigma = NULL, k = 2, seg.Z, psi, psi.locs = NULL, delta = NULL, epsilon = 1e-08, maxit = 10000, verb = FALSE, max.restarts = 15)
y |
An n-vector of response values. |
x |
An nxp matrix of predictors. Note that this model assumes the presence of an intercept. |
lambda |
Initial value of mixing proportions. Entries should sum to
1. This determines number of components. If NULL, then |
beta |
Initial value of |
sigma |
A vector of standard deviations. If NULL, then 1/ |
k |
Number of components. Ignored unless all of |
seg.Z |
A list of length |
psi |
A kxp matrix specifying the number of changepoints for each predictor in each component. See below for more details. |
psi.locs |
A list of length |
delta |
An optional list of values quantifying the amount of separation at each changepoint if assuming discontinuities at the changepoints. This has the same dimensions as |
epsilon |
The convergence criterion. |
maxit |
The maximum number of iterations. |
verb |
If TRUE, then various updates are printed during each iteration of the algorithm. |
max.restarts |
The number of times to try restarting the ECM algorithm if estimation problems occur - such as choice of poor initial values or a poorly chosen changepoint structure. |
seg.Z
is defined as a list of right-hand side linear model formulas that are used to identify which predictors have changepoints in each component. For example, suppose you have a dataframe with three predictors: V1
, V2
, V3
. Suppose now that you wish to model a 3-component mixture of regressions with changepoints structure such that the first component has changepoints in V1 and V2, the second component has changepoints in V3
, and the third component has no changepoints. Then you would define seg.Z = list(~V1+V2, ~V3, NULL)
. Note that you MUST place the variables in order with respect to how they appear in the predictor matrix x
.
psi
is a kxp matrix specifying the number of changepoints for each predictor in each component. For the example given above, suppose there are three changepoints for V1
, two changepoints for V2
, and four changepoints for V3
. Then you would define psi = rbind(c(3, 2, 0), c(0, 0, 4), c(0, 0, 0))
.
psi.locs
is a list of length k whose elements give the initial locations of the changepoints for each component. Each element of the list must have length equal to the total number of changepoints for that component's regression equation. For the example given above, in component 1, assume that the three changepoints for V1
are at 3, 7, and 10 and the two changepoints for V1
are at 5, 20, and 30. In component 2, assume that the four changepoints for V3
are at 2, 4, 6, and 8. Then you would define psi.locs = list(c(3, 7, 10, 5, 20, 30), c(2, 4, 6, 8), NULL)
. Note that the order of the changepoints is determined by first sorting the predictors by how they appear in the formulas in seg.Z
and then sorting in increasing order within each predictor.
segregmixEM
returns a list of class segregmixEM
with items:
x |
The set of predictors. |
y |
The response values. |
lambda |
The final mixing proportions. |
beta |
The final regression coefficients. |
sigma |
The final standard deviations. |
seg.Z |
The list of right-hand side formulas as defined by the user. |
psi.locs |
A list of length k with the final estimates for the changepoint locations. |
delta |
A list of the delta values that were optionally specified by the user. |
loglik |
The final log-likelihood. |
posterior |
An nxk matrix of posterior probabilities for observations. |
all.loglik |
A vector of each iteration's log-likelihood. |
restarts |
The number of times the algorithm restarted due to unacceptable choice of initial values. |
ft |
A character vector giving the name of the function. |
As of version 0.4.6, this more general function has replaced the now defunct regmixEM.chgpt
and associated internal functions.
Young, D. S. (2014) Mixtures of Regressions with Changepoints, Statistics and Computing, 24(2), 265–281.
regmixEM
## Not run: ## Simulated example. set.seed(100) x <- 1:20 y1 <- 3 + x + rnorm(20) y2 <- 3 - x - 5*(x - 15)*(x > 15) + rnorm(20) y <- c(y1, y2) x <- c(x, x) set.seed(100) be <- list(c(3, -1, -5), c(3, 1)) s <- c(1, 1) psi.locs <- list(comp.1 = list(x = 15), comp.2 = NULL) out <- segregmixEM(y, cbind(1,x), verb = TRUE, k = 2, beta = be, sigma = s, lambda = c(1, 1)/2, seg.Z = list(~x, NULL), psi = rbind(1, 0), psi.locs = psi.locs, epsilon = 0.9) z <- seq(0, 21, len = 40) plot(x, y, col = apply(out$post, 1, which.max) + 1, pch = 19, cex.lab = 1.4, cex = 1.4) b <- out$beta d <- out$psi.locs lines(z, b[[1]][1] + b[[1]][2] * z + b[[1]][3] * (z - d[[1]][[1]]) * (z > d[[1]][[1]]) , col = 2, lwd = 2) lines(z, b[[2]][1] + b[[2]][2] * z, col = 3, lwd = 2) abline(v = out$psi.locs[[1]][1], col = 2, lty = 2) ## End(Not run) ## Not run: ## Example using the NOdata. data(NOdata) attach(NOdata) set.seed(100) be <- list(c(1.30, -0.13, 0.08), c(0.56, 0.09)) s <- c(0.02, 0.04) psi.locs <- list(comp.1 = list(NO = 1.57), comp.2 = NULL) out <- segregmixEM(Equivalence, cbind(NO), verb = TRUE, k = 2, beta = be, sigma = s, lambda = c(1, 1)/2, seg.Z = list(~NO, NULL), psi = rbind(1, 0), psi.locs = psi.locs, epsilon = 0.1) z <- seq(0, 5, len = 1000) plot(NOdata, col = apply(out$post, 1, which.max) + 1, pch = 19, cex.lab = 1.4, cex = 1.4, ylab = "Equivalence Ratio") b <- out$beta d <- out$psi.locs lines(z, b[[1]][1] + b[[1]][2] * z + b[[1]][3] * (z - d[[1]][[1]]) * (z > d[[1]][[1]]) , col = 2, lwd = 2) lines(z, b[[2]][1] + b[[2]][2] * z, col = 3, lwd = 2) abline(v = out$psi.locs[[1]][1], col = 2, lty = 2) detach(NOdata) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.