mcp: Fit Multiple Linear Segments And Their Change Points
In mcp: Regression with Multiple Change Points

View source: R/mcp.R

mcp	R Documentation

Fit Multiple Linear Segments And Their Change Points

Description

Given a model (a list of segment formulas), mcp infers the posterior distributions of the parameters of each segment as well as the change points between segments. See more details and worked examples on the mcp website. All segments must regress on the same x-variable. Change points are forced to be ordered using truncation of the priors. You can run fit = mcp(model, sample=FALSE) to avoid sampling and the need for data if you just want to get the priors (fit$prior), the JAGS code fit$jags_code, or the R function to simulate data (fit$simulate).

Usage

mcp(
  model,
  data = NULL,
  prior = list(),
  family = gaussian(),
  par_x = NULL,
  sample = "post",
  cores = 1,
  chains = 3,
  iter = 3000,
  adapt = 1500,
  inits = NULL,
  jags_code = NULL
)

Arguments

`model`	A list of formulas - one for each segment. The first formula has the format `response ~ predictors` while the following formulas have the format `response ~ changepoint ~ predictors`. The response and change points can be omitted (`changepoint ~ predictors` assumes same response. `~ predictors` assumes an intercept-only change point). The following can be modeled: Regular formulas: e.g., `~ 1 + x`). Read more. Extended formulas:, e.g., `~ I(x^2) + exp(x) + sin(x)`. Read more. Variance: e.g., `~sigma(1)` for a simple variance change or `~sigma(rel(1) + I(x^2))`) for more advanced variance structures. Read more Autoregression: e.g., `~ar(1)` for a simple onset/change in AR(1) or `⁠ar(2, 0 + x⁠`) for an AR(2) increasing by `x`. Read more
`data`	Data.frame or tibble in long format.
`prior`	Named list. Names are parameter names (`cp_i`, `int_i`, `xvar_i`, 'sigma“) and the values are either A JAGS distribution (e.g., `int_1 = "dnorm(0, 1) T(0,)"`) indicating a conventional prior distribution. Uninformative priors based on data properties are used where priors are not specified. This ensures good parameter estimations, but it is a questionable for hypothesis testing. `mcp` uses SD (not precision) for dnorm, dt, dlogis, etc. See details. Change points are forced to be ordered through the priors using truncation, except for uniform priors where the lower bound should be greater than the previous change point, `dunif(cp_1, MAXX)`. A numerical value (e.g., `int_1 = -2.1`) indicating a fixed value. A model parameter name (e.g., `int_2 = "int_1"`), indicating that this parameter is shared - typically between segments. If two varying effects are shared this way, they will need to have the same grouping variable. A scaled Dirichlet prior is supported for change points if they are all set to `⁠cp_i = "dirichlet(N)⁠` where `N` is the alpha for this change point and `N = 1` is most often used. This prior is less informative about the location of the change points than the default uniform prior, but it samples less efficiently, so you will often need to set `iter` higher. It is recommended for hypothesis testing and for the estimation of more than 5 change points. Read more.
`family`	One of `gaussian()`, `binomial()`, `bernoulli()`, or `poission()`. Only default link functions are currently supported.
`par_x`	String (default: NULL). Only relevant if no segments contains slope (no hint at what x is). Set this, e.g., par_x = "time".
`sample`	One of `"post"`: Sample the posterior. `"prior"`: Sample only the prior. Plots, summaries, etc. will use the prior. This is useful for prior predictive checks. `"both"`: Sample both prior and posterior. Plots, summaries, etc. will default to using the posterior. The prior only has effect when doing Savage-Dickey density ratios in `hypothesis`. `"none"` or `FALSE`: Do not sample. Returns an mcpfit object without sample. This is useful if you only want to check prior strings (fit$prior), the JAGS model (fit$jags_code), etc.
`cores`	Positive integer or "all". Number of cores. `1`: serial sampling. `options(mc.cores = 3)` will dominate `cores = 1` but not larger values of `cores`. `⁠>1⁠`: parallel sampling on this number of cores. Ideally set `chains` to the same value. Note: `cores > 1` takes a few extra seconds the first time it's called but subsequent calls will start sampling immediately. `"all"`: use all cores but one and sets `chains` to the same value. This is a convenient way to maximally use your computer's power.
`chains`	Positive integer. Number of chains to run.
`iter`	Positive integer. Number of post-warmup draws from each chain. The total number of draws is `iter * chains`.
`adapt`	Positive integer. Also sometimes called "burnin", this is the number of samples used to reach convergence. Set lower for greater speed. Set higher if the chains haven't converged yet or look at tips, tricks, and debugging.
`inits`	A list if initial values for the parameters. This can be useful if a model fails to converge. Read more in `jags.model`. Defaults to `NULL`, i.e., no inits.
`jags_code`	String. Pass JAGS code to `mcp` to use directly. This is useful if you want to tweak the code in `fit$jags_code` and run it within the `mcp` framework.

Details

Notes on priors:

Order restriction is automatically applied to cp_\* parameters using truncation (e.g., T(cp_1, )) so that they are in the correct order on the x-axis UNLESS you do it yourself. The one exception is for dunif distributions where you have to do it as above.
In addition to the model parameters, MINX (minimum x-value), MAXX (maximum x-value), SDX (etc...), MINY, MAXY, and SDY are also available when you set priors. They are used to set uninformative default priors.
Use SD when you specify priors for dt, dlogis, etc. JAGS uses precision but mcp converts to precision under the hood via the sd_to_prec() function. So you will see SDs in fit$prior but precision ($1/SD^2) in fit$jags_code

Value

An mcpfit object.

Author(s)

Jonas Kristoffer Lindeløv jonas@lindeloev.dk

Examples


# Define the segments using formulas. A change point is estimated between each formula.
model = list(
  response ~ 1,  # Plateau in the first segment (int_1)
  ~ 0 + time,    # Joined slope (time_2) at cp_1
  ~ 1 + time     # Disjoined slope (int_3, time_3) at cp_2
)

# Fit it and sample the prior too.
# options(mc.cores = 3)  # Uncomment to speed up sampling
ex = mcp_example("demo")  # Simulated data example
demo_fit = mcp(model, data = ex$data, sample = "both")

# See parameter estimates
summary(demo_fit)

# Visual inspection of the results
plot(demo_fit)  # Visualization of model fit/predictions
plot_pars(demo_fit)  # Parameter distributions
pp_check(demo_fit)  # Prior/Posterior predictive checks

# Test a hypothesis
hypothesis(demo_fit, "cp_1 > 10")

# Make predictions
fitted(demo_fit)
predict(demo_fit)
predict(demo_fit, newdata = data.frame(time = c(55.545, 80, 132)))

# Compare to a one-intercept-only model (no change points) with default prior
model_null = list(response ~ 1)
fit_null = mcp(model_null, data = ex$data, par_x = "time")  # fit another model here
demo_fit$loo = loo(demo_fit)
fit_null$loo = loo(fit_null)
loo::loo_compare(demo_fit$loo, fit_null$loo)

# Inspect the prior. Useful for prior predictive checks.
summary(demo_fit, prior = TRUE)
plot(demo_fit, prior = TRUE)

# Show all priors. Default priors are added where you don't provide any
print(demo_fit$prior)

# Set priors and re-run
prior = list(
  int_1 = 15,
  time_2 = "dt(0, 2, 1) T(0, )",  # t-dist slope. Truncated to positive.
  cp_2 = "dunif(cp_1, 80)",    # change point to segment 2 > cp_1 and < 80.
  int_3 = "int_1"           # Shared intercept between segment 1 and 3
)

fit3 = mcp(model, data = ex$data, prior = prior)

# Show the JAGS model
demo_fit$jags_code

mcp documentation built on May 29, 2024, 6:15 a.m.