This article introduces predict(), fitted(), and residuals() for in-sample and out-of-sample data. I will also show how to get creative with mcp, including how to make predictions around future change points.

Preparation: an example model

We need an mcpfit object to get started, so we use the "demo" dataset:

library(mcp)
options(mc.cores = 3)  # Speed up sampling
set.seed(42)  # Make the script deterministic

ex = mcp_example("demo")
head(ex$data)

... and model it as three segments, i.e., two change points:

# Define the model
model = list(
  response ~ 1,  # plateau (int_1)
  ~ 0 + time,    # joined slope (time_2) at cp_1
  ~ 1 + time     # disjoined slope (int_3, time_3) at cp_2
)

# Fit it
fit = mcp(model, data = ex$data)

This is what the data and the inferred fit look like with a 95% credible interval and an 80% prediction interval:

plot(fit, q_fit = TRUE, q_predict = c(0.1, 0.9))

To review what we see here: behind the scenes, plot() merely calls predict() and fitted() to show these inferences.

Extracting fitted() values

To get the fitted values for each data point, simply do fitted(fit):

head(fitted(fit))

In general, this output includes the fitted value for each row of the data along with its lower and upper quantiles. If you compare these values to the plot, you will see that they correspond; plot() merely calls fitted() behind the scenes.

Expected values for out-of-sample data

To get expected values for out-of-sample data, simply use the newdata argument:

newdata = data.frame(time = c(ex$data$time[1], 25, -20, 200))
fitted(fit, newdata = newdata)

Note that the first time point is in-sample, so its expected value matches the first row of fitted(fit) above, while time = -20 and time = 200 lie outside the observed data, so their values are extrapolations from the model.

Arguments

If you look at the documentation for predict(), fitted(), and residuals(), you'll see that they are quite versatile, taking many different arguments. To mention a few: you can set which_y = "sigma" to get fitted values for sigma (see the article on modeling sigma), prior = TRUE to predict using only the prior, and you can toggle varying and arma to include/exclude varying effects and AR(N) terms, respectively.
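
For illustration, a sketch of a few such calls (note: which_y = "sigma" applies when sigma is modeled, which this demo model does not do, and the two toggles below are no-ops here since the model has no AR(N) or varying terms):

fitted(fit, which_y = "sigma")  # Fitted values for sigma (when sigma is modeled)
predict(fit, arma = FALSE)      # Exclude AR(N) terms from predictions
predict(fit, varying = FALSE)   # Exclude varying effects from predictions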

Predictions

predict() returns the posterior predictive and takes exactly the same arguments as fitted(). This means that you can make predictions for in-sample and out-of-sample data as well. As with fitted(), plot() uses predict() under the hood to plot prediction intervals. You can see that the values correspond to the dashed green lines in the plot (the 80% prediction interval):

set.seed(42)
head(predict(fit, probs = c(0.1, 0.9)))

Note that predict() uses random sampling under the hood, so these values will differ slightly from call to call. You can make it replicable using set.seed() as above. In general, the more posterior samples, the less the call-to-call variance will be. Conversely, fewer samples means more call-to-call variation, e.g., if you do predict(fit, nsamples = 10).
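
A small sketch to verify this reproducibility:

set.seed(42)
pred1 = predict(fit)
set.seed(42)
pred2 = predict(fit)
all.equal(pred1, pred2)  # TRUE: same seed, same draws, same predictions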

Residuals

residuals() is simply data$response - fitted(fit)$fitted. It may be useful for model checking, but the typical needs are covered by posterior predictive checks (pp_check(fit)) and visual inspection of plot(fit, q_fit = TRUE, q_predict = TRUE).
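
For example:

head(residuals(fit))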

Forecasting with future change points

Bayesian inference is the principled updating of prior knowledge using data. Where there is little or no data, the prior speaks louder. Sometimes, we can learn surprising things simply by inspecting the prior predictive, i.e., how the priors combine when "put through" the model. In mcp, most functions come with a prior = FALSE default, but you can simply do plot(fit, prior = TRUE), fitted(fit, prior = TRUE), or predict(fit, prior = TRUE).
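
For example, a sketch of inspecting the prior predictive for this model (note: prior = TRUE needs prior draws, which I assume requires sampling them, e.g., via sample = "both"):

fit_both = mcp(model, data = ex$data, sample = "both")  # Sample both prior and posterior
plot(fit_both, prior = TRUE)          # Prior predictive plot
head(fitted(fit_both, prior = TRUE))  # Expected values under the prior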

Say you want to forecast at time = 125 and you know that a change point back to the baseline level (an intercept change to int_1) will occur approximately after the same interval as between cp_1 and cp_2 (i.e., at cp_2 + (cp_2 - cp_1)). Here is a way to "hack" mcp to do this. (NOTE: I plan on implementing this in a much more user-friendly way in a future release; see the discussion and current status in the GitHub issues.)

Step 1: run the model for observed data

We already did that above, resulting in our fit. But we only did it to get the default priors that are suitable for inferring change points in this region, so you could've just run it without sampling:

fit = mcp(model, data = ex$data, sample = FALSE)

Step 2: add the unobserved segment(s) and fit

Now we extend the model with the future segment of which we have prior knowledge:

model_forecast = c(fit$model, list(
  ~ 1     # intercept (int_4) after cp_3
))

And finally, we extend the list of priors with the two new parameters (int_4 and cp_3). It may be helpful to review the article on priors in mcp.

prior_forecast = c(fit$prior, list(
  int_4 = "int_1",  # Return to this value
  cp_3 = "dnorm(cp_2 + (cp_2 - cp_1), 20) T(MAXX, )"  # In the future at the same interval
))

Now let's fit it:

fit_forecast = mcp(model_forecast, data = ex$data, prior = prior_forecast)

Step 3: predict!

We can go right ahead and compute our 50% and 80% prediction intervals at time = 125:

predict(fit_forecast, newdata = data.frame(time = 125), probs = c(0.1, 0.25, 0.75, 0.9))

To really understand what's going on here, it may be helpful to visualize the model. For now, we will have to hack this a bit too, manually doing our plot:

# Get posterior and posterior predictive "predictions"
newdata = data.frame(time = 1:170)
fitted_forecast = fitted(fit_forecast, newdata = newdata, summary = FALSE, nsamples = 50)
predict_forecast = predict(fit_forecast, newdata = newdata, summary = FALSE)

# Plot it
library(ggplot2)
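# Note: median_hilow() below requires the Hmisc package to be installed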
ggplot(predict_forecast, aes(x = time, y = predict)) + 
  # Prediction intervals and line at x = 125
  stat_summary(fun.data = median_hilow, fun.args = list(conf.int = 0.8), geom = "ribbon", alpha = 0.2) +
  stat_summary(fun.data = median_hilow, fun.args = list(conf.int = 0.5), geom = "ribbon", alpha = 0.3) + 
  geom_vline(xintercept = 125, lty = 2, lwd = 1) + 

  # Lines for fitted draws
  geom_line(aes(y = fitted, group = .draw), data = fitted_forecast, alpha = 0.2) + 

  # Observed data
  geom_point(aes(x = time, y = response), data = ex$data) + 
  labs(title = "Predicting with future change points")

You can read the predicted values from above (at time = 125) off this graph. We literally just predicted for all times between 1 and 170 and visualized them using ribbons. This means that you can also predict further into the future, if you'd like.

You can extend this approach to an arbitrary number of future segments, even using the posterior from the "unobserved" segment 4 in the priors for parameters in later segments. In Bayesian inference, it really does not make much of a difference whether credence in some parameter values has been updated using data or not: it's all credence.
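
As a purely hypothetical sketch (the fifth segment, its priors, and the numbers below are invented for illustration), one more future segment could be added like this:

model_forecast2 = c(model_forecast, list(
  ~ 1  # another future intercept (int_5) after cp_4
))

prior_forecast2 = c(prior_forecast, list(
  int_5 = "int_4",  # Re-use the (prior-informed) segment 4 level
  cp_4 = "dnorm(cp_3 + 25, 20) T(cp_3, )"  # A further change point, after cp_3
))

fit_forecast2 = mcp(model_forecast2, data = ex$data, prior = prior_forecast2)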

Without this formal model of the future change point, one might have thought that the change point should occur around time = 110, since that is the expected value of cp_2 + (cp_2 - cp_1). However, we truncated the prior for the future change point (cp_3) so that it occurs after the last data point (MAXX), i.e., at time > 100. This encodes the knowledge that the third change point had not yet been observed at time = 100, and it pushes the distribution further into the future (actually to around 118; see summary(fit_forecast)).
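
To inspect this directly:

summary(fit_forecast)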


