NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = NOT_CRAN)
smoothbp fits hierarchical piecewise regression models with multiple
change-points, each smoothed by a logistic sigmoid. The general mean function
for $K$ change-points is:
$$ \mu_i = b0_i + b1_i(\tau_i - \omega_{1i}) + \sum_{k=1}^{K} \delta_{ki}(\tau_i - \omega_{ki})\, \text{logistic}!\left[(\tau_i - \omega_{ki})\rho_{ki}\right] $$
where
| Symbol | Role | |--------|------| | $\tau$ | time / covariate | | $\omega_k$ | location of the $k$-th change-point | | $\rho_k$ | transition sharpness for change-point $k$ ($\rho \to \infty$ → hard kink) | | $b0$ | level at the first change-point | | $b1$ | pre-change slope (anchored at $\omega_1$) | | $\delta_k$ | slope change attributable to change-point $k$ |
When $K = 0$ the model reduces to a standard hierarchical linear regression. When $K = 1$ it matches the classic two-phase piecewise model.
Each parameter has its own linear predictor specified via a formula.
b0 may also include a random intercept (1 | group) for clustered /
repeated-measures data.
Posterior inference uses a Metropolis-within-Gibbs sampler written in Rust:
While smoothbp is a general-purpose statistical tool, it is uniquely powerful for modeling structural changes across various scientific domains:
simulate_smoothbp() generates synthetic data from the model. It is
useful for testing and for understanding how the parameters relate to the
observable trajectory shape.
library(smoothbp) dat <- simulate_smoothbp( n_subj = 20, n_obs = 8, b0 = 5.0, # level at change-point b1 = -0.3, # pre-change slope delta = 1.2, # slope change (delta_1) omega = 3.0, # change-point location rho = 4.0, # transition sharpness sigma = 0.4, # residual SD sigma_u = 0.5, # between-subject SD tau_range = c(0, 6), seed = 42 ) head(dat) true_params(dat)
When deltas, omega, and rho are empty lists, smoothbp fits a
standard hierarchical linear regression:
fit0 <- smoothbp( formula = y ~ tau, b0 = ~ 1 + (1 | subject), b1 = ~ 1, deltas = list(), # no change-points omega = list(), rho = list(), data = dat, chains = 2L, iter = 1000L, warmup = 500L, seed = 42L, .verbose = FALSE ) summary(fit0)
fit1 <- smoothbp( formula = y ~ tau, b0 = ~ 1 + (1 | subject), b1 = ~ 1, deltas = list(~ 1), # one slope change omega = list(~ 1), # one change-point location rho = list(~ 1), # one sharpness data = dat, priors = smoothbp_priors( omega = list(prior_normal(3, 2, lb = 0)) ), chains = 4L, iter = 2000L, warmup = 1000L, seed = 42L ) summary(fit1)
For models with multiple structural breaks, smoothbp uses a "List-of-Formulas" API. You pass lists of equal length to deltas, omega, and rho. Each element in the list corresponds to the sub-model for that specific breakpoint.
# Fit a model with 3 candidate breakpoints fit3 <- smoothbp( formula = y ~ tau, b0 = ~ 1 + (1 | subject), b1 = ~ 1, # Each segment gets its own formula (here, just an intercept) deltas = list(~ 1, ~ 1, ~ 1), omega = list(~ 1, ~ 1, ~ 1), rho = list(~ 1, ~ 1, ~ 1), data = dat, priors = smoothbp_priors( # Use the space_omega_priors helper to initialize the search omega = space_omega_priors(K = 3, tau_min = 0, tau_max = 6) ) )
fixed()If you are performing an intervention analysis (e.g., a Regression Discontinuity Design or a Stepped-Wedge trial), you may already know exactly where the change points should be.
By using the fixed() helper, you tell the model not to estimate the location or sharpness, but only the magnitude of the change ($\delta$). This is much faster and allows for direct hypothesis testing of the intervention effect.
# Test for a hard kink at exactly tau = 3.0 fit_fixed <- smoothbp_ss( formula = y ~ tau, omega = list(fixed(3.0)), # Location is known rho = list(fixed(100)), # Sharpness is fixed (hard kink) data = dat ) # The PIP tells us the probability that the intervention had an effect pip(fit_fixed)
Note:
fixed()can also accept a vector of the same length as your data, allowing for observation-specific change points (e.g., different intervention times per cluster).Tip: If you are unsure how many change-points are present, use
smoothbp_ss()with spike-and-slab regularization — seevignette("spike-and-slab").
fit1 <- smoothbp( formula = y ~ tau, b0 = ~ 1 + (1 | subject), data = dat, chains = 4L, iter = 2000L, warmup = 1000L, seed = 42L, cores = 4L # run all 4 chains concurrently )
To make parallel the default for a project, add this to .Rprofile:
options(smoothbp.cores = parallel::detectCores())
print() displays results in distinct, labelled sections; summary() returns a
clean data frame. By default, effects = c("fixed", "ran_pars") is used, matching the conventions of other mixed-modeling packages like brms.
You can filter which parameters are returned or printed by passing a vector to the effects argument. The parameters are classified into three strict categories:
1. "fixed": Population-level structural parameters ($b0, b1, \delta, \omega, \rho, \sigma$)
2. "ran_pars": Group-level variance parameters (e.g., sigma_u)
3. "ran_vals": The actual group-level deviations (e.g., the individual u[j] random intercepts)
print(fit1) # fixed + ran_pars (default) print(fit1, effects = "fixed") # population-level only summary(fit1, effects = "all") # returns everything, including u[j]
Parameter names follow the convention:
| Name | Meaning |
|------|---------|
| b0_(Intercept) | Intercept (level at first change-point) |
| b1_(Intercept) | Pre-change slope |
| delta1_(Intercept) | Slope change at change-point 1 |
| omega1_(Intercept) | Change-point 1 location |
| rho1_(Intercept) | Change-point 1 sharpness |
| sigma | Residual SD |
| sigma_u | Between-group SD (random intercept) |
plot(fit1) # alias for trace_plot(fit1) trace_plot(fit1, type = "both") # trace + density
Parameters with $\hat{R} > 1.05$ or bulk-ESS $< 100$ are flagged with a ⚠ symbol and a red-tinted background. Thresholds are adjustable:
trace_plot(fit1, rhat_thresh = 1.01, ess_thresh = 400)
pp_check(fit1)
smoothbp_priors() accepts either a single prior applied to all
coefficients of a component, or a named list. For multi-breakpoint models,
omega and rho accept a list of priors (one per breakpoint):
smoothbp_priors( b0 = prior_normal(0, 10), b1 = prior_normal(0, 5), omega = list( prior_normal(2, 1), # change-point 1 prior_normal(5, 1) # change-point 2 ), rho = list( prior_normal(4, 2, lb = 0), prior_normal(4, 2, lb = 0) ), sigma = prior_invgamma(shape = 2, scale = 1) )
The lb and ub arguments impose hard bounds. rho should typically have lb = 0 (sharpness must be positive). For omega and deltas, it is often better to avoid hard bounds unless they are physically necessary for your domain.
Performance Tip:
smoothbpis significantly faster whenb1,deltas, andomegaare unconstrained. Without bounds, the sampler can use exact conjugate Gibbs updates forb1anddeltas, and standard HMC foromega. Truncated priors require additional checks and rejection sampling steps.
If you do use bounds for omega, ensure they cover the actual range of your time variable (e.g. if your time starts at -10, do not use lb = 0).
When the number of change-points is unknown, fit a model with $K$ potential
breakpoints using smoothbp_ss(). Each slope change $\delta_k$ has a binary
inclusion indicator $\gamma_k$; the posterior mean of $\gamma_k$ is the
posterior inclusion probability (PIP) for that breakpoint.
fit_ss <- smoothbp_ss( formula = y ~ tau, b0 = ~ 1 + (1 | subject), b1 = ~ 1, deltas = list(~ 1, ~ 1, ~ 1), # 3 candidate breakpoints omega = list(~ 1, ~ 1, ~ 1), rho = list(~ 1, ~ 1, ~ 1), data = dat, spike = prior_spike_slab(pi = 0.1, learn_pi = TRUE), chains = 4L, iter = 2000L, warmup = 1000L, seed = 42L ) # Posterior inclusion probabilities per breakpoint pip(fit_ss) #> gamma_delta1_(Intercept) gamma_delta2_(Intercept) gamma_delta3_(Intercept) #> 0.987 0.063 0.041
A PIP near 1 indicates that breakpoint is supported by the data; a low PIP indicates the slope change is not needed.
See vignette("spike-and-slab") for a full tutorial.
hypothesis() evaluates directional hypotheses against the posterior draws.
# Is the slope change at breakpoint 1 positive? smoothbp::hypothesis(fit1, "delta1_(Intercept) > 0") # Complex linear hypothesis: is the final slope (b1 + delta1) negative? smoothbp::hypothesis(fit1, "b1_(Intercept) + delta1_(Intercept) < 0") # Does the change-point fall before time 4? smoothbp::hypothesis(fit1, "omega1_(Intercept) < 4")
| Stars | ER | $P(H)$ |
|-------|----|--------|
| *** | > 99 | > 0.99 |
| ** | > 19 | > 0.95 |
| * | > 3 | > 0.75 |
# Compare 0-BP (linear) vs 1-BP vs 3-BP models loo0 <- loo(fit0) loo1 <- loo(fit1) loo3 <- loo(fit3) loo::loo_compare(loo0, loo1, loo3)
library(bridgesampling) bs0 <- bridge_sampler(fit0) bs1 <- bridge_sampler(fit1) bayes_factor(bs1, bs0)
# Posterior mean + 95% CI at training observations fitted(fit1) # Full posterior draws matrix (n_draws × n_obs) draws_mat <- fitted(fit1, summary = FALSE)
posterior drawsThe $draws component of a smoothbp fit is a standard draws_array object from the posterior package. You can use any of the posterior tools to manipulate and summarize the draws.
library(posterior) # Convert to a data frame for use with ggplot2 draws_df <- as_draws_df(fit1$draws) # Extract a specific parameter b1_draws <- draws_df$`b1_(Intercept)` # Compute custom summaries summarise_draws(fit1$draws, "mean", "sd", ~quantile(.x, c(0.1, 0.9)))
# Population-level predictions at new time points newdata_marginal <- data.frame(tau = seq(0, 6, by = 0.1)) fitted(fit1, newdata = newdata_marginal)
# Subject-specific predictions fitted(fit1, newdata = data.frame(tau = seq(0, 6, by = 0.1), subject = "1"))
model_methods() and model_results() generate structured, self-contained
text reports from a fitted model. Both functions are designed to be readily
adapted for use in a manuscript or report, or passed to a large language model (LLM): the
output includes the full algebraic specification with every symbol defined and
every parameter value or prior stated explicitly, so no package documentation
is needed to interpret the report.
The return value of each function is the complete report as an invisible character string, so you can capture it for programmatic use:
methods_txt <- model_methods(fit1) results_txt <- model_results(fit1)
model_methods() also emits a brief convergence note: if any Rhat values
exceed 1.05, effective sample sizes fall below 100, or divergent transitions
are present, a warning is printed with suggested diagnostics so that
model-specification issues can be caught before results are reported.
model_methods()model_methods() describes how the analysis was done. It produces:
model_methods(fit1)
model_results()model_results() describes what was found. It produces:
model_results(fit1)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.