replicate_data: Create Replicate Data

View source: R/bage_mod-methods.R

replicate_dataR Documentation

Create Replicate Data

Description

Use a fitted model to create replicate datasets, typically as a way of checking a model.

Usage

replicate_data(x, condition_on = NULL, n = 19)

Arguments

x

A fitted model, typically created by calling mod_pois(), mod_binom(), or mod_norm(), and then fit().

condition_on

Parameters to condition on. Either "expected" or "fitted". See details.

n

Number of replicate datasets to create. Default is 19.

Details

Use n draws from the posterior distribution for model parameters to generate n simulated datasets. If the model is working well, these simulated datasets should look similar to the actual dataset.

Value

A tibble with the following structure:

.replicate data
"Original" Original data supplied to mod_pois(), mod_binom(), mod_norm()
"Replicate 1" Simulated data.
"Replicate 2" Simulated data.
... ...
"Replicate <n>" Simulated data.

The condition_on argument

With Poisson and binomial models that include dispersion terms (which is the default), there are two options for constructing replicate data.

  • When condition_on is "fitted", the replicate data is created by (i) drawing values from the posterior distribution for rates or probabilities (the \gamma_i defined in mod_pois() and mod_binom()), and (ii) conditional on these rates or probabilities, drawing values for the outcome variable.

  • When condition_on is "expected", the replicate data is created by (i) drawing values from hyper-parameters governing the rates or probabilities (the \mu_i and \xi defined in mod_pois() and mod_binom()), then (ii) conditional on these hyper-parameters, drawing values for the rates or probabilities, and finally (iii) conditional on these rates or probabilities, drawing values for the outcome variable.

The default for condition_on is "expected". The "expected" option provides a more severe test for a model than the "fitted" option, since "fitted" values are weighted averages of the "expected" values and the original data.

As described in mod_norm(), normal models have a different structure from Poisson and binomial models, and the distinction between "fitted" and "expected" does not apply.

Data models for outcomes

If a data model has been provided for the outcome variable, then creation of replicate data will include a step where errors are added to outcomes. For instance, the a rr3 data model is used, then replicate_data() rounds the outcomes to base 3.

See Also

  • mod_pois(), mod_binom(), mod_norm() Create model.

  • fit() Fit model.

  • report_sim() Simulation study of model.

Examples

mod <- mod_pois(injuries ~ age:sex + ethnicity + year,
                data = nzl_injuries,
                exposure = 1) |>
  fit()

rep_data <- mod |>
  replicate_data()

library(dplyr)
rep_data |>
  group_by(.replicate) |>
  count(wt = injuries)

## when the overall model includes an rr3 data model,
## replicate data are rounded to base 3
mod_pois(injuries ~ age:sex + ethnicity + year,
         data = nzl_injuries,
         exposure = popn) |>
  set_datamod_outcome_rr3() |>
  fit() |>
  replicate_data()

bage documentation built on April 3, 2025, 8:53 p.m.