set_datamod_undercount: Specify Undercount Data Model

View source: R/bage_mod-functions.R

set_datamod_undercount R Documentation

Specify Undercount Data Model

Description

Specify a data model for the outcome in a Poisson or binomial model, where the outcome is subject to undercount.

Usage

set_datamod_undercount(mod, prob)

Arguments

mod

An object of class "bage_mod", created with mod_pois() or mod_binom().

prob

The prior for the probability that a person or event in the target population will be correctly enumerated. A data frame with a variable called "mean", a variable called "disp", and, optionally, one or more 'by' variables.

Details

The undercount data model assumes that reported values for the outcome variable understate the true values, because the reported values miss some people or events in the target population. In other words, the probability that any given unit in the target population will be included in the reported outcome is less than 1.

Value

A revised version of mod.

The prob argument

The prob argument specifies a prior distribution for the probability that a person or event in the target population is included in the reported outcome. prob is a data frame with a variable called "mean", a variable called "disp", and, optionally, one or more 'by' variables. For instance, a prob of

data.frame(sex = c("Female", "Male"),
           mean = c(0.95, 0.92),
           disp = c(0.02, 0.015))

implies that the expected value for the inclusion probability is 0.95 for females and 0.92 for males, with a slightly larger dispersion for females (0.02) than for males (0.015).
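To see what a row of prob implies, the mean and disp values can be converted into the shape parameters of the Beta prior given under Mathematical details. The following is a minimal base-R sketch and is not part of bage. The helper name prob_to_beta is hypothetical, and the 95% interval is one arbitrary way to summarise the prior.

```r
## Hypothetical helper (not part of bage): convert 'mean' and 'disp'
## into Beta shape parameters, assuming the parameterisation
## Beta(mean / disp, (1 - mean) / disp) from 'Mathematical details'
prob_to_beta <- function(prob) {
  prob$shape1 <- prob$mean / prob$disp
  prob$shape2 <- (1 - prob$mean) / prob$disp
  ## central 95% prior interval for the inclusion probability
  prob$lower <- qbeta(0.025, prob$shape1, prob$shape2)
  prob$upper <- qbeta(0.975, prob$shape1, prob$shape2)
  prob
}

prob <- data.frame(sex = c("Female", "Male"),
                   mean = c(0.95, 0.92),
                   disp = c(0.02, 0.015))
prior <- prob_to_beta(prob)
prior
```

Inspecting the lower and upper columns is a quick way to check whether a chosen disp gives a prior that is as tight, or as loose, as intended.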

Mathematical details

The model for the observed outcome is

y_i^{\text{obs}} \sim \text{Binomial}(y_i^{\text{true}}, \pi_{g[i]})

\pi_g \sim \text{Beta}(m_g^{(\pi)} / d_g^{(\pi)}, (1-m_g^{(\pi)}) / d_g^{(\pi)})

where

  • y_i^{\text{obs}} is the observed outcome for cell i;

  • y_i^{\text{true}} is the true outcome for cell i;

  • \pi_{g[i]} is the probability that a member of the target population in cell i is correctly enumerated in that cell;

  • m_g^{(\pi)} is the expected value for \pi_g (specified via prob); and

  • d_g^{(\pi)} is the dispersion for \pi_g (specified via prob).
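The data model above can be illustrated by simulating from it. This is a sketch in base R only, not bage code. The number of cells, the grouping, and the Poisson rate used to generate the "true" counts are all made-up values chosen for illustration.

```r
## Simulation sketch in base R (not bage code); all values are made up
set.seed(1)

n_cell <- 6
g <- rep(1:2, times = 3)           # group index g[i] for each cell i
m <- c(0.95, 0.92)                 # prior means for pi_g
d <- c(0.02, 0.015)                # prior dispersions for pi_g

## draw one inclusion probability per group from the Beta prior
pi_g <- rbeta(2, shape1 = m / d, shape2 = (1 - m) / d)

## hypothetical true counts (the Poisson rate of 200 is arbitrary)
y_true <- rpois(n_cell, lambda = 200)

## observed counts: each true unit is included with probability pi_g[g[i]]
y_obs <- rbinom(n_cell, size = y_true, prob = pi_g[g])

data.frame(cell = seq_len(n_cell), group = g, y_true, y_obs)
```

Because each observed count is a binomial draw with the true count as its size, the observed count can never exceed the true count, which is the sense in which the model allows only for undercount.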

See Also

  • mod_pois() Specify a Poisson model

  • mod_binom() Specify a binomial model

  • augment() Original data plus estimated values, including estimates of true value for the outcome variable

  • components() Estimated values for model parameters, including inclusion probabilities and overcount rates

  • set_datamod_overcount() An overcount-only data model

  • set_datamod_miscount() An undercount-and-overcount data model

  • datamods All data models implemented in bage

  • confidential Confidentialization procedures modeled in bage

  • Mathematical Details vignette

Examples

## specify 'prob'
prob <- data.frame(sex = c("Female", "Male"),
                   mean = c(0.95, 0.97),
                   disp = c(0.05, 0.05))

## specify model
mod <- mod_pois(divorces ~ age * sex + time,
                data = nzl_divorces,
                exposure = population) |>
  set_datamod_undercount(prob)
mod

## fit model
mod <- mod |>
  fit()
mod

## original data, plus imputed values for outcome
mod |>
  augment()

## parameter estimates
library(dplyr)
mod |>
  components() |>
  filter(term == "datamod")

## the data have in fact been confidentialized,
## so we account for that, in addition
## to accounting for undercoverage
mod <- mod |>
  set_confidential_rr3() |>
  fit()
mod

bage documentation built on Nov. 5, 2025, 5:33 p.m.