set_datamod_overcount: Specify Overcount Data Model


set_datamod_overcount    R Documentation

Specify Overcount Data Model

Description

Specify a data model for the outcome in a Poisson model, where the outcome is subject to overcount.

Usage

set_datamod_overcount(mod, rate)

Arguments

mod

An object of class "bage_mod_pois", created with mod_pois().

rate

The prior for the overcoverage rate. A data frame with a variable called "mean", a variable called "disp", and, optionally, one or more 'by' variables.

Details

The overcount data model assumes that reported values for the outcome overstate the actual values. The reported values might be affected by double-counting, for instance, or might include some people or events that are not in the target population.

Value

A revised version of mod.

The rate argument

The rate argument specifies a prior distribution for the overcoverage rate. rate is a data frame with a variable called "mean", a variable called "disp", and, optionally, one or more 'by' variables. For instance, a rate of

data.frame(sex = c("Female", "Male"),
           mean = c(0.05, 0.03),
           disp = c(0.1, 0.15))

implies that the reported value for the outcome is expected to overstate the true value by about 5% for females and about 3% for males, with greater uncertainty for males than for females.
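To see what such a prior implies, the sketch below converts each row of rate into the Gamma parameters given under Mathematical details (shape 1/disp, rate 1/(disp * mean)) and reports the prior standard deviation and a few prior quantiles. It uses only base R. The helper summarise_rate_prior() is hypothetical and is not part of bage.

```r
## hypothetical helper, not part of bage: summarise the Gamma prior
## implied by each row of 'rate'
summarise_rate_prior <- function(rate, probs = c(0.025, 0.5, 0.975)) {
  shape <- 1 / rate$disp                    # Gamma shape, 1/d
  rate_par <- 1 / (rate$disp * rate$mean)   # Gamma rate, 1/(d m)
  ## one row of quantiles per row of 'rate'
  quantiles <- t(mapply(function(s, r) qgamma(probs, shape = s, rate = r),
                        shape, rate_par))
  colnames(quantiles) <- paste0("q", probs)
  cbind(rate,
        prior_sd = sqrt(rate$disp) * rate$mean, # sd of a Gamma is sqrt(shape)/rate
        quantiles)
}

rate <- data.frame(sex = c("Female", "Male"),
                   mean = c(0.05, 0.03),
                   disp = c(0.1, 0.15))
summarise_rate_prior(rate)
```

Under this parameterization the prior standard deviation divided by the prior mean equals sqrt(disp), so a larger disp means greater relative uncertainty about the overcoverage rate.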

Mathematical details

The model for the observed outcome is

y_i^{\text{obs}} = y_i^{\text{true}} + \epsilon_i

\epsilon_i \sim \text{Poisson}(\kappa_{g[i]} \gamma_i w_i)

\kappa_g \sim \text{Gamma}(1/d_g, 1/(d_g m_g))

where

  • y_i^{\text{obs}} is the observed outcome for cell i;

  • y_i^{\text{true}} is the true outcome for cell i;

  • \epsilon_i is the overcount in cell i;

  • \gamma_i is the rate for cell i;

  • w_i is exposure for cell i;

  • \kappa_{g[i]} is the overcoverage rate for cell i;

  • m_g is the expected value for \kappa_g (specified via rate); and

  • d_g is the dispersion for \kappa_g (specified via rate).
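The equations above can be illustrated with a small base-R simulation. This is a sketch only, not how bage fits the model: the number of cells, the values of \gamma_i and w_i, and the assumption that the true outcome is Poisson with mean \gamma_i w_i (with no extra dispersion) are all made up for illustration.

```r
set.seed(0)

## prior for the overcoverage rate, one row per group g
rate <- data.frame(sex = c("Female", "Male"),
                   mean = c(0.05, 0.03),
                   disp = c(0.1, 0.15))

## made-up cells: group index g[i], rate gamma_i, exposure w_i
n_cell <- 1000
g <- rep(1:2, each = n_cell / 2)
gamma <- rep(0.01, n_cell)
w <- rep(5000, n_cell)

## kappa_g ~ Gamma(1/d_g, 1/(d_g m_g)), in shape-rate form
kappa <- rgamma(n = nrow(rate),
                shape = 1 / rate$disp,
                rate = 1 / (rate$disp * rate$mean))

## assumption: true outcome is Poisson with mean gamma_i w_i
y_true <- rpois(n_cell, lambda = gamma * w)

## overcount epsilon_i and observed outcome y_obs
epsilon <- rpois(n_cell, lambda = kappa[g] * gamma * w)
y_obs <- y_true + epsilon

## ratio of total overcount to total true count, by group,
## which should be close to the simulated kappa
ratio <- tapply(epsilon, g, sum) / tapply(y_true, g, sum)
rbind(kappa = kappa, ratio = ratio)
```

Because \epsilon_i is non-negative, the observed outcome in this data model is never smaller than the true outcome.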

See Also

  • mod_pois() Specify a Poisson model

  • augment() Original data plus estimated values, including estimates of true value for the outcome variable

  • components() Estimated values for model parameters, including inclusion probabilities and overcount rates

  • set_datamod_undercount() An undercount-only data model

  • set_datamod_miscount() An undercount-and-overcount data model

  • datamods All data models implemented in bage

  • confidential Confidentialization procedures modeled in bage

  • Mathematical Details vignette

Examples

## specify 'rate'
rate <- data.frame(sex = c("Female", "Male"),
                   mean = c(0.1, 0.13),
                   disp = c(0.2, 0.2))

## specify model
mod <- mod_pois(divorces ~ age * sex + time,
                data = nzl_divorces,
                exposure = population) |>
  set_datamod_overcount(rate)
mod

## fit model
mod <- mod |>
  fit()
mod

## original data, plus imputed values for outcome
mod |>
  augment()

## parameter estimates
library(dplyr)
mod |>
  components() |>
  filter(term == "datamod")

## the data have in fact been confidentialized,
## so we account for that, in addition
## to accounting for overcoverage
mod <- mod |>
  set_confidential_rr3() |>
  fit()
mod

bage documentation built on Nov. 5, 2025, 5:33 p.m.