View source: R/bage_mod-functions.R
| set_datamod_exposure | R Documentation |
Specify a data model for the exposure variable in a Poisson model. The data model assumes that, within each cell, observed exposure is drawn from an Inverse-Gamma distribution. In this model,
E[ observed exposure | true exposure ] = true exposure
and
sd[ observed exposure | true exposure ] = cv \times true exposure
where cv is a coefficient of variation parameter.
set_datamod_exposure(mod, cv)
mod |
An object of class |
cv |
Coefficient of variation
for measurement errors in exposure.
A single number, or a data frame
with a variable called |
In the exposure data model, cv, the coefficient
of variation, does not depend on
true exposure. This implies that
errors do not fall, in relative terms,
as population rises. Unlike sampling errors,
measurement errors do not get averaged away
in large populations.
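The contrast with sampling error can be made concrete with a small sketch. It assumes, purely for illustration, that sampling error behaves like a Poisson count, so that its relative standard deviation is 1 / sqrt(population); the population sizes and the name rel_err are hypothetical and are not part of bage.

```r
## Illustrative comparison only: Poisson-style sampling error
## (an assumption made for this sketch) versus the constant
## relative error implied by the exposure data model.
popn <- c(100, 10000, 1000000)   # hypothetical population sizes
cv <- 0.025                      # coefficient of variation

rel_err <- data.frame(
  popn = popn,
  ## relative sd of a Poisson count with mean 'popn'
  rel_sampling_error = 1 / sqrt(popn),
  ## relative sd under the exposure data model: always 'cv'
  rel_measurement_error = rep(cv, times = length(popn))
)
rel_err
```

The sampling column shrinks as the population grows, while the measurement column stays at cv.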
The exposure data model assumes that the exposure variable
is unbiased. If there is in fact evidence
of biases, then this evidence should be
used to create a de-biased version of the
variable (e.g. one where estimated biases
have been subtracted) to supply to
mod_pois().
set_datamod_exposure() can only be used
with a Poisson model for rates in which
the dispersion in the rates has been set to zero.
The dispersion in the rates can be set
explicitly to zero using set_disp(),
though set_datamod_exposure() will also
do so.
A revised version of mod.
cv argument
cv can be a single number, in which
case the same value is used for all cells.
cv can also be a data frame
with a variable called "cv" and
one or more columns with 'by' variables.
For instance, a cv of
data.frame(sex = c("Female", "Male"),
cv = c(0.01, 0.012))
implies that the coefficient of variation is 0.01 for females and 0.012 for males.
See below for an example where the coefficient of variation is based on aggregated age groups.
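One way to picture how a cv data frame with 'by' variables applies to individual cells is as a join on those variables. The sketch below uses base R merge() on a hypothetical set of cells. It illustrates the idea only; it is an assumption about the mapping, not a description of how bage implements it internally.

```r
## Hypothetical cells classified by age and sex
cells <- expand.grid(age = c("0-4", "5-9"),
                     sex = c("Female", "Male"),
                     stringsAsFactors = FALSE)

## cv data frame with one 'by' variable (sex)
cv_sex <- data.frame(sex = c("Female", "Male"),
                     cv = c(0.01, 0.012))

## every cell picks up the cv for its value of 'sex',
## whatever its age group
cells_cv <- merge(cells, cv_sex, by = "sex")
cells_cv
```

Each of the four cells receives the cv for its sex, so cells that differ only in age share a value.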
The model for observed exposure is
w_i^{\text{obs}} \sim \text{InvGamma}(2 + d_{g[i]}^{-1}, (1 + d_{g[i]}^{-1}) w_i^{\text{true}})
where
w_i^{\text{obs}} is observed exposure for cell i
(the exposure argument to mod_pois());
w_i^{\text{true}} is true exposure for cell i; and
d_{g[i]} is the value for dispersion
that is applied to cell i.
cv is \sqrt{d_g}.
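The relationship between this parameterization and cv can be checked by simulation. An Inverse-Gamma distribution with shape a and scale b has mean b / (a - 1) and variance b^2 / ((a - 1)^2 (a - 2)), so the shape and scale above give mean w_i^{\text{true}} and standard deviation \sqrt{d_g} w_i^{\text{true}}. The sketch below uses only base R; the true exposure of 1000 and the number of draws are hypothetical values chosen for illustration.

```r
## Base R has no inverse-gamma sampler, so use the fact that
## if X ~ Gamma(shape = a, rate = b) then 1/X ~ InvGamma(a, b).
set.seed(1)
cv <- 0.025                      # coefficient of variation
d <- cv^2                        # dispersion, since cv = sqrt(d)
w_true <- 1000                   # hypothetical true exposure

shape <- 2 + 1 / d
scale <- (1 + 1 / d) * w_true

## simulated observed exposures for a single cell
w_obs <- 1 / rgamma(n = 100000, shape = shape, rate = scale)

mean(w_obs)                      # should be close to w_true
sd(w_obs) / w_true               # should be close to cv
```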
mod_pois() Specify a Poisson model
set_disp() Specify dispersion of rates
augment() Original data plus estimated values,
including estimates of true value for exposure
datamods Data models implemented in bage
confidential Confidentialization
procedures modeled in bage
Mathematical Details vignette
## specify model
mod <- mod_pois(injuries ~ age * sex + year,
                data = nzl_injuries,
                exposure = popn) |>
  set_disp(mean = 0) |>
  set_datamod_exposure(cv = 0.025)

## fit the model
mod <- mod |>
  fit()
mod

## examine results - note the new variable
## '.popn' with estimates of the true
## population
aug <- mod |>
  augment()

## allow different cv's for each sex
cv_sex <- data.frame(sex = c("Female", "Male"),
                     cv = c(0.03, 0.02))
mod <- mod |>
  set_datamod_exposure(cv = cv_sex)
mod

## our outcome variable is confidentialized,
## so we recognize that in the model too
mod <- mod |>
  set_confidential_rr3()
mod

## now a model where everyone aged 0-49
## receives one value for cv, and
## everyone aged 50+ receives another
library(poputils) ## for 'age_upper()'
library(dplyr, warn.conflicts = FALSE)
nzl_injuries_age <- nzl_injuries |>
  mutate(age_group = if_else(age_upper(age) < 50,
                             "0-49",
                             "50+"))
cv_age <- data.frame(age_group = c("0-49", "50+"),
                     cv = c(0.05, 0.01))
mod <- mod_pois(injuries ~ age * sex + year,
                data = nzl_injuries_age,
                exposure = popn) |>
  set_disp(mean = 0) |>
  set_datamod_exposure(cv = cv_age)