dm_nbinom: Create a data model based on a negative binomial distribution


View source: R/datamodel-constructors.R

Description

Create a data model based on a negative binomial distribution.

Usage

dm_nbinom(data, ratio, disp, nm_series, nm_data = NULL)

Arguments

data

A data frame, described in data-arg.

ratio

A single number or a data frame. If a data frame, it is identical to data, except that the "count" variable is replaced by a "ratio" variable giving expected coverage ratios.

disp

A single number or a data frame. If a data frame, it is identical to data, except that the "count" variable is replaced by a "disp" variable giving values for dispersion.

nm_series

The name of the demographic series that data describes.

nm_data

The name of the dataset. If no value is supplied, then nm_data is assumed to equal the name of the object supplied as the data argument.

Details

Create a data model where the reported value has a negative binomial distribution. The negative binomial distribution has a mean-dispersion parameterisation.

ratio and disp can each be a data frame or a single number.

ratio can be zero, but disp cannot. Neither can be negative.

The "ratio" column in data frame ratio gives expected coverage ratios, that is, the number of people or events that the dataset is expected to report for each actual person or event. If ratio$ratio[i] is the coverage ratio, and true$count[i] is the true number of people or events, then the expected value for data$count[i] is ratio$ratio[i] * true$count[i].
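The relationship between coverage ratios and expected reported counts can be sketched in a few lines of base R. The data frames below are hypothetical values made up for illustration, and true is not an argument of dm_nbinom; it only stands in for the unobserved true counts.

```r
## Hypothetical values, for illustration only
true <- data.frame(age = c("0-9", "10-19", "20-29"),
                   count = c(100, 200, 300))
ratio <- data.frame(age = c("0-9", "10-19", "20-29"),
                    ratio = c(1, 0.95, 1.1))

## Expected reported count = coverage ratio * true count
expected <- ratio$ratio * true$count
expected
```

A ratio below 1 corresponds to a dataset that is expected to under-report, and a ratio above 1 to one that is expected to over-report.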

All elements of ratio$ratio must be non-negative, and an element can only be NA if the corresponding value of data$count is also NA.

The disp argument measures the amount of dispersion beyond what would be expected for a Poisson distribution. It equals the reciprocal of the size argument in NegBinomial. As disp approaches 0, the variance approaches that of a Poisson distribution, and higher values of disp induce greater variability. In general, the less reliable the data source, the higher disp should be.
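As a rough illustration of how disp relates to the size argument of the base R NegBinomial functions, the sketch below uses the standard result that a negative binomial with mean mu and size 1/disp has variance mu + disp * mu^2. The values of mu and disp are hypothetical, and the code uses only stats functions, not dm_nbinom itself.

```r
mu <- 100                      # expected reported count (ratio * true count)
disp <- c(0.001, 0.1, 1.2)     # hypothetical dispersion values
size <- 1 / disp               # 'size' as used by stats::dnbinom() and friends

## Variance of a negative binomial with mean 'mu' and size 'size'
variance <- mu + mu^2 / size   # equivalently mu + disp * mu^2
variance

## With disp very close to 0, the density is very close to the Poisson density
x <- 80:120
max_diff <- max(abs(dnbinom(x, size = 1 / 1e-8, mu = mu) - dpois(x, lambda = mu)))
max_diff
```

The variance grows quadratically in mu for a fixed disp, so the same value of disp implies much more extra-Poisson variation for large counts than for small ones.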

disp can be a single number, in which case all values of data have the same dispersion, or it can be a data frame with a column called "disp".

If disp is a single number, it must be non-negative, and cannot be NA. If disp is a data frame, all elements of disp$disp must be non-negative, and an element can only be NA if the corresponding value of data$count is also NA.

If ratio or disp is a data frame, then it does not need to have all the variables that are in data. Values for ratio or disp are assumed to be constant across the missing variables. For instance, if disp does not have a time variable, then values for disp are assumed to be constant across time.
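The effect of a missing variable can be mimicked with a base R join. This is only a sketch of the behaviour described above, using hypothetical data; it is an assumption that the result resembles what dm_nbinom does internally, and the package may implement the mapping differently.

```r
## Hypothetical data, for illustration only
data <- expand.grid(age = c("0-9", "10-19"),
                    time = c(2020, 2021),
                    KEEP.OUT.ATTRS = FALSE,
                    stringsAsFactors = FALSE)
data$count <- c(10, 20, 11, 21)

## 'disp' has an age variable but no time variable
disp <- data.frame(age = c("0-9", "10-19"),
                   disp = c(0.5, 0.8))

## Joining on the shared variable repeats each 'disp' value across time
expanded <- merge(data, disp, by = "age", sort = FALSE)
expanded[order(expanded$time, expanded$age), ]
```

Each age group receives the same dispersion in 2020 and 2021, which is what "constant across time" means here.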

If ratio and disp are data frames, then every row in data must map on to them. However, not every row in ratio and disp needs to map on to a row in data: any rows that do not map on to data are silently dropped.
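The mapping rule can be checked by hand before calling dm_nbinom. The sketch below uses hypothetical data and base R only; it illustrates the rule rather than reproducing the package's own validation code.

```r
## Hypothetical data, for illustration only
data <- data.frame(age = c("0-9", "10-19"),
                   count = c(10, 20))

## 'disp' has an extra row ("20-29") with no counterpart in 'data'
disp <- data.frame(age = c("0-9", "10-19", "20-29"),
                   disp = c(0.5, 0.8, 1.1))

## Every row of 'data' must find a match in 'disp'
all_mapped <- all(data$age %in% disp$age)
all_mapped

## Rows of 'disp' that match nothing in 'data' are dropped by the join
kept <- merge(data, disp, by = "age")
kept
```

If all_mapped were FALSE, some rows of data would have no dispersion value, which is the situation the rule above prohibits.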

Value

An object of class "dm_nbinom".

Examples

## Use a constant ratio across all categories
## but use higher dispersion for males than for females,
## and higher dispersion for ages 20-29 than for
## other age groups.
reg_popn <- account::gl_reg_popn
ratio <- 1
disp <- within(reg_popn, {
  rm(count)
  disp <- ifelse(gender == "Female", 1.1, 1.2)
  disp <- ifelse(age %in% 20:29, disp * 1.3, disp)
})
reg_popn_dm <- dm_nbinom(data = reg_popn,
                         ratio = ratio,
                         disp = disp,
                         nm_series = "population")
reg_popn_dm

ONSdigital/Bayesian-demographic-accounts documentation built on Jan. 10, 2022, 12:34 a.m.