demogsurv

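The goal of demogsurv is to calculate demographic indicators (child mortality, adult mortality, and fertility) from household survey data, with standard errors that account for the survey design.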

For analysis of DHS data, the package interacts well with rdhs. See the vignette for an example.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("mrc-ide/demogsurv")

The package will be released on CRAN in due course.

Example

Load the package and example datasets created from DHS Model Datasets.

library(demogsurv)

data(zzbr) # Births recode (child mortality)
data(zzir) # Individuals recode (fertility, adult mortality)

Child mortality

By default, the function calc_nqx() calculates U5MR for the periods 0-4, 5-9, and 10-14 years before the survey. Before calculating mortality rates, create a binary indicator variable for whether a death occurred and a variable giving the date of death, placed 0.5 months into the month in which the death occurred.

zzbr$death <- zzbr$b5 == "no"      # b5: child is alive ("yes" or "no")
zzbr$dod <- zzbr$b3 + zzbr$b7 + 0.5
u5mr <- calc_nqx(zzbr)
u5mr
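As a small illustration of the date-of-death construction above, the sketch below works through one made-up child record. It assumes the usual DHS century month code (CMC) convention, in which January 1900 is month 1, b3 is the CMC of birth, and b7 is the age at death in completed months. The helper cmc() is hypothetical and is not part of demogsurv.

```r
# Hypothetical helper (not a demogsurv function): century month code,
# assuming the DHS convention that January 1900 is CMC 1
cmc <- function(year, month) (year - 1900) * 12 + month

b3 <- cmc(2010, 3)    # made-up child born in March 2010
b7 <- 7               # died at 7 completed months of age

# Date of death placed half a month into the month of death
dod <- b3 + b7 + 0.5
dod
```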

Note that calc_nqx() does not reproduce child mortality estimates produced in DHS reports. calc_nqx() conducts a standard demographic rate calculation based on observed events and person years within each age group and then converts the cumulative hazard to survival probabilities. The standard DHS indicator uses a rule-based approach to allocate child deaths and person years across age groups and proceeds by calculating direct probabilities of death in each age group (see Rutstein and Rojas 2006). A function calc_dhs_u5mr() will reproduce the DHS calculation, but is not yet fully implemented.
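The rate-based calculation described above can be sketched in a few lines of base R. The death counts and person-years below are made-up values, not output from zzbr, and the piecewise-constant hazard formula is the textbook form of the conversion, assumed here rather than taken from the package source.

```r
# Hypothetical events and person-years by age group (illustrative values only)
agegr  <- c(0, 1, 3, 5, 12, 24, 36, 48, 60) / 12    # age breaks in years
deaths <- c(30, 12, 8, 20, 15, 8, 5, 4)             # deaths in each age group
pys    <- c(80, 160, 158, 540, 900, 880, 870, 860)  # person-years in each age group

mx     <- deaths / pys       # age-specific mortality rates
width  <- diff(agegr)        # width of each age group in years
cumhaz <- sum(width * mx)    # cumulative hazard between ages 0 and 5
nqx    <- 1 - exp(-cumhaz)   # probability of dying before age 5
nqx
```

Under this formulation, changing the age breaks changes only which rates are summed, which is why a single function can serve for neonatal, infant, and under-five indicators.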

Use the argument by= to specify factor variables by which to stratify the rate calculation.

calc_nqx(zzbr, by=~v102) # by urban/rural residence
calc_nqx(zzbr, by=~v190, tips=c(0, 10)) # by wealth quintile, 0-9 years before
calc_nqx(zzbr, by=~v101+v102, tips=c(0, 10)) # by region and residence

The sample covariance or correlation matrix of the estimates can be obtained via vcov().

vcov(u5mr)  # sample covariance
cov2cor(vcov(u5mr))  # sample correlation

Standard error estimation can be done via Taylor linearisation, unstratified jackknife, or stratified jackknife. Results are very similar.

calc_nqx(zzbr, varmethod = "lin") # Taylor linearisation (the default)
calc_nqx(zzbr, varmethod = "jkn") # stratified jackknife

## Compare unstratified standard error estimates for linearisation and jackknife
calc_nqx(zzbr, strata=NULL, varmethod = "lin")  # unstratified design
calc_nqx(zzbr, strata=NULL, varmethod = "jk1")  # unstratified jackknife
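To give a sense of why the two approaches tend to agree, here is a minimal base-R sketch comparing a delete-one-cluster jackknife with Taylor linearisation for a simple rate (deaths divided by person-years). The cluster totals are made up, clusters are treated as an unstratified, equally weighted sample, and the code is a generic illustration rather than the package's implementation.

```r
# Hypothetical cluster-level totals (illustrative values only)
deaths <- c(4, 7, 3, 6, 5, 8, 2, 5)
pys    <- c(90, 120, 75, 110, 100, 130, 60, 95)

rate <- sum(deaths) / sum(pys)   # full-sample rate
n    <- length(deaths)

# Re-estimate the rate leaving out one cluster at a time
rate_jk <- sapply(seq_len(n), function(i) sum(deaths[-i]) / sum(pys[-i]))

# Delete-one jackknife standard error
se_jk <- sqrt((n - 1) / n * sum((rate_jk - rate)^2))

# Taylor linearisation for the same ratio estimator
resid  <- (deaths - rate * pys) / sum(pys)
se_lin <- sqrt(n / (n - 1) * sum(resid^2))

c(rate = rate, se_jk = se_jk, se_lin = se_lin)
```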

To calculate different child mortality indicators (neonatal, infant, etc.), specify different age groups over which to aggregate.

calc_nqx(zzbr, agegr=c(0, 1)/12)  # neonatal
calc_nqx(zzbr, agegr=c(1, 3, 5, 12)/12) # postneonatal
calc_nqx(zzbr, agegr=c(0, 1, 3, 5, 12)/12) # infant (1q0)
calc_nqx(zzbr, agegr=c(12, 24, 36, 48, 60)/12) # child (4q1)
calc_nqx(zzbr, agegr=c(0, 1, 3, 5, 12, 24, 36, 48, 60)/12) # u5mr (5q0)

Calculate annual ~5~q~0~ by calendar year (rather than years preceding survey).

calc_nqx(zzbr, period=2005:2015, tips=NULL)

Adult mortality

The function calc_nqx() can also be used to calculate adult mortality indicators such as ~35~q~15~. First, the convenience function reshape_sib_data() transforms respondent-level data into a dataset with one row for each sibling reported. Then define a binary variable for whether the sibling is alive or dead.

zzsib <- reshape_sib_data(zzir)
zzsib$death <- factor(zzsib$mm2, c("dead", "alive")) == "dead"

Calculate ~35~q~15~ for the seven year period before the survey.

calc_nqx(zzsib, agegr=seq(15, 50, 5), tips=c(0, 7), dob="mm4", dod="mm8")

Calculate ~35~q~15~ by sex, replicating Table MM2.2.

zzsib$sex <- factor(zzsib$mm1, c("female", "male"))  # drop mm1 = 3: "missing"
calc_nqx(zzsib, by=~sex, agegr=seq(15, 50, 5), tips=c(0, 7), dob="mm4", dod="mm8")

This calculation exactly reproduces the ~35~q~15~ estimates produced for Table MM2 in DHS reports. Functionality for producing ASMRs (Table MM1), MMR, and PM (Table MM3) will be added in future.

Fertility

The functions calc_asfr() and calc_tfr() calculate age-specific fertility rates and total fertility rate, respectively. The default calculation is by five-year age groups for three years before the survey, exactly reproducing the estimates produced in DHS reports.

## Replicate DHS Table 5.1.
## Total ASFR and TFR in 3 years preceding survey
calc_asfr(zzir, tips=c(0, 3))
calc_tfr(zzir)

## ASFR and TFR by urban/rural residence
reshape2::dcast(calc_asfr(zzir, ~v025, tips=c(0, 3)), agegr ~ v025, value.var = "asfr")
calc_tfr(zzir, by=~v025)
calc_tfr(zzir, by=~v025, varmethod="jkn")
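The relationship between the two indicators can be shown with a short base-R sketch: the total fertility rate is the sum of the age-specific rates, each multiplied by the width of its age group. The ASFR values below are made up and expressed as births per woman per year. This is the standard demographic identity, not code taken from the package.

```r
# Hypothetical ASFRs (births per woman per year) for ages 15-19, ..., 45-49
agegr <- seq(15, 50, 5)
asfr  <- c(0.090, 0.210, 0.220, 0.180, 0.120, 0.060, 0.020)

# TFR: sum of age-specific rates weighted by the width of each age group
tfr <- sum(diff(agegr) * asfr)
tfr
```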

Replicate fertility estimates stratified by various sociodemographic characteristics.

## Replicate DHS Table 5.2
calc_tfr(zzir, ~v102)  # residence
calc_tfr(zzir, ~v101)  # region
calc_tfr(zzir, ~v106)  # education
calc_tfr(zzir, ~v190)  # wealth quintile
calc_tfr(zzir)  # total

Generate estimates stratified by both calendar period and time preceding survey.

calc_tfr(zzir, period = c(2010, 2013, 2015), tips=0:5)

Calculate ASFR by birth cohort.

asfr_coh <- calc_asfr(zzir, cohort=c(1980, 1985, 1990, 1995), tips=NULL)
reshape2::dcast(asfr_coh, agegr ~ cohort, value.var = "asfr")

To Do



mrc-ide/demogsurv documentation built on March 21, 2022, 9:49 p.m.