seg: estimate death registration coverage using the synthetic...

View source: R/bh.R

seg    R Documentation

estimate death registration coverage using the synthetic extinct generation method

Description

Given two censuses and the average annual number of deaths in each age class between them, we can use stable population assumptions to estimate the degree of underregistration of deaths. The method produces age-specific estimates of coverage. Their age pattern is assumed to be noisy, so we take the arithmetic mean over a range of ages. That range may be specified by the user; if it is not, it is chosen automatically by optimization. Part of the method relies on a prior value for remaining life expectancy in the open age group. By default this is estimated with reference to the Coale-Demeny West model life table, although the user may also supply a value.
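As a rough illustration of the averaging step described above, the sketch below takes a made-up vector of age-specific coverage estimates and averages it over a chosen age range. The names age, Cx, and mean_coverage() and all of the numbers are assumptions for illustration only; this is not the function's internal code.

```r
# Hypothetical age-specific coverage estimates (illustrative values only)
age <- seq(0, 80, by = 5)
Cx  <- c(0.40, 0.55, 0.62, 0.71, 0.69, 0.73, 0.70, 0.72, 0.68,
         0.71, 0.74, 0.70, 0.69, 0.72, 0.75, 0.80, 0.90)

# Arithmetic mean of Cx over the ages in [lower, upper]
mean_coverage <- function(age, Cx, lower = 15, upper = 75) {
  keep <- age >= lower & age <= upper
  mean(Cx[keep])
}

# The estimate depends on which ages are included
coverage_15_75 <- mean_coverage(age, Cx, lower = 15, upper = 75)
coverage_30_60 <- mean_coverage(age, Cx, lower = 30, upper = 60)
```

Because the summary is a plain mean, noisy values at the youngest and oldest ages can shift it noticeably, which is why the choice of age range matters.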

Usage

seg(
  X,
  minA = 15,
  maxA = 75,
  minAges = 8,
  exact.ages = NULL,
  eOpen = NULL,
  nx.method = 2,
  deaths.summed = FALSE,
  mig.summed = deaths.summed,
  delta = FALSE,
  exact.ages.ggb = NULL,
  lm.method = "oldschool",
  opt.method = "r2"
)

Arguments

X

data.frame with columns pop1, pop2, deaths, mig (optional), date1, date2, age, and id (if there is more than one region/sex/intercensal period).

minA

the lowest age to be included in search

maxA

the highest age to be included in the search (given as the lower bound of that age group)

minAges

the minimum number of adjacent ages to be used in estimating

exact.ages

optional. A user-specified vector of exact ages to use for coverage estimation

eOpen

optional. A user-specified value for remaining life-expectancy in the open age group.

nx.method

either 2 or 4. 4 is smoother.

deaths.summed

logical. Is the deaths column given as the total per age in the intercensal period (TRUE)? By default we assume FALSE, i.e. that the average annual number was given.

mig.summed

logical. Is the (optional) net migration column mig given as the total per age in the intercensal period (TRUE)? By default we assume FALSE, i.e. that the average annual number was given.

delta

logical. Do we perform the so-called delta adjustment?

exact.ages.ggb

optional vector of ages used to estimate GGB coverage (if delta is TRUE)

lm.method

character, one of:

  • "oldschool" default sd ratio operation of still unknown origin

  • "lm" or "ols" for a simple linear model

  • "tls", "orthogonal", or "deming" for total least squares

  • "tukey", "resistant", or "median" for Tukey's resistant line method

opt.method

character. What kind of residual do we minimize? Choices are "RMS", "logRMS", "ORSS", and "logORSS" (experimental). The default shown under Usage is "r2".
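To give a feel for how the lm.method choices can differ, here is a minimal base-R sketch that fits the same simulated points four ways. It rests on stated assumptions: that the "oldschool" sd ratio means slope = sd(y) / sd(x), and that total least squares can be represented by the first principal component. The data and variable names are invented, and this is not the function's own fitting code.

```r
# Simulated noisy linear data (illustrative only)
set.seed(1)
x <- seq(0.01, 0.05, length.out = 12)
y <- 0.002 + 1.3 * x + rnorm(12, sd = 0.002)

# Ordinary least squares: minimizes vertical residuals
slope_ols <- unname(coef(lm(y ~ x))[2])

# Ratio of standard deviations (assumed reading of "oldschool")
slope_sd <- sd(y) / sd(x)

# Total least squares via the first principal component:
# minimizes perpendicular distances to the line
pc <- prcomp(cbind(x, y))$rotation[, 1]
slope_tls <- unname(pc["y"] / pc["x"])

# Tukey's resistant line from base R (stats::line)
slope_tukey <- unname(coef(line(x, y))[2])

round(c(ols = slope_ols, sd_ratio = slope_sd,
        tls = slope_tls, tukey = slope_tukey), 3)
```

The sd ratio slope equals the OLS slope divided by the correlation between x and y, so it is never smaller in absolute value than the OLS slope.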

Details

Census dates can be given in two ways: 1) as Date-class columns named $date1 and $date2 (or as unambiguous character strings, such as "1981-05-13"), or 2) as integer columns named "day1", "month1", "year1", "day2", "month2", "year2". If only year1 and year2 are given, we assume January 1 dates. If year and month are given, we assume the first of the month.

If you want coverage estimates for several intercensal periods, regions, or sexes, stack them and use a variable called $id with a unique value for each data chunk. Different values of $id could indicate sexes, regions, intercensal periods, etc.

The $deaths column should refer to the average annual deaths in each age class in the intercensal period, unless deaths.summed = TRUE. Sometimes one uses the arithmetic average of recorded deaths in each age, or simply the average of the deaths around the time of census 1 and census 2.

To identify an age range in the traditional visual way, see plot.ggb(), when working with a single year/sex/region of data. The automatic age-range determination feature of this function tries to implement an intuitive way of picking ages that follows the advice typically given for doing so visually. We minimize the square root of the average squared residual between the fitted line and right term.

Finally, only specify eOpen when working with a single region/sex/period of data; otherwise the same value will be used for every chunk, irrespective of mortality level and sex.
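The input layout described in this section can be sketched as follows. The helper make_chunk() and every number are hypothetical; only the column names come from the argument list above. The sketch stacks two data chunks, tells them apart with id, and converts intercensal death totals into average annual deaths, the form assumed when deaths.summed = FALSE.

```r
# Hypothetical single-population input (all numbers are made up)
make_chunk <- function(id, scale = 1) {
  data.frame(
    id     = id,
    age    = seq(0, 75, by = 5),
    pop1   = round(scale * seq(90000, 15000, length.out = 16)),
    pop2   = round(scale * seq(99000, 18000, length.out = 16)),
    deaths = round(scale * seq(1200, 9000, length.out = 16)),  # intercensal totals
    date1  = as.Date("1981-05-13"),
    date2  = as.Date("1991-05-13")
  )
}

# Stack two chunks; id tells them apart
X <- rbind(make_chunk("males"), make_chunk("females", scale = 1.05))

# Length of the intercensal period in years
years <- as.numeric(difftime(X$date2, X$date1, units = "days")) / 365.25

# Convert intercensal totals to average annual deaths
X$deaths <- X$deaths / years
```

Alternatively, one could presumably leave the totals untouched and pass deaths.summed = TRUE, as described under Arguments.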

If exact.ages is left as NULL (the default), coverage is estimated by minimizing the RMSE of the coverage estimate versus $Cx.
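One way to picture the automatic search is the simplified sketch below. It is a stand-in under stated assumptions, not the function's actual optimizer: it tries every run of adjacent ages between minA and maxA that contains at least minAges ages, and keeps the run whose age-specific values deviate least (in RMS terms) from their own mean. The helper names rms() and pick_ages() and the Cx values are hypothetical.

```r
# Hypothetical age-specific coverage estimates (illustrative values only)
age <- seq(0, 80, by = 5)
Cx  <- c(0.40, 0.55, 0.62, 0.71, 0.69, 0.73, 0.70, 0.72, 0.68,
         0.71, 0.74, 0.70, 0.69, 0.72, 0.75, 0.80, 0.90)

# RMS deviation of a set of values from their own mean
rms <- function(v) sqrt(mean((v - mean(v))^2))

# Exhaustive search over runs of adjacent ages within [minA, maxA]
pick_ages <- function(age, Cx, minA = 15, maxA = 75, minAges = 8) {
  idx  <- which(age >= minA & age <= maxA)
  best <- list(rms = Inf)
  for (i in seq_along(idx)) {
    for (j in seq_along(idx)) {
      if (j - i + 1 < minAges) next   # run too short (or empty)
      run <- idx[i:j]
      r   <- rms(Cx[run])
      if (r < best$rms) {
        best <- list(rms = r,
                     lower = age[run[1]],
                     upper = age[run[length(run)]],
                     coverage = mean(Cx[run]))
      }
    }
  }
  best
}

res <- pick_ages(age, Cx)
```

The returned list mirrors the idea of reporting a coverage value together with the lower and upper ages it is based on.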

Value

A data.frame with columns for the coverage coefficient $coverage, and the minimum $lower and maximum $upper of the age range on which it is based. Rows correspond to data partitions, as indicated by the optional $id variable. $l25 and $u25 give the mean of the lower and upper quartile, respectively, of the distribution of age-specific coverage estimates.

References

Bennett, N. G. & Horiuchi, S. Estimating the completeness of death registration in a closed population. Population Index, 1981; 47(2): 207-221.

Preston, S. H., Coale, A. J., Trussell, J. & Weinstein, M. Estimating the completeness of reporting of adult deaths in populations that are approximately stable. Population Index, 1980; 46(2): 179-202.

Examples

# The Mozambique data
res <- seg(Moz)
res
# The Brasil data
BM <- seg(BrasilMales)
BF <- seg(BrasilFemales)
head(BM)
head(BF)

albinomatheus/toolbox documentation built on June 13, 2024, 5:42 a.m.