Aggregate: Specify aggregate values.

Aggregate R Documentation

Description

Specify values for aggregations of low-level parameters. Aggregate values can be used to provide extra information to a model, beyond the information contained in the main dataset. For instance, aggregate values can be used to implement benchmarks or incorporate expert judgements.

Usage

AgCertain(value, weights = NULL, concordances = list())

AgNormal(value, sd, weights = NULL, concordances = list(), jump = NULL)

AgPoisson(value, concordances = list(), jump = NULL)

AgFun(value, sd, FUN, weights = NULL, concordances = list())

AgLife(value, sd, ax = NULL, concordances = list())

Arguments

value

The aggregate value or values. A single number, or, if there are multiple values, an object of class DemographicArray.

weights

An object of class Counts holding weights to be used when aggregating. Optional.

concordances

A named list of objects of class ManyToOne.

sd

Standard deviation(s) for errors. If value is a single number, then sd must be a single number; otherwise sd must be an object of class DemographicArray.

jump

The standard deviation of the proposal density used in Metropolis-Hastings updates.

FUN

A function taking arguments called x and weights and returning a single number. See below for details.

ax

An object of class Values holding estimated separation factors. Optional.

Details

Let \gamma_i be a rate, count, probability, or mean for cell i. For instance, \gamma_i could be the prevalence of obesity in a particular combination of age, educational status, and region, or it could be an age-sex-specific mortality rate during a given period. The \gamma_i are underlying parameters that are not observed directly.

Let \psi_j be a more aggregate parameter describing the same phenomenon as the \gamma_i. For instance, \psi_j could be the average prevalence of obesity in region j, or life expectancy for sex j. Like the \gamma_i, the \psi_j are not observed directly.

Typically, \psi_j is a weighted sum of the associated \gamma_i, that is,

\psi_j = \sum b_{ij} \gamma_i,

where b_{ij} > 0 if \gamma_i is associated with \psi_j, and 0 otherwise. For instance, if \gamma_i describes obesity prevalence for a subpopulation in region j, then b_{ij} > 0, and if it describes obesity prevalence in another region, then b_{ij} = 0.
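As a minimal sketch in base R (not using demest itself), the weighted sum can be written as a matrix product. The values of gamma and B below are made up for illustration; B has one row per cell i and one column per aggregate j.

```r
## Hypothetical prevalences for four cells (two cells per region)
gamma <- c(0.20, 0.30, 0.10, 0.40)

## b[i, j] > 0 only where cell i belongs to region j;
## here the weights within each region sum to 1
B <- matrix(c(0.5, 0.5, 0.00, 0.00,
              0.0, 0.0, 0.25, 0.75),
            nrow = 4,
            dimnames = list(cell = paste0("cell", 1:4),
                            region = c("A", "B")))

## psi_j = sum over i of b_ij * gamma_i
psi <- drop(crossprod(B, gamma))
psi
```

Each element of psi is the weighted mean of the cells in one region; cells outside the region contribute nothing because their b_{ij} is 0.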

However, more complicated relationships between the \psi_j and \gamma_i are also permitted. In the most general case,

\psi_j = f(B, \gamma),

where B is a matrix of b_{ij}, and f is an arbitrary function. For instance, f could be a (non-linear) function that takes a vector of age-specific mortality rates and returns life expectancy.
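To illustrate what a non-linear f might look like, the sketch below computes life expectancy at birth from age-specific mortality rates in base R. It assumes a constant hazard within each age interval and an open-ended final interval. The function name life_expectancy and its arguments are hypothetical, and this is not necessarily the calculation that demest performs.

```r
## Hypothetical helper: life expectancy at birth from mortality rates,
## assuming a constant hazard within each age interval
life_expectancy <- function(mx, width) {
  ## mx: age-specific mortality rates (all > 0)
  ## width: interval widths; the last is Inf for the open age group
  px <- exp(-mx * width)                 # probability of surviving each interval
  lx <- cumprod(c(1, px[-length(px)]))   # proportion alive at start of each interval
  Lx <- lx * (1 - px) / mx               # person-years lived in each interval
  sum(Lx)
}

## With a constant rate of 0.02 at every age, life expectancy is 1 / 0.02
life_expectancy(mx = rep(0.02, 5), width = c(1, 4, 5, 5, Inf))
```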

Let m_j be an estimate, prediction, or elicited value for aggregate parameter \psi_j. For instance, m_j could be a previously published estimate of obesity prevalence in region j, or it could be an expert's life expectancy forecast. In contrast to the \gamma_i and \psi_j, the m_j are observed. The m_j are 'aggregate values'.

Aggregate values are treated as data, and placed in the likelihood. To do so, a sub-model specifying the relationship between the m_j and \psi_j is required. A sub-model

p(m_j | \psi_j),

is, in effect, a model for the accuracy of the m_j.

Different choices for the relationship between (i) the \gamma_i and \psi_j, and (ii) the \psi_j and m_j are appropriate for different applications. The combinations that are currently available in demest are documented below.

Default values for the b_{ij} vary according to the model being used:

Model                       Default
Poisson with exposure       exposure argument, normalised to sum to 1 for each j.
Poisson without exposure    All weights equal to 1.
Binomial                    exposure argument, normalised to sum to 1 for each j.
Normal                      weights argument (which defaults to 1).
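The normalisation used for the exposure-based defaults can be sketched in base R as follows. The exposure values are invented, and the sketch assumes that the aggregate index j corresponds to the region dimension.

```r
## Hypothetical exposures by age and region
exposure <- matrix(c(100, 300, 50, 150),
                   nrow = 2,
                   dimnames = list(age = c("young", "old"),
                                   region = c("A", "B")))

## Normalise so that the weights sum to 1 within each region j
b <- prop.table(exposure, margin = 2)
colSums(b)
```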

The concordances argument is needed when value has categories that are collapsed versions of the categories of weights, or of the underlying rates, probabilities, or means. For instance, value might be specified at the state level, while the rates are estimated at the county level. The mapping between the original and collapsed categories is known as a Concordance.
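The idea of a many-to-one mapping can be sketched in base R with a named character vector. This is only an illustration of the concept: demest expects objects of class ManyToOne, and the county and state labels, rates, and weights below are hypothetical.

```r
## Hypothetical many-to-one mapping from counties to states
county_to_state <- c(c1 = "S1", c2 = "S1", c3 = "S2")

## Hypothetical county-level rates and weights
rate <- c(c1 = 0.10, c2 = 0.20, c3 = 0.30)
w    <- c(c1 = 1,    c2 = 3,    c3 = 2)

## Collapse counties to states, taking weighted means within each state
state <- county_to_state[names(rate)]
psi <- tapply(rate * w, state, sum) / tapply(w, state, sum)
psi
```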

Value

An object of class SpecAggregate.

AgCertain

The aggregate parameters are weighted sums or means of the disaggregated parameters,

\psi_j = \sum b_{ij} \gamma_i,

and the aggregate values are treated as error-free,

m_j = \psi_j.

Although it is seldom realistic to treat an aggregate parameter as known with certainty, there can be pragmatic reasons for doing so. For instance, statistical agencies sometimes require that disaggregated estimates agree exactly with previously-published aggregate estimates. (Within the literature on small area estimation, this practice is known as 'benchmarking'.) With AgCertain, agreement is guaranteed. For instance, new estimates of obesity by age, educational status, and region can be made to agree with existing estimates of obesity by region.

AgNormal

The aggregate parameters are weighted sums or means of the disaggregated parameters,

\psi_j = \sum b_{ij} \gamma_i.

However, in contrast to AgCertain, the aggregate parameters are assumed to be observed with error. The errors have normal distributions, with mean 0 and standard deviation s_j, so that

m_j ~ N(\psi_j, s_j^2).

One possible application for AgNormal is 'inexact' benchmarking, where the disaggregated parameters are pulled towards the benchmarks, but complete agreement is not required. Another application is where expert judgements are treated as fallible.
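The contribution of the aggregate values to the log-likelihood under this sub-model can be sketched in base R. The values of m, psi, and s are made up, and the sketch does not reproduce how demest carries out the calculation internally.

```r
## Hypothetical aggregate values, aggregate parameters, and error sds
m   <- c(Female = 0.50, Male = 1.10)
psi <- c(Female = 0.45, Male = 1.20)
s   <- c(Female = 0.15, Male = 0.25)

## log p(m | psi) under m_j ~ N(psi_j, s_j^2), summed over j
loglik <- sum(dnorm(m, mean = psi, sd = s, log = TRUE))
loglik
```

Smaller values of s_j penalise disagreement between m_j and \psi_j more heavily, which pulls the disaggregated parameters closer to the benchmarks.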

AgPoisson

AgPoisson is used only with Poisson models that contain an exposure term. The aggregate parameters are rates, obtained using

\psi_j = \sum b_{ij} \gamma_i,

where the b_{ij} are proportional to exposures. Let n_j be the exposure term associated with \psi_j. The expected count implied by \psi_j is then \psi_j n_j. The expected count implied by aggregate value m_j is m_j n_j. The two expected counts are related by

m_j n_j ~ Poisson(\psi_j n_j).
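A minimal base R sketch of this relationship, with invented values for a single aggregate j, is given below. Rounding m_j n_j to a whole number is an assumption of the sketch, made so that the Poisson density is evaluated at an integer; it is not a statement about how demest treats non-integer values.

```r
## Hypothetical values for a single aggregate j
psi <- 0.020   # aggregate rate implied by the model
m   <- 0.025   # aggregate value supplied by the user
n   <- 1000    # exposure

## log p(m n | psi n) under m n ~ Poisson(psi n)
loglik <- dpois(round(m * n), lambda = psi * n, log = TRUE)
loglik
```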

AgFun

The aggregate parameters are obtained from the disaggregated parameters through a user-defined function f. Let \gamma_{[j]} denote the vector of \gamma_i associated with aggregate parameter \psi_j. Similarly let b_{[j]} denote the vector of b_{ij} associated with \psi_j. Then

\psi_j = f(\gamma_{[j]}, b_{[j]}).

AgFun uses the same model as AgNormal for the accuracy of the m_j,

m_j ~ N(\psi_j, s_j^2).

The user-supplied function FUN must take two arguments, called x and weights, and return a numeric vector of length 1. The x argument is for the \gamma_{[j]} and the weights argument is for the b_{[j]}. The values for x supplied to FUN during estimation have class Values and the values for weights have class Counts. Function FUN can take advantage of the metadata attached to x and weights: see below for an example.
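A function with the required interface might look like the sketch below, which computes a weighted geometric mean. The name geo_mean is hypothetical, and the function is exercised here on plain numeric vectors; converting with as.numeric is a simplification, and behaviour with Values and Counts objects has not been checked.

```r
## Hypothetical FUN: weighted geometric mean of the disaggregated parameters.
## Takes arguments called x and weights, and returns a single number.
geo_mean <- function(x, weights) {
  x <- as.numeric(x)
  weights <- as.numeric(weights)
  exp(sum(weights * log(x)) / sum(weights))
}

## Try it out on plain numeric vectors
geo_mean(x = c(0.1, 0.4), weights = c(1, 1))
```

A function of this form would be supplied through the FUN argument, for example FUN = geo_mean in a call to AgFun.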

See Also

Aggregate values are typically specified as part of a call to function Model.

Examples

## Overall value of 0.8 known with certainty
AgCertain(0.8)

## Separate values for females and males known
## with certainty
value <- ValuesOne(c(0.5, 1.1),
                   labels = c("Female", "Male"),
                   name = "sex")
AgCertain(value)

## Non-default weights
weights <- Counts(array(c(0.6, 0.3, 0.2, 0.4, 0.2, 0.3),
                        dim = c(2, 3),
                        dimnames = list(sex = c("Female", "Male"),
                                        region = c("A", "B", "C"))))
AgCertain(value = value, weights = weights)

## Overall value of 0.8, with all errors having
## standard deviation of 0.1
AgNormal(value = 0.8, sd = 0.1)

## Aggregate values and errors that vary by sex
sd <- ValuesOne(c(0.15, 0.25),
                labels = c("Female", "Male"),
                name = "sex")
AgNormal(value = value, sd = sd)

## Non-default standard deviation for proposal density
AgNormal(value = value, sd = sd, jump = 0.02)

## Poisson model
AgPoisson(value)

## TODO - AgFun

StatisticsNZ/demest documentation built on Nov. 2, 2023, 7:56 p.m.