anthro_prevalence: Compute prevalence estimates

View source: R/prevalence.R

anthro_prevalence    R Documentation

Compute prevalence estimates

Description

Computes prevalence estimates according to the WHO recommended standard analysis. The output includes prevalence estimates with corresponding standard errors and confidence intervals for the most common cut-offs describing the full index distribution (-3, -2, -1, +1, +2, +3), together with z-score summary statistics (mean and standard deviation). Results are reported at disaggregated levels for all available factors (age, sex, type of residence, geographical regions, wealth quintiles, mother's education and one additional factor the user is interested in and for which data are available).

Usage

anthro_prevalence(
  sex,
  age = NA_real_,
  is_age_in_month = FALSE,
  weight = NA_real_,
  lenhei = NA_real_,
  measure = NA_character_,
  oedema = "n",
  sw = NULL,
  cluster = NULL,
  strata = NULL,
  typeres = NA_character_,
  gregion = NA_character_,
  wealthq = NA_character_,
  mothered = NA_character_,
  othergr = NA_character_
)

Arguments

sex

A numeric or text variable containing gender information. If it is numeric, its values must be: 1 for males and 2 for females. If it is character, it must be "m" or "M" for males and "f" or "F" for females. No z-scores will be calculated if sex is missing.

age

A numeric variable containing age information, in days by default or in months if the optional argument is_age_in_month is set to TRUE. An exact age in days is expected; if age is given in months, it should not be rounded. Age-related z-scores will NOT be calculated if age is missing (NA).

is_age_in_month

A logical flag; if TRUE, the unit of the age variable is treated as months. The function converts it to days by multiplying age by 30.4375 and rounding to an integer so that reference tables can be used. When unspecified, the default value FALSE is used and the age unit is treated as days.
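
As a rough illustration of the month-to-day conversion, the sketch below multiplies by 30.4375 days per month and rounds to a whole day. The helper name months_to_days is hypothetical and not part of anthro. Rounding exact halves upward is an assumption made here so that 24 months maps to 731 days; the package's own rounding rule may differ.

```r
# Hypothetical helper, not part of the anthro package.
# Converts age in months to whole days using 30.4375 days per month.
# Assumption: exact halves are rounded up (base R's round() would send
# an exact half such as 730.5 to the even neighbour instead).
months_to_days <- function(age_months) {
  floor(age_months * 30.4375 + 0.5)
}

age_days <- months_to_days(c(6, 12, 24))
age_days
```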

weight

A numeric variable containing body weight information, which must be in kilograms. Weight-related z-scores are not calculated if body weight is missing.

lenhei

A numeric variable containing length (recumbent length) or height (standing height) information, which must be in centimeters. Length/height-related z-scores will not be calculated if lenhei is missing. For children younger than 24 months (i.e. below 731 days) whose standing height was measured, the function converts it to recumbent length by adding 0.7 cm; for children aged 24 months or older who were measured in recumbent length, the function converts it to standing height by subtracting 0.7 cm. This way, all z-scores calculated by this function are length-based for children below 24 months and height-based otherwise. The converted length/height is assigned to the variable clenhei in the resulting data.frame.
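
The 0.7 cm adjustment described above can be sketched in base R as follows. The function name adjust_lenhei and its argument names are hypothetical; this is an illustration of the documented rule, not the package's internal code, and it assumes age is already in days.

```r
# Hypothetical helper illustrating the documented 0.7 cm adjustment.
# age_days: age in days; lenhei: measurement in cm;
# measure: "L"/"l" (recumbent length) or "H"/"h" (standing height).
adjust_lenhei <- function(age_days, lenhei, measure) {
  is_height <- measure %in% c("H", "h")
  is_length <- measure %in% c("L", "l")
  # Standing height measured below 24 months (731 days): convert to length.
  to_length <- !is.na(age_days) & age_days < 731 & is_height
  # Recumbent length measured at 24 months or older: convert to height.
  to_height <- !is.na(age_days) & age_days >= 731 & is_length
  clenhei <- lenhei
  clenhei[to_length] <- lenhei[to_length] + 0.7
  clenhei[to_height] <- lenhei[to_height] - 0.7
  clenhei
}

adjusted <- adjust_lenhei(
  age_days = c(500, 500, 900, 900),
  lenhei   = c(80, 80, 95, 95),
  measure  = c("h", "l", "l", "h")
)
adjusted
```

Only the first and third observations change in this example, because they are the ones measured in the "wrong" position for their age.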

measure

A character variable indicating whether recumbent length or standing height was measured for each observation. The values of this variable must be "L" or "l" for recumbent length, and "H" or "h" for standing height. Although it is highly recommended that this variable is provided according to the measurements taken in the survey, it is possible to run the analysis without specifying this variable. If unspecified, the default vector with all missing values is used. The function imputes the missing values according to the following algorithm:

  • If age is not missing, then it is recumbent length if age is below 24 months (731 days), and standing height if age is 24 months or above.

  • If age is missing, then it is recumbent length if measurement < 87 cm and standing height if measurement >= 87 cm.
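
The two imputation rules above can be sketched as a small base R function. The name impute_measure is hypothetical, age is assumed to be in days, and the function returns lower-case codes for imputed values; the package's internal representation may differ.

```r
# Hypothetical helper sketching the documented imputation rules.
impute_measure <- function(measure, age_days, lenhei) {
  # Rule 1: age known, decide by the 24-month (731 day) threshold.
  by_age <- ifelse(age_days < 731, "l", "h")
  # Rule 2: age missing, decide by the 87 cm threshold.
  by_size <- ifelse(lenhei < 87, "l", "h")
  imputed <- ifelse(is.na(age_days), by_size, by_age)
  # Only fill in values where measure itself is missing.
  ifelse(is.na(measure), imputed, measure)
}

imputed_measure <- impute_measure(
  measure  = c(NA, NA, NA, NA, "H"),
  age_days = c(400, 800, NA, NA, 400),
  lenhei   = c(75, 90, 80, 92, 75)
)
imputed_measure
```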

oedema

A character variable indicating oedema. Its values must be "n", "N" or "2" for non-oedema, and "y", "Y" or "1" for oedema. Although it is highly recommended that this variable is provided by the survey, it is possible to run the analysis without specifying it. If unspecified, a default vector of all "n" (non-oedema) is used. Missing values are treated as non-oedema. For children with oedema, weight-related z-scores (zwei, zwfl and zbmi) are NOT calculated (set to missing), BUT they are treated as being < -3 SD in the weight-related indicator prevalence (anthro_prevalence) estimation.
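
To make the oedema rule concrete, the following sketch uses made-up values to show how an oedema case can have a missing z-score and still count towards the < -3 SD prevalence. The variable names has_oedema and below_minus3 are illustrative assumptions, not names used by the package.

```r
# Illustrative data (made up): oedema codes and weight-for-age z-scores.
oedema <- c("n", "y", "N", "1", NA)
zwei   <- c(-1.2, -0.5, -3.4, 0.3, -2.5)

# Missing oedema values count as non-oedema (%in% returns FALSE for NA).
has_oedema <- oedema %in% c("y", "Y", "1")

# Weight-related z-scores are set to missing for oedema cases ...
zwei[has_oedema] <- NA

# ... but those cases are still counted as below -3 SD.
below_minus3 <- has_oedema | (!is.na(zwei) & zwei < -3)
below_minus3
```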

sw

An optional numeric vector containing the sampling weights. If NULL, no sampling weights are used.

cluster

An optional integer vector representing clusters. If NULL, the survey is treated as having no clusters. The same applies if all values are equal.
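
A check equivalent to this rule might look like the sketch below. The function name has_clusters is hypothetical; it only illustrates the documented condition and is not taken from the package.

```r
# Hypothetical helper: TRUE only if cluster is supplied and
# contains more than one distinct value.
has_clusters <- function(cluster) {
  !is.null(cluster) && length(unique(cluster)) > 1
}

has_clusters(NULL)          # no cluster vector supplied
has_clusters(c(1, 1, 1))    # all values equal
has_clusters(c(1, 2, 1))    # at least two distinct clusters
```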

strata

An optional integer vector representing strata. Pass NULL to indicate that there are no strata.

typeres

An optional integer or character vector representing a type of residence. Any values are accepted; however, “Rural” or “Urban” are preferable for output purposes.

gregion

An optional integer or character vector representing a geographical region.

wealthq

An optional integer or character vector representing wealth quintiles, ordered from 1 (poorest) to 5 (richest). All values can either be NA, or 1, 2, 3, 4, 5 or Q1, Q2, Q3, Q4, Q5.

mothered

An optional integer or character vector representing the education of the mother. Any number of categories is accepted for the analysis, provided sample sizes are sufficient in all categories. However, the standard recommended categories are no education, primary school, and secondary school or higher (“None”, “Primary” and “Secondary”). Note: mother's education refers to the highest level of schooling attained by the mother.

othergr

An optional integer or character vector that is of interest for stratified analysis.

Details

In this function, all available (non-missing and non-flagged) z-score values are used for each indicator-specific prevalence estimation (standard analysis).

Note: the function temporarily sets the survey option survey.lonely.psu to "adjust" and then restores the original values. The function is a wrapper around the survey package to compute estimates for the different groups (e.g. by age or sex).
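
The set-and-restore behaviour can be illustrated with base R alone. The wrapper below, with_lonely_psu_adjust, is a hypothetical name, and the use of on.exit is an assumption about one reasonable way to implement the pattern, not a description of the package's internals.

```r
# Hypothetical wrapper showing the set-and-restore pattern.
with_lonely_psu_adjust <- function(expr) {
  # options() returns the previous value(s) of the options it changes.
  old <- options(survey.lonely.psu = "adjust")
  # Restore the previous value when the function exits, even on error.
  on.exit(options(old), add = TRUE)
  # expr is evaluated lazily, so it runs while the option is set.
  force(expr)
}

before <- getOption("survey.lonely.psu")
inside <- with_lonely_psu_adjust(getOption("survey.lonely.psu"))
after  <- getOption("survey.lonely.psu")
```

After the call, the option holds the same value as before, so user settings are not changed permanently.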

If not all parameter values have equal length, parameter values will be repeated to match the maximum length of all arguments except is_age_in_month using rep_len. This happens without warnings.
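
The recycling behaviour of rep_len can be seen in a small base R example. The input values are made up for illustration; the point is that shorter vectors are repeated silently up to the longest length.

```r
sex    <- c(1, 2)                           # length 2
age    <- 1000                              # length 1
weight <- c(14.2, 15.1, 15.8, 16.0, 14.9)   # length 5

# Recycle every argument to the maximum length, without warnings,
# even though 5 is not a multiple of 2.
n <- max(length(sex), length(age), length(weight))
recycled <- data.frame(
  sex    = rep_len(sex, n),
  age    = rep_len(age, n),
  weight = rep_len(weight, n)
)
recycled
```

Because no warning is raised, it is worth checking input lengths before calling the function when the vectors are meant to describe the same observations.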

Value

Returns a data.frame with prevalence estimates for the various groups.

The output data frame includes prevalence estimates with corresponding standard errors and confidence intervals for the most common cut-offs describing the full index distribution (-3, -2, -1, +1, +2, +3), together with z-score summary statistics (mean and standard deviation). Results are reported at disaggregated levels for all available factors (age, sex, type of residence, geographical regions, wealth quintiles, mother's education and one additional factor the user is interested in and for which data are available).

The resulting columns are coded with a prefix, a prevalence indicator and a suffix:

Prefix:

HA

Height-for-age

WA

Weight-for-age

WA_2

Underweight

BMI

Body-mass-index-for-age

WH

Weight-for-height

HA_WH

Height-for-age and weight-for-height combined

Prevalence indicator:

_3

Prevalence corresponding to < -3 SD

_2

Prevalence corresponding to < -2 SD

_1

Prevalence corresponding to < -1 SD

1

Prevalence corresponding to > +1 SD

2

Prevalence corresponding to > +2 SD

3

Prevalence corresponding to > +3 SD

Suffix:

_pop

Weighted sample size

_unwpop

Unweighted sample size

_r

Mean/prevalence

_ll

Lower 95% confidence interval limit

_ul

Upper 95% confidence interval limit

_stdev

Standard Deviation

_se

Standard error

For example:

WHZ_pop

Weight-for-height weighted sample size

HA_r

Height-for-age z-score mean

WA_stdev

Weight-for-age z-score Standard Deviation

WH2_r

Prevalence of weight-for-height >+2 SD (overweight)

WH_r

Mean weight-for-height z-score

BMI_2_se

Prevalence of BMI-for-age <-2 SD standard error

BMI_3_ll

Prevalence of BMI-for-age <-3 SD lower 95% confidence interval limit

HA_2_WH_2_ul

Prevalence of children with height-for-age and weight-for-height combined (stunted & wasted), upper 95% confidence interval limit
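
The naming scheme for prevalence columns can be sketched by pasting the three parts together. The helper prev_col is hypothetical and only covers the prefix + indicator + suffix pattern listed above.

```r
# Hypothetical helper: assemble a prevalence column name from its parts.
prev_col <- function(prefix, indicator, suffix) {
  paste0(prefix, indicator, suffix)
}

stunting   <- prev_col("HA", "_2", "_r")    # height-for-age, < -2 SD, prevalence
overweight <- prev_col("WH", "2", "_r")     # weight-for-height, > +2 SD, prevalence
bmi_lower  <- prev_col("BMI", "_3", "_ll")  # BMI-for-age, < -3 SD, lower CI limit
c(stunting, overweight, bmi_lower)
```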

Examples

## Not run: 
# because it takes too long for CRAN checks
library(anthro)

# compute the prevalence estimates for 100 random children
# with weight around 15kg and height around 100cm
res <- anthro_prevalence(
  sex = c(1, 2),
  age = 1000, # days
  weight = rnorm(100, 15, 1),
  lenhei = rnorm(100, 100, 10)
)

# Height-for-age
# We extract prevalence estimates for <-3SD, <-2SD (Stunting)
# and the z-score mean
col_names <- c("Group", "HAZ_unwpop", "HA_3_r", "HA_2_r", "HA_r")
res <- res[, col_names]

# rename the columns
colnames(res) <- c("Group", "Unweighted N", "-3SD", "-2SD", "z-score mean")

# note that we only generated data for one age group
res

## End(Not run)

anthro documentation built on Sept. 17, 2023, 9:06 a.m.