age_pyramid: Plot a population pyramid (age-sex) from a dataframe.

View source: R/age-pyramid.R

age_pyramid    R Documentation

Description

Plot a population pyramid (age-sex) from a dataframe.

Usage

age_pyramid(
  data,
  age_group = "age_group",
  split_by = "sex",
  stack_by = NULL,
  count = NULL,
  proportional = FALSE,
  na.rm = TRUE,
  show_midpoint = TRUE,
  vertical_lines = FALSE,
  horizontal_lines = TRUE,
  pyramid = TRUE,
  pal = NULL
)

Arguments

data

Your dataframe (e.g. linelist)

age_group

the name of a column in the data frame that defines the age group categories. Defaults to "age_group"

split_by

the name of a column in the data frame that defines the bivariate column. Defaults to "sex". See Note.

stack_by

the name of the column in the data frame to use for shading the bars. Defaults to NULL which will shade the bars by the split_by variable.

count

for pre-computed data, the name of the column in the data frame that holds the values of the bars. If this represents proportions, the values should be within [0, 1].

proportional

If TRUE, bars will represent proportions of cases out of the entire population. Otherwise (FALSE, default), bars represent case counts

na.rm

If TRUE, this removes NA counts from the age groups. Defaults to TRUE.

show_midpoint

When TRUE (default), a dashed vertical line will be added to each of the age bars showing the halfway point for the un-stratified age group. When FALSE, no halfway point is marked.

vertical_lines

If TRUE, dashed vertical lines are added to help visual interpretation of the numbers. Defaults to FALSE (no lines shown).

horizontal_lines

If TRUE (default), horizontal dashed lines will appear behind the bars of the pyramid

pyramid

if TRUE, then binary split_by variables will result in a population pyramid (non-binary variables cannot form a pyramid). If FALSE, a pyramid will not form.

pal

a color palette function or vector of colors to be passed to ggplot2::scale_fill_manual(). Defaults to the first "qual" palette from ggplot2::scale_fill_brewer().
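
The count argument expects one row per bar with the value already computed. A minimal sketch of preparing such a table with base R follows. The linelist data frame and the column names n and prop are hypothetical, chosen only for illustration, and the age_pyramid() calls are left as comments because they assume apyramid is installed.

```r
# Hypothetical linelist; the data and column names are illustrative only.
linelist <- data.frame(
  age_group = factor(c("0-4", "0-4", "5-9", "5-9", "5-9", "10+"),
                     levels = c("0-4", "5-9", "10+")),
  sex = factor(c("Female", "Male", "Female", "Female", "Male", "Male"),
               levels = c("Male", "Female"))
)

# Tabulate every age_group x sex combination (base R only).
# Combinations with no observations are kept with a count of 0.
counts <- as.data.frame(table(linelist[c("age_group", "sex")]),
                        responseName = "n")

# Convert counts to proportions of the whole population, within [0, 1].
counts$prop <- counts$n / sum(counts$n)

# With apyramid installed, the pre-computed columns could be supplied as:
# apyramid::age_pyramid(counts, age_group = age_group, split_by = sex,
#                       count = n)
# apyramid::age_pyramid(counts, age_group = age_group, split_by = sex,
#                       count = prop, proportional = TRUE)
```

Keeping age_group as a factor with explicit levels preserves the intended ordering of the age bands in the tabulated result.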

Note

If the split_by variable is bivariate (e.g. an indicator for a specific symptom), the result is shown as a pyramid. Otherwise, it is presented as a faceted barplot with empty bars in the background indicating the range of the un-faceted data set. Values of split_by appear as labels at the top of each facet.
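
To anticipate which layout a given split_by column is likely to produce, it can help to count its distinct values beforehand. The helper below, is_bivariate(), is hypothetical and not part of apyramid. It assumes that "bivariate" means exactly two distinct non-missing values, so how the package itself decides (for example, by factor levels rather than observed values) may differ.

```r
# Hypothetical helper, not part of apyramid: TRUE when a column has exactly
# two distinct non-missing values.
is_bivariate <- function(x) {
  length(unique(x[!is.na(x)])) == 2L
}

sex    <- c("Female", "Male", "Male", NA, "Female")
gender <- c("Female", "Male", "NB", "Male", "Female")

is_bivariate(sex)     # two distinct values: a pyramid would be expected
is_bivariate(gender)  # three distinct values: a faceted barplot would be expected
```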

Examples


library(ggplot2)
old <- theme_set(theme_classic(base_size = 18))

# with pre-computed data ----------------------------------------------------
# 2018/2008 US census data by age and gender
data(us_2018)
data(us_2008)
age_pyramid(us_2018, age_group = age, split_by = gender, count = count)
age_pyramid(us_2008, age_group = age, split_by = gender, count = count)

# 2018 US census data by age, gender, and insurance status
data(us_ins_2018)
age_pyramid(us_ins_2018, 
  age_group = age,
  split_by = gender,
  stack_by = insured,
  count = count
)
us_ins_2018$prop <- us_ins_2018$percent/100
age_pyramid(us_ins_2018,
  age_group = age,
  split_by = gender,
  stack_by = insured,
  count = prop,
  proportional = TRUE
)

# from linelist data --------------------------------------------------------
set.seed(2018 - 01 - 15) # evaluated as arithmetic, so the seed is 2002
ages <- cut(sample(80, 150, replace = TRUE),
  breaks = c(0, 5, 10, 30, 90), right = FALSE
)
sex <- sample(c("Female", "Male"), 150, replace = TRUE)
gender <- sex
gender[sample(5)] <- "NB"
ill <- sample(c("case", "non-case"), 150, replace = TRUE)
dat <- data.frame(
  AGE = ages,
  sex = factor(sex, c("Male", "Female")),
  gender = factor(gender, c("Male", "NB", "Female")),
  ill = ill,
  stringsAsFactors = FALSE
)

# Create the age pyramid, stratifying by sex
print(ap <- age_pyramid(dat, age_group = AGE))

# Create the age pyramid, stratifying by gender, which can include non-binary
print(apg <- age_pyramid(dat, age_group = AGE, split_by = gender))

# NA categories are removed by default (na.rm = TRUE); set na.rm = FALSE to keep them
dat2 <- dat
dat2[1, 1] <- NA
dat2[2, 2] <- NA
dat2[3, 3] <- NA
print(ap <- age_pyramid(dat2, age_group = AGE, na.rm = FALSE))
print(ap <- age_pyramid(dat2, age_group = AGE, na.rm = TRUE))

# Stratify by case definition and customize with ggplot2
ap <- age_pyramid(dat, age_group = AGE, split_by = ill) +
  theme_bw(base_size = 16) +
  labs(title = "Age groups by case definition")
print(ap)

# Stratify by multiple factors
ap <- age_pyramid(dat,
  age_group = AGE,
  split_by = sex,
  stack_by = ill,
  vertical_lines = TRUE
) +
  labs(title = "Age groups by case definition and sex")
print(ap)

# Display proportions
ap <- age_pyramid(dat,
  age_group = AGE,
  split_by = sex,
  stack_by = ill,
  proportional = TRUE,
  vertical_lines = TRUE
) +
  labs(title = "Age groups by case definition and sex")
print(ap)

# empty group levels will still be displayed
dat3 <- dat2
dat3[dat$AGE == "[0,5)", "sex"] <- NA
age_pyramid(dat3, age_group = AGE)
theme_set(old)

apyramid documentation built on Feb. 16, 2023, 10:53 p.m.