format_cohort_custom: Create consistent, complete customised cohorts


View source: R/format_cohort.R

Description

Given a vector of cohort labels, create a factor that contains levels for all cohorts defined by breaks. format_cohort_custom is the most flexible of the format_cohort functions in that the cohorts can have any combination of widths.

Usage

format_cohort_custom(
  x,
  breaks,
  open_first = NULL,
  month_start = "Jan",
  label_year_start = TRUE,
  label_open_multi = NULL
)

Arguments

x

A vector of cohort labels.

breaks

A vector of strictly increasing integer values.

open_first

Whether the oldest cohort has no lower limit.

month_start

An element of month.name or month.abb. Cohorts start on the first day of this month.

label_year_start

Logical. Whether to label a cohort by the calendar year at the beginning of the cohort or the calendar year at the end. Defaults to TRUE.

label_open_multi

Whether intervals that are open on the left should be interpreted as multi-year or single-year labels.

Details

The elements of x must be single-year labels such as "2018", multi-year labels such as "1950-1960" and "2020-2025", or labels for intervals that are open on the left, such as "<2000" and "<1960".

open_first defaults to TRUE if any of the intervals in x is open, and to FALSE otherwise.

If breaks has length 0, then open_first must be FALSE. If breaks has length 1, then open_first must be TRUE.
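
The relationship between breaks and open_first can be illustrated with a minimal base-R sketch. The helper cohort_levels_sketch is hypothetical and not part of demprep, and the label format it produces ("<1990", "1990-2000") is an assumption based on the labels used elsewhere on this page; the levels returned by format_cohort_custom itself may differ.

```r
# Hypothetical helper, not part of demprep: builds the cohort labels
# implied by 'breaks' and 'open_first', enforcing the length rules above.
cohort_levels_sketch <- function(breaks, open_first) {
  n <- length(breaks)
  if (n == 0L && open_first)
    stop("'breaks' has length 0 but 'open_first' is TRUE")
  if (n == 1L && !open_first)
    stop("'breaks' has length 1 but 'open_first' is FALSE")
  if (n > 1L && any(diff(breaks) <= 0))
    stop("'breaks' must be strictly increasing")
  # One closed interval for each adjacent pair of breaks
  closed <- if (n >= 2L) {
    paste(breaks[-n], breaks[-1L], sep = "-")
  } else {
    character()
  }
  # Optional open interval below the first break
  open <- if (open_first) paste0("<", breaks[1L]) else character()
  c(open, closed)
}

cohort_levels_sketch(breaks = c(1990, 2000, 2020), open_first = FALSE)
cohort_levels_sketch(breaks = c(1990, 2000, 2020), open_first = TRUE)
cohort_levels_sketch(breaks = 2000, open_first = TRUE)
cohort_levels_sketch(breaks = integer(), open_first = FALSE)
```

With a single break there is no closed interval, so the only possible level is the open one; with no breaks there is nothing for an open interval to sit below, so the result is empty.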

If x contains NA, then the levels of the factor created by format_cohort_custom also contain NA.
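
As a rough illustration of what a factor with NA among its levels looks like, the following base-R sketch builds one directly. It is not taken from the demprep source; the level labels are made up for the example, and whether format_cohort_custom constructs its result this way is an assumption.

```r
# Illustrative values only; not output from format_cohort_custom
x <- c("2000-2020", NA, "1990-2000")
levels_with_na <- c("1990-2000", "2000-2020", NA)

# exclude = NULL tells factor() to keep NA as a level
# rather than dropping it
f <- factor(x, levels = levels_with_na, exclude = NULL)

levels(f)
length(f)
```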

There is a combination of settings that makes an open interval such as "<2010" ambiguous. The settings are:

  1. x contains a mix of single-year labels such as "2018" and multi-year labels such as "2020-2025"

  2. month_start is not January.

  3. label_year_start is FALSE.

With these settings, it is unclear whether "<2010" should be treated as a type of single-year label, in which case it refers to the period before "2009-<month_start>-01", or as a type of multi-year label, in which case it refers to the period before "2010-<month_start>-01". Supplying a value for label_open_multi removes the ambiguity. When label_open_multi is TRUE, open intervals are interpreted as a type of multi-year label, and when label_open_multi is FALSE, they are interpreted as a type of single-year label.
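
The two readings of an open label can be made concrete with a small base-R sketch. The helper open_label_cutoff is hypothetical and not part of demprep; it only reproduces the date arithmetic described above, and it does not model the first condition (a mix of single-year and multi-year labels in x).

```r
# Hypothetical helper, not part of demprep: returns the date before which
# an open label such as "<2010" applies, under a given interpretation.
open_label_cutoff <- function(label,
                              month_start = "Jan",
                              label_year_start = TRUE,
                              label_open_multi = TRUE) {
  year <- as.integer(sub("^<", "", label))
  # Accept abbreviated or full month names
  month <- match(month_start, month.abb)
  if (is.na(month)) month <- match(month_start, month.name)
  if (is.na(month)) stop("invalid value for 'month_start'")
  # When a single-year cohort is labelled by the calendar year in which it
  # ends, and cohorts do not start in January, the cohort labelled 'year'
  # begins in the previous calendar year
  shift <- !label_open_multi && !label_year_start && month > 1L
  as.Date(sprintf("%d-%02d-01", year - shift, month))
}

# Single-year reading: the period before 2009-07-01
open_label_cutoff("<2010", month_start = "Jul",
                  label_year_start = FALSE, label_open_multi = FALSE)

# Multi-year reading: the period before 2010-07-01
open_label_cutoff("<2010", month_start = "Jul",
                  label_year_start = FALSE, label_open_multi = TRUE)
```

When month_start is January or label_year_start is TRUE, both readings give the same cutoff in this sketch, which is why the ambiguity only arises under the combination of settings listed above.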

Value

A factor with the same length as x.

See Also

Other functions for reformatting cohort labels are

date_to_cohort_year calculates cohorts from dates.

Examples

format_cohort_custom(x = c(2019, 2011, 2000, 2015),
                     breaks = c(1990, 2000, 2020))

## change interpretation of single-year labels
format_cohort_custom(x = c(2019, 2011, 2000, 2015),
                     breaks = c(1990, 2000, 2020),
                     month_start = "Jul",
                     label_year_start = FALSE)

## multi-year labels
format_cohort_custom(x = c("2000", "2005-2010", "1995-1999"),
                     breaks = c(1990, 2000, 2020))

format_cohort_custom(x = c("2000", "2005-2010", "1995-1999"),
                     breaks = c(1995, 2005, 2010, 2020),
                     open_first = TRUE)

## supply value for 'label_open_multi' to remove
## ambiguity of the "<2021" label
format_cohort_custom(x = c("2025", "2030-2035", "<2021"),
                     breaks = c(2020, 2040),
                     month_start = "Jul",
                     label_year_start = FALSE,
                     label_open_multi = FALSE)

johnrbryant/demprep documentation built on Dec. 31, 2021, 11:58 a.m.