clean_age: Clean up age group labels
In johnrbryant/demprep: Prepare Demographic Data


Parse age group labels and convert them to the format used by the dem packages.


clean_age(x, language = "English")

clean_age_df(x, language = "English")

`x`	A numeric or character vector.
`language`	The language in which text labels are written. Defaults to English.

Intervals that are open on the right such as "80+" are allowed. Intervals that are open on the left such as "<20" are not.
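The distinction between the two kinds of open interval can be illustrated with a minimal sketch. The helpers is_open_right and is_open_left are hypothetical and are not part of demprep; they match only the bare forms "80+" and "<20", whereas clean_age itself also parses text labels such as "100 and over".

```r
# Hypothetical helpers, not part of demprep: classify a label by its shape only.
# Open on the right: digits followed by a plus sign, e.g. "80+".
is_open_right <- function(x) grepl("^[0-9]+\\+$", x)
# Open on the left: a less-than sign followed by digits, e.g. "<20".
is_open_left <- function(x) grepl("^<[0-9]+$", x)

labels <- c("80+", "<20", "10-19")
data.frame(label = labels,
           open_right = is_open_right(labels),
           open_left = is_open_left(labels))
```

Under this sketch, only labels flagged as open_left fall into the category that clean_age does not allow.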

By default, clean_age assumes that any text labels are written in English. However, other languages can be specified using the language argument. Current choices are ADD OVER TIME.

clean_age also checks for two special cases: (i) when the labels consist entirely of the numbers 0, 5, 10, ..., and (ii) when the labels consist entirely of the numbers 0, 1, 5, 10, .... In case (i) the labels are converted to the age groups "0-4", "5-9", "10-14", .... In case (ii) the labels are converted to the life table age groups "0", "1-4", "5-9", .... In both cases, the maximum age must be at least 50.
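The detection step for the two special cases might look roughly like the sketch below. The function classify_age_starts is hypothetical and is not the demprep implementation. It assumes that every starting age up to the maximum must be present, which the description above does not state explicitly, and it only reports which case applies rather than building the final labels.

```r
# Hypothetical helper, not part of demprep: decide which special case,
# if any, a vector of numeric labels falls into.
classify_age_starts <- function(x, min_max_age = 50) {
  if (!is.numeric(x)) return("none")
  ages <- sort(unique(x[!is.na(x)]))
  if (length(ages) == 0L) return("none")
  max_age <- max(ages)
  # Both special cases require a maximum age of at least 50.
  if (max_age < min_max_age) return("none")
  # Assumption: no starting ages may be missing from the sequence.
  five_year <- seq(from = 0, to = max_age, by = 5)
  life_table <- c(0, 1, seq(from = 5, to = max_age, by = 5))
  if (identical(as.numeric(ages), as.numeric(five_year))) {
    "five_year"    # case (i): 0, 5, 10, ...
  } else if (identical(as.numeric(ages), as.numeric(life_table))) {
    "life_table"   # case (ii): 0, 1, 5, 10, ...
  } else {
    "none"
  }
}

classify_age_starts(sample(seq(0, 80, 5)))
classify_age_starts(sample(c(0, 1, seq(5, 80, 5))))
classify_age_starts(seq(0, 45, 5))
```

The third call illustrates the maximum-age rule: the values look like five-year starting ages, but the maximum is below 50, so neither special case applies.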

Function clean_age_df returns a data frame showing how each unique element in x is interpreted by function clean_age and whether the element can be interpreted as a valid age group label.

clean_age does not remove month or quarter labels, as this could result in ambiguity when different age groups use different units.

clean_age returns a character vector with the same length as x in which labels that have been parsed are translated to dem formats. clean_age_df returns a data frame with columns "input", "output", and "is_valid".

is_valid_age, clean_cohort, clean_period

## labels in a mix of formats, some of which cannot be parsed
x <- c("100 and over",
       "<10",
       "infants",
       "10 to 19 years",
       "infants",
       "untranslatable",
       "10-19",
       "100 quarters",
       "also untranslatable",
       "three months")
x
clean_age(x)
clean_age_df(x)

## 5-year age groups defined by starting age
x <- sample(seq(0, 80, 5))
x
clean_age(x)
clean_age_df(x)

## age groups commonly used by life tables
x <- sample(c(0, 1, seq(5, 80, 5)))
x
clean_age(x)
clean_age_df(x)