smoothHP
is an R package with a set of convenience functions and data for making population projections using a modified version of the Hamilton-Perry (HP) method outlined by Takashi Inoue 2016. Unlike the more prevalent cohort component method, the HP method allows for population projections to be made in the absence of fertility, mortality, and migration data, only relying on changes in population. The version utilized in this R package allows for hierarchical smoothing comparable to an empirical Bayes approach such that sub-population projections may more closely resemble their group average.
$$ ~{5}CCR^m{20}~=\frac{~{5}P^m{25,2015}}{~{5}P^m{20,2010}} $$
$$ \underbrace{~5^rSCCR^s_a}{\substack{\text{smoothed} \ \text{race CCR}}}~=\underbrace{~^rW}{\substack{\text{race} \ \text{weight}}}~\times\underbrace{~_5^rCCR^s_a}{\substack{\text{raw HP} \ \text{race CCR}}} ~+~\underbrace{~^r\Omega}{\substack{\text{county} \ \text{weight}}}~\times\underbrace{~_5CCR^s_a}{\substack{\text{raw HP} \ \text{county CCR}}} $$ $$ \begin{align} ~^rW~&=~\frac{\sqrt{~5^rP^s{a,y}}}{\sqrt{~5^rP^s{a,y}} + \sqrt{~5P^s{a,y}}} \ ~^r\Omega~&=~1 - ~^rW \end{align} $$ $$ \underbrace{~5^{rt}SSCCR^s_a}{\substack{\text{twice smoothed} \ \text{race-tract CCR}}}~=\underbrace{~^{rt}W}{\substack{\text{race-tract} \ \text{weight}}}~\times\underbrace{~_5^{rt}CCR^s_a}{\substack{\text{raw HP} \ \text{race-tract CCR}}} ~+~\underbrace{~^{rt}\Omega}{\substack{\text{race} \ \text{weight}}}~\times\underbrace{~_5^rSCCR^s_a}{\substack{\text{smoothed} \ \text{race CCR}}} $$ $$ \begin{align} ~^{rt}W~&=~\frac{\sqrt{~5^{rt}P^s{a,y}}}{\sqrt{~5^{rt}P^s{a,y}} + \sqrt{~5^rP^s{a,y}}} \ ~^{rt}\Omega~&=~1 - ~^{rt}W \end{align} $$
# Install directly from GitHub remotes::install_github("nmmarquez/smoothHP")
smoothHP
is designed to allow you to make projections testing multiple hierarchical smoothing strategies. To demonstrate this we use data from the Washington State Office of Fincance and Management official population estimates to make small area population projections for king county.
Currently smoothHP
requires strict input data requirements which are as follows
Sex
indicating either "Female" or "Male"Year
which is numeric, and contains at least two years that are five years apartAge5
which is a factor with 18 levels or integers 1 through 18 representing 5 year age groups from 0 to 85+value
representing a population estimates of the corresponding demographic group.To help with this process the package provides a utility function to assess compatibility with the package functions.
library(smoothHP) check_smooth_input(kc_pop_data)
The smoothHP
package is capable of making projections at multiple levels with various smoothing strategies using the main function multi_stage_group_HP_project
. Using the provided data lets say we were interested in making projections for all provided racial and ethnic groups without smoothing. The code would be the following
unsmooth_DF <- multi_stage_group_HP_project( kc_pop_data[kc_pop_data$Year %in% c(2010,2015)], stages = list(c("County", "Race")))
Here we provided the multi_stage_group_HP_project
a stages parameter which is a list of length 1 which the function interprets to only do only one layer of HP projections with no smoothing. If we instead we wanted to smooth race specific estimates to the county average we would provide a list of length 2 as seen below.
smooth_DF <- multi_stage_group_HP_project( kc_pop_data[kc_pop_data$Year %in% c(2010,2015)], stages = list("County", "Race"))
We can compare the smooth to the non-smoothed estimates visually to see the effects that the smoothing process had on the projections.
library(ggplot2) raceth_DF <- kc_pop_data[ ,.(value = sum(value)), by = list(Year, Sex, Race, Age5, County)] raceth_DF[,Type:="Data"] smooth_DF[,Type:="Smoothed"] unsmooth_DF[,Type:="Direct"] bind_rows(raceth_DF, smooth_DF, unsmooth_DF) %>% group_by(Year, Race, Type) %>% summarize(Population = sum(value) / 1000000) %>% ggplot(aes(x = Year, y = Population, color = Race, linetype = Type)) + geom_line() + theme_classic() + labs( y = "Population Estimate (in millions)", color = "Racial and Ethnic Categories") + ggtitle( "Population Forecasts by Race and Ethnicity", "Comparison of Smoothing Approaches")
Direct estimates using the HP method seem to provide implausible results in this case while using a smoothing approach gives us more reasonable growth projections for each group.
We may also want to do additional layers of smoothing to make race and ethnicity specific estimates at smaller areas (and subsequently smaller population estimates) which may be more unstable with direct estimates. Here again we demonstrate two approaches, one where we look at each small area estimate racial and ethic group and smooth them to the county population totals and another which introduces a two tiered smoothing approach. In this case rather than looking directly at the projections we instead examine the components of the projections, specifically the CWR estimate of the HP method.
anlyz_groups <- c( "Asian", "White", "AIAN" ) smooth_results <- bind_rows( multi_stage_CCR_estimates( kc_pop_data[kc_pop_data$Year %in% c(2010,2015)], stages = list(c("County", "Race", "GEOID"))) %>% mutate(Method = "HP Method"), multi_stage_CCR_estimates( kc_pop_data[kc_pop_data$Year %in% c(2010,2015)], stages = list("County", c("Race", "GEOID"))) %>% mutate(Method = "Single Stage\nSmoothing"), multi_stage_CCR_estimates( kc_pop_data[kc_pop_data$Year %in% c(2010,2015)], stages = list("County", "Race", "GEOID")) %>% mutate(Method = "Multi Stage\nSmoothing")) smooth_results %>% filter(Sex == "Male" & Race %in% anlyz_groups & Age5 == "25-29") %>% filter(is.finite(CCR)) %>% filter(Method != "HP Method") %>% ggplot(aes(x = CCR, fill = Race, color = Race)) + geom_density(alpha = .4) + facet_wrap(~Method, scales = "free_y") + theme_classic() + labs(x = "Cohort Change Ratio", y = "Density") + ggtitle( "Distribution of Male CCR Estimates by Race", "Comparison of Smoothing Approaches")
In this example by using a one stage smoothing approach, we may over-smooth estimates of CWR to the county average while a two stage approach allows more variation between racial and ethnic groups.
All of the functions provided by the smoothHP
package allow you to do any number of layers of smoothing for Projections or CWR and CCR estimates directly.
The development of this R
package and the forthcoming research articles associated with it were made possible by financial support from the following organizations.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.