knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" ) commanism <- function(.) { scales::label_comma(accuracy = 1)(.) } set.seed(42)
This package provides an R interface to the United States Census Bureau's data API. The primary focus, as per its name, is pulling information from the detailed tables of the American Community Survey.
You can install the development version from GitHub with:
# install.packages("remotes") remotes::install_github("higherX4Racine/hercacstables")
The main way that one uses hercacstables
to interact with the Census API is
the fetch_data()
function.
fetch_data()
Here is a modestly complicated use of the function without any setup or post-processing.
#| label: fetch-pops-and-households #| cache: true POPS_AND_HOUSEHOLDS <- hercacstables::fetch_data( # the API works one year at a time year = hercacstables::most_recent_vintage("acs", "acs1"), # the API can only query one data source at a time survey_type = "acs", # the American Community Survey table_or_survey_code = "acs1", # the 1-year dataset of the ACS # the API fetches values for one or more instances of a specific geography for_geo = "state", # fetch values for entire states for_items = c( "11", # the District of Columbia "72" # Puerto Rico ), # the codes for the specific data that we are fetching variables = c( "NAME", # the geographic area's name "B01001_001E", # the total number of people "B11002_002E", # people in family households "B11002_012E" # people in nonfamily households ) )
#| label: show-pops-and-households #| echo: false POPS_AND_HOUSEHOLDS |> dplyr::mutate( Value = commanism(.data$Value) ) |> knitr::kable( align = "lrlrrr" )
The previous example is good enough for a README file. An approach centered on reusability is better for actual practice.
People working with the Census often know the topic that they are interested in,
but not the specific variables that provide information about that topic.
The hercacstables
package has utilities that help you search for Census
variables if you don't already know which ones you want to use.
A good first step is to search for tables that could be relevant with the
search_in_columns()
function and the built-in
METADATA_FOR_ACS_GROUPS
.
#| label: search-in-glossary EDUCATION_TABLES <- hercacstables::search_in_columns( hercacstables::METADATA_FOR_ACS_GROUPS, Group = "\\d$", # "Group" values need to end in a digit. Universe = "population", # "Universe" values need to include "population". Description = "enroll", # "Description values need to include "enroll" `-Description` = c("sex", # but not "sex", that's too much detail "computer", # or "computer", also too detailed "quarters", # or "quarters", also too detailed "insur", # or "insur", not health care enrollments "allocation") # or "allocation", these are metadata )
#| label: show-glossary_search-results #| echo: false knitr::kable(EDUCATION_TABLES)
Once you know the group that you are interested in, you will want to see what
information is captured in each of its rows.
The unpack_group_details()
function searches the built-in
METADATA_FOR_ACS_VARIABLES
for all of the variables of one group.
It then expands the Census's description of each variable into columns.
Users will need to identify the concepts that are captured by each column.
The following example unpacks the variables in the B14001 table.
#| label: unpack-B14001 UNPACKED_B14001 <- hercacstables::unpack_group_details("B14001")
#| label: show-B14001 #| echo: false UNPACKED_B14001 |> dplyr::filter(.data$Dataset == "ACS1") |> dplyr::select(!"Dataset") |> knitr::kable()
The package also includes functions for common special cases.
They are wrappers for fetch_data()
with some arguments hard-coded.
One example is pulling trends of racial/ethnic populations from the decennial census for some specific geographical units.
#| label: fetch-decennial-pops-by-race #| cache: true POPS_BY_RACE <- hercacstables::fetch_decennial_pops_by_race( for_geo = "state", # one cannot fetch the whole nation from 2000 or 2010 for_items = "*" # so we pull the data for every state ) |> dplyr::count( # then compute the nationwide populations for each vintage .data$`Race/Ethnicity`, .data$Vintage, wt = .data$Population, name = "Population" ) |> tidyr::pivot_wider( # and reshape the table for display names_from = "Vintage", values_from = "Population" )
#| label: show-decennial-pops-by-race #| echo: false POPS_BY_RACE |> dplyr::mutate( dplyr::across(tidyselect::ends_with("0"), commanism) ) |> knitr::kable( align = "lrrr" )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.