knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" )
This R data package intends to store 2010 Census ZIP Code Tabulation Area (ZCTA) Relationship files. So far it includes:
official US Census "2010 ZCTA to County Relationship File" (zcta_county_rel_10.rda
)
official US Census "2010 ZCTA to Tract Relationship File" (zcta_tract_rel_10.rda
)
ZIP Code to ZCTA crosswalk table developed by John Snow, Inc. (zipzcta.rda
)
Linking the USPS's ZIP codes to US counties is tedious:
ZIP codes do not resemble spatial entities; they are created for delivering mails. ZIP codes also change over time.
To get a truly spatial representations of ZIP codes, the US Census Bureau develops the concept of ZIP Code tabulation areas (ZCTAs), which approximates ZIP codes. But
Census does not release an official crosswalk between ZIP codes and ZCTAs.
Census does release relationship files between ZCTAs and counties, but at least 25% of the ZCTAs cannot be uniquely linked to counties.
A proposed solution: ZIP codes -> ZCTAs -> counties. This package contains data for connecting these links using official US Census relationship files and ZIP-to-ZCTA crosswalk files created by John Snow, Inc.
2010 ZIP Code Tabulation Area (ZCTA) Relationship File Layouts and Contents
UDS Mapper hosts the ZCTA-ZIP crosswalk file, which should be able to link ZIP codes to 2010 Census ZCTAs. Newer crosswalks are also available from UDS Mapper.
# install.packages("devtools") devtools::install_github("jjchern/zcta")
# devtools::install_github("jjchern/gaze") library(tidyverse) # ZCTA to counties zcta::zcta_county_rel_10 # ZIP codes to ZCTAs zcta::zipzcta # Show variable labels, and whether value label exists for certain variables # devtools::install_github("larmarange/labelled") labelled::var_label(zcta::zcta_county_rel_10) # Total number of zcta records nrow(zcta::zcta_county_rel_10) # Number of distinct zcta zcta::zcta_county_rel_10 %>% distinct(zcta5) %>% nrow() # In most instances the ZCTA code is the same as the ZIP Code for an area # But some zctas fall in more than one county # For example, there're 7060 zctas fall in 2 counties zcta::zcta_county_rel_10 %>% group_by(zcta5) %>% summarise(`Num of counties` = n()) %>% group_by(`Num of counties`) %>% summarise(`Num of zctas` = n()) # To get an one-to-one relationship between zcta and county, assign county to # a zcta if the zcta has the most population. For Example: # Before: zcta 601 fall in county 72001 and 72141 zcta::zcta_county_rel_10 %>% select(zcta5, state, county, geoid, poppt, zpoppct) # After: relate zcta 601 only to county 72001 as it accounts for 99.43% of the population one_to_one_pop <- zcta::zcta_county_rel_10 %>% select(zcta5, state, county, geoid, poppt, zpoppct) %>% group_by(zcta5) %>% slice(which.max(zpoppct)) %>% ungroup() one_to_one_pop # Or assign county to a zcta if the zcta accounts for most of the area. one_to_one_area <- zcta::zcta_county_rel_10 %>% select(zcta5, state, county, geoid, poppt, zpoppct, zareapct) %>% group_by(zcta5) %>% slice(which.max(zareapct)) %>% ungroup() one_to_one_area # Using either of these ZCTA-to-county tables, you can go from ZIP codes to ZCTAs to county zipcounty <- zcta::zipzcta %>% left_join(one_to_one_area, by = c("zcta" = "zcta5")) %>% select(zip, zcta, state = state.x, countygeoid = geoid) %>% arrange(zip) zipcounty # Merge the two 1 to 1 relationship datasets and identify zctas that have different county match one_to_one_pop %>% left_join(one_to_one_area, by = "zcta5") %>% select(zcta5, county.x, geoid.x, zpoppct.x, county.y, geoid.y, zpoppct.y) %>% filter(geoid.x != geoid.y) # Get county names for the 1 to 1 relationship dataset # Also keep just states and DC one_to_one_pop %>% mutate(geoid = as.integer(geoid)) %>% left_join(gaze::county10, by = "geoid") %>% select(zcta5, state, usps, county, geoid, name) %>% filter(state <= 56)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.