knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Obtaining a sample dataset

For those who need a more powerful method for crosswalking semi-large datasets and allocating fields based on the provided ratios, the crosswalk() function is available.

library(rhud)
options (digits=4)

sample <- data.frame(pop = c(151049, 103609, 52276),
                     county = c("24043", "24045", "24047"))

head(sample)

This sample dataset contains population information for the counties of Washington, Wicomico, and Worchester in Maryland in the year 2019 according to the US Census Bureau.




Merging with a crosswalk file

Lets say we wanted to know the population at a zipcode level (there is likely already a dataset for this), but utilizing the crosswalk files in this case can provide a little insight into its effectiveness as truth answers are available to compare.

In this case we care about using the county to zip code crosswalk measurements from the year 2019 in the first quarter of the year.

To merge the dataset with its associated crosswalk measurement use:

crosswalk(data = sample, geoid = "county", geoid_col = "county", 
          cw_geoid = "zip", cw_geoid_col = NA, method = NA, year = 2019
          , quarter = 1)

As seen in the data above, each zipcode associated with a county is assigned the same population value.




Merging and applying a crosswalk method

To utilize an allocation method provided by the crosswalk files, specify the method and cw_geoid_col arguments. In this case we want to allocate the county population levels to a zip code level using the method based on the ratio of residential addresses.

crosswalk(data = sample, geoid = "county", geoid_col = "county", 
          cw_geoid = "zip", cw_geoid_col = "pop", method = "res", year = 2019
          , quarter = 1)

As seen in the population field, it is now partitioned out based on the ratio of residential addresses for each zip code that comprises a county.



etam4260/rhud documentation built on Nov. 12, 2022, 2:53 a.m.