| interpolate_pw | R Documentation |
A common use case when working with time-series small-area Census data is transferring data from one set of shapes (e.g. 2010 Census tracts) to another set of shapes (e.g. 2020 Census tracts). Population-weighted interpolation is one solution to this problem: it uses the distribution of the population within a Census unit to transfer data between incongruent units.
interpolate_pw(
from,
to,
to_id = NULL,
extensive,
weights,
weight_column = NULL,
weight_placement = c("surface", "centroid"),
crs = NULL
)
from |
The spatial dataset from which numeric attributes will be interpolated to target zones. By default, all numeric columns in this dataset will be interpolated. |
to |
The target geometries (zones) to which numeric attributes will be interpolated. |
to_id |
(optional) An ID column in the target dataset to be retained in the output. For data obtained with tidycensus, this will be |
extensive |
if |
weights |
An input spatial dataset to be used as weights. If the dataset is not of geometry type |
weight_column |
(optional) a column in |
weight_placement |
(optional) One of |
crs |
(optional) The EPSG code of the output projected coordinate reference system (CRS). Useful as all input layers ( |
The approach implemented here is based on Esri's data apportionment algorithm, in which an "apportionment layer" of points (referred to here as the weights) is used to determine how to weight areas of overlap between origin and target zones. Users must supply a "from" dataset as an sf object (the dataset from which numeric columns will be interpolated) and a "to" dataset, also of class sf, that contains the target zones. A third sf object, the "weights", may be an object of geometry type POINT or polygons from which points will be derived using sf::st_point_on_surface().
An intersection is computed between from and to, and a spatial join is computed between the intersection layer and the weights layer, represented as points. A specified weight_column in weights will be used to determine the relative influence of each point on the allocation of values between from and to; if no weight column is specified, all points will be weighted equally.
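The allocation step described above can be illustrated with a small base R sketch. It is not the package's implementation: the toy table `pts` stands in for the result of the intersection and spatial join (each weight point is already tagged with the origin and target zone it falls in), and the helper `piece_shares()` and all values are hypothetical.

```r
# Toy weight points (e.g. blocks), each already tagged with the origin zone
# (from_id) and target zone (to_id) it falls in. All values are made up.
pts <- data.frame(
  from_id = c("A", "A", "A", "B", "B"),
  to_id   = c("X", "X", "Y", "Y", "Y"),
  weight  = c(100, 50, 50, 30, 70)
)

# Hypothetical helper: sum the point weights in each from/to overlap piece,
# then express each piece as a share of its origin zone's total weight.
piece_shares <- function(pts) {
  pieces <- aggregate(weight ~ from_id + to_id, data = pts, FUN = sum)
  pieces$share <- pieces$weight / ave(pieces$weight, pieces$from_id, FUN = sum)
  pieces
}

# With a weight column: points influence the allocation by their weight
weighted <- piece_shares(pts)

# Without a weight column: every point counts equally (weight = 1)
equal <- piece_shares(transform(pts, weight = 1))

print(weighted)
print(equal)
```

In this toy case, zone A sends three quarters of its weight to X when points are weighted, but only two thirds when all points count equally, which is why the choice of `weight_column` can change the result.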
The extensive parameter (logical) should match the type of values being interpolated. If TRUE, the function returns a weighted sum for each zone. If FALSE, a weighted mean is returned. For Census data, use extensive = TRUE when transferring counts or estimated counts between zones. Derived metrics (e.g. population density, percentages) should use extensive = FALSE. Margins of error in the ACS will not be transferred correctly with this function, so please use it with caution.
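The difference between the two settings can be shown with a small base R sketch. This is an illustration under stated assumptions rather than the package's code: the `pieces` table, the zone names, and the values in `count` and `rate` are all made up, and the weighted mean is assumed to weight each overlap piece by its share of the target zone's total weight.

```r
# Toy overlap pieces between origin zones (A, B) and target zones (X, Y),
# with the summed weight (e.g. block population) in each piece.
pieces <- data.frame(
  from_id = c("A", "A", "B"),
  to_id   = c("X", "Y", "Y"),
  weight  = c(150, 50, 100)
)

count <- c(A = 200, B = 80)  # extensive variable (a count)
rate  <- c(A = 10,  B = 40)  # intensive variable (e.g. a percentage)

# extensive = TRUE analogue: split each origin value by the piece's share of
# the ORIGIN zone's weight, then sum the allocations within each target zone.
share_of_from <- pieces$weight / ave(pieces$weight, pieces$from_id, FUN = sum)
ext <- tapply(count[pieces$from_id] * share_of_from, pieces$to_id, sum)

# extensive = FALSE analogue: average the origin values within each target
# zone, weighting each piece by its share of the TARGET zone's weight.
share_of_to <- pieces$weight / ave(pieces$weight, pieces$to_id, FUN = sum)
int <- tapply(rate[pieces$from_id] * share_of_to, pieces$to_id, sum)

print(ext)
print(int)
```

The weighted sum preserves the overall total (here 280 across both target zones), while the weighted mean keeps each target value within the range of the origin values that contribute to it.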
A dataset of class sf with the geometries and an ID column from to (the target shapes) but with numeric attributes of from interpolated to those shapes.
## Not run:
# Example: interpolating work-from-home from 2011-2015 ACS
# to 2020 shapes
library(tidycensus)
library(tidyverse)
library(tigris)
options(tigris_use_cache = TRUE)

wfh_15 <- get_acs(
  geography = "tract",
  variables = "B08006_017",
  year = 2015,
  state = "AZ",
  county = "Maricopa",
  geometry = TRUE
) %>%
  select(estimate)

wfh_20 <- get_acs(
  geography = "tract",
  variables = "B08006_017",
  year = 2020,
  state = "AZ",
  county = "Maricopa",
  geometry = TRUE
)

maricopa_blocks <- blocks(
  "AZ",
  "Maricopa",
  year = 2020
)

wfh_15_to_20 <- interpolate_pw(
  from = wfh_15,
  to = wfh_20,
  to_id = "GEOID",
  weights = maricopa_blocks,
  weight_column = "POP20",
  crs = 26949,
  extensive = TRUE
)
## End(Not run)