interpolate_pw | R Documentation |
A common use case when working with time-series small-area Census data is to transfer data from one set of shapes (e.g. 2010 Census tracts) to another set of shapes (e.g. 2020 Census tracts). Population-weighted interpolation is one solution to this problem: it takes into account the distribution of the population within a Census unit when transferring data between incongruent units.
interpolate_pw(
from,
to,
to_id = NULL,
extensive,
weights,
weight_column = NULL,
weight_placement = c("surface", "centroid"),
crs = NULL
)
from |
The spatial dataset from which numeric attributes will be interpolated to target zones. By default, all numeric columns in this dataset will be interpolated. |
to |
The target geometries (zones) to which numeric attributes will be interpolated. |
to_id |
(optional) An ID column in the target dataset to be retained in the output. For data obtained with tidycensus, this will be |
extensive |
if |
weights |
An input spatial dataset to be used as weights. If the dataset is not of geometry type |
weight_column |
(optional) a column in |
weight_placement |
(optional) One of |
crs |
(optional) The EPSG code of the output projected coordinate reference system (CRS). Useful as all input layers ( |
The approach implemented here is based on Esri's data apportionment algorithm, in which an "apportionment layer" of points (referred to here as the weights) is used to determine how to weight areas of overlap between origin and target zones. Users must supply a "from" dataset as an sf object (the dataset from which numeric columns will be interpolated) and a "to" dataset, also of class sf, that contains the target zones. A third sf object, the "weights", may be an object of geometry type POINT or polygons from which points will be derived using sf::st_point_on_surface().
An intersection is computed between from and to, and a spatial join is computed between the intersection layer and the weights layer, represented as points. A specified weight_column in weights will be used to determine the relative influence of each point on the allocation of values between from and to; if no weight column is specified, all points will be weighted equally.
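The allocation step described above can be illustrated with a small base R sketch. This is not the package's internal code: the data frame pieces, its columns, and the values in from_values are hypothetical, and the spatial intersection and join are assumed to have already produced one row per overlap between an origin zone and a target zone, with the summed weight of the points falling in it.

```r
# Hypothetical result of the intersection + spatial join:
# one row per overlap of a "from" zone and a "to" zone, with the
# summed weight (e.g. block population) of the points inside it.
pieces <- data.frame(
  from_id = c("A", "A", "B"),
  to_id   = c("X", "Y", "Y"),
  weight  = c(300, 100, 200),
  stringsAsFactors = FALSE
)

# Hypothetical counts attached to the origin zones.
from_values <- c(A = 80, B = 50)

# Total weight in each origin zone, then each piece's share of that total.
from_totals <- tapply(pieces$weight, pieces$from_id, sum)
pieces$share <- pieces$weight / as.numeric(from_totals[pieces$from_id])

# Allocate each origin value to its pieces and sum by target zone.
pieces$allocated <- as.numeric(from_values[pieces$from_id]) * pieces$share
result <- tapply(pieces$allocated, pieces$to_id, sum)
print(result)
```

In this sketch the total across target zones equals the total across origin zones, which is the property one would expect when transferring counts.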
The extensive parameter (logical) should correctly reflect the type of values being interpolated. If TRUE, the function returns a weighted sum for each zone. If FALSE, a weighted mean will be returned. For Census data, extensive = TRUE should be used for transferring counts / estimated counts between zones. Derived metrics (e.g. population density, percentages, etc.) should use extensive = FALSE. Margins of error in the ACS will not be transferred correctly with this function, so please use with caution.
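The difference between the two settings can be shown with a small base R sketch for a single target zone. It is an illustration under stated assumptions rather than the function's implementation: the data frame overlaps, its columns, and all values are hypothetical, and the weights are taken to be the summed weight of the points in each overlapping piece.

```r
# Hypothetical overlaps of two origin zones with one target zone.
overlaps <- data.frame(
  count   = c(80, 50),   # an extensive value (a count) in each origin zone
  percent = c(10, 40),   # an intensive value (a percentage) in each origin zone
  share   = c(0.25, 1),  # share of each origin zone's weight inside the target
  weight  = c(100, 200)  # summed weight of the points in each overlap
)

# Counts: a weighted sum, so each origin zone contributes its share.
extensive_result <- sum(overlaps$count * overlaps$share)

# Percentages: a weighted mean, so the result stays on the original scale.
intensive_result <- weighted.mean(overlaps$percent, overlaps$weight)

print(c(extensive = extensive_result, intensive = intensive_result))
```

Summing the percentages instead would give a value outside the range of the inputs, which is why derived metrics call for the weighted mean.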
A dataset of class sf with the geometries and an ID column from to (the target shapes) but with numeric attributes of from interpolated to those shapes.
## Not run:
# Example: interpolating work-from-home from 2011-2015 ACS
# to 2020 shapes
library(tidycensus)
library(tidyverse)
library(tigris)
options(tigris_use_cache = TRUE)
# Source data: 2011-2015 ACS estimates on the earlier tract boundaries
wfh_15 <- get_acs(
  geography = "tract",
  variables = "B08006_017",
  year = 2015,
  state = "AZ",
  county = "Maricopa",
  geometry = TRUE
) %>%
  select(estimate)

# Target zones: 2020 tract boundaries
wfh_20 <- get_acs(
  geography = "tract",
  variables = "B08006_017",
  year = 2020,
  state = "AZ",
  county = "Maricopa",
  geometry = TRUE
)

# Weights: 2020 Census blocks
maricopa_blocks <- blocks(
  "AZ",
  "Maricopa",
  year = 2020
)

# Transfer the estimated counts, weighting by block population
wfh_15_to_20 <- interpolate_pw(
  from = wfh_15,
  to = wfh_20,
  to_id = "GEOID",
  weights = maricopa_blocks,
  weight_column = "POP20",
  crs = 26949,
  extensive = TRUE
)
## End(Not run)