get_data_pums | R Documentation |
Retrieves American Community Survey (ACS) Public Use Microdata Sample (PUMS) data from storage. Can return person-level, household-level, or combined records with appropriate survey weights applied.
get_data_pums(cols = NULL, year = NULL, kingco = TRUE, records = "person")
cols |
Character vector specifying which columns to include in the
returned data. If NULL, all columns will be included. Note that survey weight
columns (wgtp/pwgtp) and chi_year are always included regardless of selection.
Defaults to NULL.
year |
Integer vector specifying which years to include in the data.
Can be either a single year for 1-year estimates or five consecutive years
for 5-year estimates. If NULL, the most recent single year available will be used.
Note that 2020 is not available due to COVID-19 pandemic survey disruptions.
Defaults to NULL.
kingco |
Logical indicating whether to restrict the data to King County
records only. Defaults to TRUE.
records |
Character string specifying whether to return person-level,
household-level, or combined records. Must be one of "person", "household",
or "combined". When "combined" is selected, person and household records are
merged using the household identifier (serialno), and the survey design is set
for person-level analyses. Defaults to "person".
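The rules for the year argument can be illustrated with a small base R check. This is only a sketch: check_pums_years() is a hypothetical helper, not part of the package, and it assumes that the 2020 restriction applies to single-year requests only (the 5-year example later on this page spans 2020).

```r
# Hypothetical helper (not part of the package): classifies a `year` argument
# according to the rules described above.
check_pums_years <- function(year) {
  if (is.null(year)) return("most recent single year")
  year <- sort(as.integer(year))
  if (length(year) == 1L) {
    # Assumption: only the 2020 1-year file is unavailable
    if (year == 2020L) stop("2020 1-year data are not available")
    return("1-year")
  }
  if (length(year) == 5L && all(diff(year) == 1L)) return("5-year")
  stop("`year` must be a single year or five consecutive years")
}

check_pums_years(2022)       # "1-year"
check_pums_years(2018:2022)  # "5-year"
```

How get_data_pums itself validates year may differ; the sketch only restates the documented constraints.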
The function automatically applies the appropriate survey weights (person or
household) based on the records parameter. For person-level and combined
records, it uses the person weight (pwgtp) and its replicate weights. For
household-level records, it uses the household weight (wgtp) and its
replicate weights.
The function uses the JK1 (jackknife) method for variance estimation with 80 replicate weights, following Census Bureau recommendations for PUMS data.
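To make the replicate-weight idea concrete, here is a minimal base R sketch on simulated data. It uses the standard error formula from the Census Bureau's PUMS accuracy documentation, sqrt(4/80 * sum((replicate estimate - full estimate)^2)). Whether get_data_pums applies exactly this scaling internally is an assumption, and the weights below are simulated, not real PUMS values.

```r
set.seed(1)
n <- 200
x <- rbinom(n, 1, 0.3)        # simulated 0/1 indicator for each person
pwgtp <- runif(n, 50, 150)    # simulated full-sample person weight

# Simulated stand-ins for the 80 replicate weights (an n x 80 matrix)
rep_wts <- sapply(1:80, function(r) pwgtp * runif(n, 0.5, 1.5))

est <- sum(pwgtp * x)             # full-sample weighted total
rep_est <- colSums(rep_wts * x)   # the same total under each replicate weight

# Replicate-based standard error with the 4/80 scaling factor
se <- sqrt(4 / 80 * sum((rep_est - est)^2))
```

In practice the returned object carries the replicate weights with it, so this calculation should not need to be done by hand.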
When you select records = "combined", household-level variables with
the same names as person-level variables are given a '_hh' suffix to
distinguish them. You are strongly encouraged to review the Census Bureau's
ACS PUMS documentation if you plan to set records = "combined".
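The suffix behavior can be sketched with a base R merge on toy tables. This is an illustration of the naming convention only, not the package's actual merge code; the columns other than serialno are made up for the example.

```r
# Toy person and household tables sharing one non-key column name ("type")
person <- data.frame(
  serialno = c("A", "A", "B"),
  agep     = c(34, 6, 71),
  type     = c(1, 1, 1)
)
household <- data.frame(
  serialno = c("A", "B"),
  np       = c(2, 1),
  type     = c(1, 1)
)

# Join on the household identifier; clashing household columns get "_hh"
combined <- merge(person, household, by = "serialno", suffixes = c("", "_hh"))

nrow(combined)                   # 3, one row per person
"type_hh" %in% names(combined)   # TRUE
```

Each household row is repeated for every person in that household, which is why combined data are suited to person-level rather than household-level estimates.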
Returns a survey-weighted dtsurvey/data.table object with the specified
columns and years that is ready for use with calc.
For information regarding the ACS PUMS ETL process, file locations, data dictionaries, etc., see: https://github.com/PHSKC-APDE/svy_acs
# Get person-level data for specific columns from the most recent year
pums_person <- get_data_pums(
cols = c("agep", "race4"),
kingco = TRUE
)
# Get household-level data for a 5-year period
pums_households <- get_data_pums(
year = 2018:2022,
records = "household"
)
# Get combined person-household level data for WA State in 2022
pums_combo <- get_data_pums(
year = 2022,
records = "combined",
kingco = FALSE
)