In hrecht/censusapi: Retrieve Data from the Census APIs

library(censusapi)
knitr::opts_chunk$set(message = FALSE, warning = FALSE)

This package provides basic support for the Census Bureau's new microdata APIs, using the same getCensus() function used for summary data. Getting the data with getCensus() is easy. Using it responsibly takes some homework.

About microdata

Microdata contains individual-level responses: one row per person. It is a vital tool for custom analysis, but with great power comes great responsibility. You must weight the individual-level responses appropriately. You'll often need to work with household relationships, and you'll need to handle responses that aren't in the universe of the question (for example, removing children from an analysis of college graduation rates).
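To make the weighting and universe points concrete, here is a minimal sketch in base R. The data frame, its column names, and the age cutoff of 25 are all invented for illustration and are not taken from any Census dataset.

```r
# Hypothetical person-level records; every value here is made up for illustration.
people <- data.frame(
  age     = c(8, 34, 52, 29, 71, 15),
  college = c(NA, 1, 0, 1, 0, NA),   # 1 = graduated, 0 = did not, NA = not asked
  weight  = c(1200, 900, 1500, 600, 2000, 1100)
)

# Keep only people in the universe of the question (assumed here to be ages 25+).
adults <- people[people$age >= 25, ]

# The unweighted share treats every respondent equally.
unweighted_rate <- mean(adults$college)

# The weighted share lets each respondent stand in for the people they represent.
weighted_rate <- weighted.mean(adults$college, w = adults$weight)

unweighted_rate
weighted_rate
```

In this toy example the two rates differ noticeably, which is why skipping the weights can produce misleading estimates.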

If you're new to working with microdata you'll need to do some reading before diving in. Here are some resources from the Census Bureau:

What is microdata and why should I use it? (video and transcript)
Census Microdata API User Guide (pdf)
Microdata API documentation

As for all other endpoints, censusapi retrieves the data so that you can perform your own analysis using your methodology of choice. If you're looking for an interactive microdata analysis tool, try the data.census.gov microdata interactive tool or the IPUMS online data analysis tool.

Once you've learned how to use microdata and gained an understanding of weighting, getting the data using censusapi is simple.

Getting microdata with censusapi

As an example, we'll get data from the 2020 Current Population Survey Voting Supplement. This survey asks people if they voted, how, and when, and includes useful demographic data.

See the available variables:

voting_vars <- listCensusMetadata(
    name = "cps/voting/nov",
    vintage = 2020,
    type = "variables")
head(voting_vars)

From the CPS Voting supplement, get data on method of voting in New York state using PES5 (Vote in person or by mail?) and PESEX (gender), along with the appropriate weighting variable, PWSSWGT. We'll only get data for people with a response of 1 (yes) to PES1 (Did you vote?).

cps_voting <- getCensus(
    name = "cps/voting/nov",
    vintage = 2020,
    vars = c("PES5", "PESEX", "PWSSWGT"),
    region = "state:36",
    PES1 = 1)
head(cps_voting)
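Once the data is retrieved, a weighted tabulation might look like the sketch below. It uses an invented stand-in data frame rather than a live API call, and it assumes that PES5 codes 1 and 2 are the substantive answers and that other codes (such as -1) mark non-responses or people out of universe. Check the data dictionary before relying on those codes.

```r
# Stand-in for the result of getCensus(); these rows are invented, not real CPS data.
cps_sample <- data.frame(
  PES5    = c("1", "2", "2", "1", "2", "-1"),
  PESEX   = c("1", "1", "2", "2", "2", "1"),
  PWSSWGT = c("1000", "3000", "1000", "2000", "1000", "800"),
  stringsAsFactors = FALSE
)

# Coerce to numeric in case the columns arrive as text.
cps_sample[] <- lapply(cps_sample, as.numeric)

# Keep only the substantive answers (assumed here to be codes 1 and 2).
answered <- cps_sample[cps_sample$PES5 %in% c(1, 2), ]

# Sum the person weights within each sex-by-method cell to estimate people, not rows.
weighted_counts <- xtabs(PWSSWGT ~ PESEX + PES5, data = answered)

# Convert to shares within each sex, so each row sums to 1.
weighted_shares <- prop.table(weighted_counts, margin = 1)

weighted_counts
round(weighted_shares, 3)
```

Summing weights with xtabs() rather than counting rows is the key step: each cell estimates a number of people in the population, not a number of survey respondents.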

Making a data dictionary

Most microdata variables are encoded, which means that your data will have a lot of numbers instead of text labels.

A data dictionary, which includes the definitions and labels for every variable in the dataset, is helpful. Calling listCensusMetadata() with include_values = TRUE returns a data dictionary with one row for each variable-label pair. That means if there are 30 codes for a given variable, it will have 30 rows in the data dictionary. Variables that don't have value labels in the metadata will have only one row.

voting_dict <- listCensusMetadata(
    name = "cps/voting/nov",
    vintage = 2020,
    type = "variables",
    include_values = TRUE)
head(voting_dict)
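The one-row-per-pair layout can be explored with ordinary data frame operations. The sketch below uses a small invented dictionary. The column names (name, values_code, values_label) and the label text are assumptions made for illustration, so inspect names(voting_dict) to see what your result actually contains.

```r
# Hypothetical slice of a data dictionary; column names and labels are assumptions.
dict_sample <- data.frame(
  name         = c("PES5", "PES5", "PESEX", "PESEX", "PWSSWGT"),
  values_code  = c("1", "2", "1", "2", NA),
  values_label = c("In person", "By mail", "Male", "Female", NA),
  stringsAsFactors = FALSE
)

# One row per variable-label pair, so counting rows per variable counts its codes.
rows_per_variable <- table(dict_sample$name)

# Pull the code-to-label rows for a single variable.
pesex_rows <- dict_sample[dict_sample$name == "PESEX", ]

rows_per_variable
pesex_rows
```

Here the weight variable has no value labels, so it appears once, while each coded variable appears once per code.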

You can also look up the meaning of those codes for a single variable using the same function, listCensusMetadata(). Here are the values of PES5, the variable for "Vote in person or by mail?":

PES5_values <- listCensusMetadata(
    name = "cps/voting/nov",
    vintage = 2020,
    type = "values",
    variable = "PES5")
PES5_values
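A code listing like this can be used to attach readable labels to the coded responses. The following is a sketch only: the lookup table is built by hand, and its column names (code, label) and label text are assumptions rather than the confirmed shape of the listCensusMetadata() result.

```r
# Hypothetical lookup table shaped like a code/label listing; the column names
# and label text are assumptions, so check them against your own metadata result.
pes5_lookup <- data.frame(
  code  = c("1", "2"),
  label = c("In person", "By mail"),
  stringsAsFactors = FALSE
)

# Made-up coded responses standing in for a PES5 column.
responses <- data.frame(PES5 = c("2", "1", "2", "-1"), stringsAsFactors = FALSE)

# match() finds each code's row in the lookup; codes with no match become NA.
responses$PES5_label <- pes5_lookup$label[match(responses$PES5, pes5_lookup$code)]

responses
```

Using match() keeps the original row order and leaves unlabeled codes as NA, which makes out-of-universe or unexpected codes easy to spot.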

Other ways to access microdata

The Census Bureau microdata APIs are helpful for working with a limited number of just-released datasets. But they're not your only option. Some other ways to get microdata are:

Retrieve standardized, cleaned microdata from IPUMS and import it with the ipumsr package. IPUMS is widely used in research when the data needed is not brand new. I highly recommend that you check out IPUMS' cleaned microdata files as well as its historic geographic data. These standardized files are generally released months to a year after the raw microdata that is available directly from the Census Bureau.
Download complete bulk files from the Census FTP (File Transfer Protocol) site. This is helpful if you need a large number of variables. You might run into size limitations when getting many variables through the APIs.
Retrieve American Community Survey microdata via the Census APIs with tidycensus, which has helpful functions for working with those endpoints.

hrecht/censusapi documentation built on June 13, 2025, 8:41 a.m.
