census_helper_new: Census helper function.

View source: R/census_helper_v2.R

census_helper_new    R Documentation

Census helper function.

Description

census_helper_new links a user-input dataset with Census geographic data.

Usage

census_helper_new(
  key = Sys.getenv("CENSUS_API_KEY"),
  voter.file,
  states = "all",
  geo = c("tract", "block", "block_group", "county", "place", "zcta"),
  age = FALSE,
  sex = FALSE,
  year = "2020",
  census.data = NULL,
  retry = 3,
  use.counties = FALSE,
  skip_bad_geos = FALSE
)

Arguments

key

A character string containing a valid Census API key, which can be requested from the U.S. Census API key signup page.

By default, attempts to find a census key stored in an environment variable named CENSUS_API_KEY.
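
As a minimal sketch of the default lookup, the key can be placed in the environment variable before the function is called. The key value below is a placeholder for illustration, not a real key.

```r
# Placeholder value for illustration only; substitute your own key.
Sys.setenv(CENSUS_API_KEY = "your-census-api-key")

# This is the same lookup the default argument performs.
key <- Sys.getenv("CENSUS_API_KEY")

# Sys.getenv() returns "" for an unset variable, so check before any API call.
key_is_set <- nzchar(key)
if (!key_is_set) stop("CENSUS_API_KEY is not set")
```

Setting the variable in a startup file such as .Renviron keeps the key out of scripts.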

voter.file

An object of class data.frame. Must contain field(s) named county, tract, block, and/or place specifying geolocation. These should be character variables that match up with U.S. Census categories. County should be three characters (e.g., "031" not "31"), tract should be six characters, and block should be four characters. Place should be five characters if it is included.
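
Geocodes read from a spreadsheet or CSV are often stored as numbers, which drops leading zeros. The sketch below restores the widths described above using base R; the data frame, its id column, and its values are made up for illustration.

```r
# Toy data: hypothetical geocodes stored as numbers, so leading zeros are lost.
voters <- data.frame(
  id     = 1:2,
  county = c(21, 21),
  tract  = c(4000, 4501),
  block  = c(3001, 1025)
)

# Re-pad each code to the required width: county 3, tract 6, block 4.
voters$county <- sprintf("%03d", as.integer(voters$county))
voters$tract  <- sprintf("%06d", as.integer(voters$tract))
voters$block  <- sprintf("%04d", as.integer(voters$block))

# Confirm every code has the expected number of characters.
stopifnot(
  all(nchar(voters$county) == 3),
  all(nchar(voters$tract) == 6),
  all(nchar(voters$block) == 4)
)
```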

states

A character vector specifying which states to extract Census data for, e.g. c("NJ", "NY"). Default is "all", which extracts Census data for all states contained in user-input data.

geo

A character object specifying what aggregation level to use. Use "county", "tract", "block", or "place"; the usage signature also lists "block_group" and "zcta". Default is "tract". Warning: extracting block-level data takes a very long time.

age

A TRUE/FALSE object indicating whether to condition on age. If FALSE (default), the function returns Pr(Geolocation | Race). If TRUE, the function returns Pr(Geolocation, Age | Race). If sex is also TRUE, the function returns Pr(Geolocation, Age, Sex | Race).

sex

A TRUE/FALSE object indicating whether to condition on sex. If FALSE (default), the function returns Pr(Geolocation | Race). If TRUE, the function returns Pr(Geolocation, Sex | Race). If age is also TRUE, the function returns Pr(Geolocation, Age, Sex | Race).

year

A character object specifying the year of U.S. Census data to be downloaded. Use "2010" or "2020". Default is "2020".

census.data

An optional census object of class list containing pre-saved Census geographic data. It can be created using the get_census_data function. If census.data is provided, its year element must have the same value as the year option specified in this function (i.e., "2010" in both or "2020" in both), and its age and sex elements must be FALSE. This corresponds to the defaults of census_geo_api. If census.data is missing, Census geographic data will be obtained via the Census API.
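
The constraints on census.data can be checked before the object is passed in. The sketch below is a hypothetical helper, not part of the package. It assumes the object holds one list entry per state, each with year, age, and sex elements, and it uses a mock object in place of real saved data.

```r
# Hypothetical helper, not part of the package: checks the constraints above
# on a pre-saved census object before it is passed as census.data.
# Assumes one list entry per state, each with year, age, and sex elements.
check_census_data <- function(census.data, year) {
  for (st in names(census.data)) {
    entry <- census.data[[st]]
    if (!identical(as.character(entry$year), as.character(year))) {
      stop("year mismatch for state ", st)
    }
    if (isTRUE(entry$age) || isTRUE(entry$sex)) {
      stop("age and sex must be FALSE for state ", st)
    }
  }
  invisible(TRUE)
}

# Mock object standing in for saved output; a real object also holds the data.
mock_census <- list(
  NJ = list(state = "NJ", year = "2020", age = FALSE, sex = FALSE)
)

ok <- check_census_data(mock_census, year = "2020")
mismatch <- tryCatch(
  check_census_data(mock_census, year = "2010"),
  error = function(e) conditionMessage(e)
)
```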

retry

The number of times to retry the request to the Census website if a network interruption occurs.
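
The general retry pattern can be sketched in base R as follows. The helper with_retry and the simulated failure are hypothetical and do not reflect the package's implementation. In particular, treating retry as the number of attempts after the first one is an assumption.

```r
# Hypothetical helper, not the package's implementation: call f() and retry
# up to `retry` more times if it signals an error.
with_retry <- function(f, retry = 3) {
  for (attempt in seq_len(retry + 1)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
  }
  stop("all attempts failed: ", conditionMessage(result))
}

# Stand-in for a network request that fails twice and then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated network interruption")
  "ok"
}

out <- with_retry(flaky, retry = 3)
```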

use.counties

A logical, defaulting to FALSE. Should census data be filtered by counties available in census.data?

skip_bad_geos

Logical. If TRUE, the function skips any geolocations that are not present in the census data and returns a partial data set. Default is FALSE, in which case the function stops with an error message listing the offending geolocations.
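
The two behaviors can be illustrated with a base R sketch on toy data. It is not the package's code, and the data frames, the id column, and the key format are assumptions made for illustration.

```r
# Toy inputs: hypothetical voter rows and the tracts found in the census data.
voters <- data.frame(
  id     = 1:3,
  county = c("021", "021", "999"),
  tract  = c("004000", "004501", "000100")
)
census_tracts <- data.frame(
  county = c("021", "021"),
  tract  = c("004000", "004501")
)

# Build a key per row and flag rows with no match in the census data.
voter_key  <- paste(voters$county, voters$tract, sep = "-")
census_key <- paste(census_tracts$county, census_tracts$tract, sep = "-")
bad <- !(voter_key %in% census_key)

skip_bad_geos <- TRUE  # mirrors the argument; FALSE would stop with an error
if (any(bad) && !skip_bad_geos) {
  stop("Geolocations not in census data: ",
       paste(unique(voter_key[bad]), collapse = ", "))
}

# Partial data set: only rows whose geolocation was found.
kept <- voters[!bad, ]
```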

Details

This function allows users to link their geocoded dataset (e.g., voter file) with U.S. Census data (2010 or 2020). The function extracts Census Summary File data at the county, tract, block, or place level. The quantity computed from the Census data is Pr(Geolocation | Race), where geolocation is county, tract, block, or place.
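
As a worked illustration of Pr(Geolocation | Race), the sketch below divides each geolocation's count for a race group by that group's total. The counts and group names are made up, and the package's actual computation from Summary File tables may differ in its details.

```r
# Toy counts (made up for illustration): residents by tract and race group.
counts <- matrix(
  c(100,  50,
    300,  50,
    600, 100),
  nrow = 3, byrow = TRUE,
  dimnames = list(tract = c("A", "B", "C"), race = c("group1", "group2"))
)

# Pr(Geolocation | Race): each tract's share of a race group's total,
# obtained by dividing every column by its column sum.
pr_geo_given_race <- sweep(counts, 2, colSums(counts), "/")

# Each column sums to 1 because it is a distribution over tracts.
stopifnot(all(abs(colSums(pr_geo_given_race) - 1) < 1e-12))
```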

Value

Output will be an object of class data.frame. It will consist of the original user-input data with additional columns of Census data.

Examples


## Not run:
census_helper_new(voter.file = voters, states = "nj", geo = "block")
census_helper_new(voter.file = voters, states = "all", geo = "tract")
census_helper_new(voter.file = voters, states = "all", geo = "place",
                  year = "2020")
## End(Not run)


wru documentation built on May 29, 2024, 9:46 a.m.