get_rnhgis_tst: Retrieve NHGIS time series tables with caching and lookup...

View source: R/get_rnhgis.R

get_rnhgis_tst    R Documentation

Retrieve NHGIS time series tables with caching and lookup table generation.

Description

This function retrieves time series tables from NHGIS (National Historical Geographic Information System) using the 'ipumsr' package, with caching capabilities to avoid redundant downloads. It also generates a lookup table containing metadata about the dataset.

Usage

get_rnhgis_tst(
  ...,
  lkp = FALSE,
  refresh = FALSE,
  save_dir = here::here("data-raw/rnhgis_tst/")
)

Arguments

...

Arguments to be passed to 'ipumsr::tst_spec()', specifying the time series tables to retrieve.

lkp

Logical. If 'TRUE', returns the lookup table; if 'FALSE' (default), returns the dataset.

refresh

Logical. If 'TRUE', forces a refresh of the cached data by downloading it again; if 'FALSE' (default), uses the cached data when available.

save_dir

Directory where downloaded data and lookup tables are saved. Defaults to "data-raw/rnhgis_tst/" under the project root, as resolved by 'here::here()'.

Details

This function first checks whether a cached parquet file exists in 'save_dir' for the specified time series tables. If it does and 'refresh' is 'FALSE', the cached dataset is returned when 'lkp' is 'FALSE', and the cached lookup table is returned when 'lkp' is 'TRUE'. If no cached file exists, or 'refresh' is 'TRUE', the function downloads the data from NHGIS using the 'ipumsr' package. The download requires an IPUMS API key stored in the environment variable "IPUMS_API_KEY". The downloaded data and the generated lookup table are then saved as parquet files in 'save_dir'.
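
The cache-or-download logic described above can be sketched as follows. This is only an illustration, not the package's actual code: the helper 'cached_fetch()', the stand-in 'fake_download()', and the column names are hypothetical, and parquet I/O through the 'arrow' package is an assumption (the documentation does not say which parquet library is used).

```r
# Hypothetical helper illustrating the cache-or-download pattern.
# Assumes the 'arrow' package for parquet I/O; the real function may differ.
cached_fetch <- function(cache_file, fetch_fn, refresh = FALSE) {
  if (file.exists(cache_file) && !refresh) {
    # Cache hit: read the stored parquet file instead of downloading again
    return(arrow::read_parquet(cache_file))
  }
  # Cache miss or forced refresh: fetch, store, and return the data
  dat <- fetch_fn()
  dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
  arrow::write_parquet(dat, cache_file)
  dat
}

# Stand-in for the NHGIS download step; counts how often it is called.
# The column names are made up for illustration.
n_calls <- 0
fake_download <- function() {
  n_calls <<- n_calls + 1
  data.frame(GISJOIN = c("G01", "G02"), AV0AA1990 = c(100, 200))
}

# Fresh temporary directory so the first call is always a cache miss
cache_dir  <- tempfile("rnhgis_tst_demo")
cache_file <- file.path(cache_dir, "AV0_place.parquet")

first  <- cached_fetch(cache_file, fake_download)                  # downloads
second <- cached_fetch(cache_file, fake_download)                  # reads cache
third  <- cached_fetch(cache_file, fake_download, refresh = TRUE)  # downloads again
```

In this sketch the download function runs only on the first call and on the forced refresh; the second call is served entirely from the parquet file.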

The lookup table contains information about each variable in the dataset, including its name, type, and attributes.
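
A lookup table of this kind could be assembled roughly as shown below. This is a minimal sketch under stated assumptions: 'make_lkp()' is a hypothetical helper, the example data are invented, and reading a "label" attribute is an assumption about how variable descriptions are stored. The columns of the real lookup table may differ.

```r
# Hypothetical sketch: one row per variable with its name, type, and label.
make_lkp <- function(dat) {
  data.frame(
    name = names(dat),
    # First class of each column, e.g. "character" or "numeric"
    type = unname(vapply(dat, function(x) class(x)[1], character(1))),
    # Variable label stored as an attribute, or NA when none is set
    label = unname(vapply(dat, function(x) {
      lbl <- attr(x, "label")
      if (is.null(lbl)) NA_character_ else as.character(lbl)
    }, character(1))),
    stringsAsFactors = FALSE
  )
}

# Invented example data with a label attached to one column
dat <- data.frame(GISJOIN = c("G01", "G02"), pop = c(100, 200))
attr(dat$pop, "label") <- "Total population"

lkp <- make_lkp(dat)
print(lkp)
```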

Value

The requested NHGIS time series data as a data frame (if 'lkp = FALSE'), or the lookup table describing the variables in that data (if 'lkp = TRUE').

Examples

## Not run: 
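## Example: get a place-level population time series from NHGIS
library(data.table)  # for setDT() and the DT[...] syntax
library(magrittr)    # for the %>% pipe

tst <- ipumsr::get_metadata_nhgis(type = "time_series_tables") %>%
  setDT()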

## Find which time series tables have a place `geog_level`
## and get the total population variable
tst[sapply(geog_levels, function(x) any(grepl("place", x)))] %>%
  .[grepl("Total Population", description)]

ipumsr::tst_spec(name = "AV0", geog_levels = "place")

place_pop_tst <- get_rnhgis_tst(name = "AV0", geog_levels = "place")

place_pop_tst_lkp <- get_rnhgis_tst(name = "AV0", geog_levels = "place", lkp = TRUE)

## End(Not run)


ChandlerLutz/CLmisc documentation built on Feb. 28, 2025, 10:05 p.m.