View source: R/api_define_extract.R
| define_extract_agg | R Documentation | 
Define the parameters of an IPUMS aggregate data extract request to be submitted via the IPUMS API.
The IPUMS API currently supports the following aggregate data collections: IPUMS NHGIS and IPUMS IHGIS.
Note that not all extract request parameters and options apply to all collections. For a summary of supported features by collection, see the details below and the IPUMS API documentation.
Use get_metadata_catalog() and get_metadata() to browse and identify
data sources for use in an extract definition.
Learn more about the IPUMS API in vignette("ipums-api") and
aggregate data extract definitions in vignette("ipums-api-agg").
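As a rough sketch of the browsing step mentioned above, the snippet below lists the NHGIS datasets and then looks up a single dataset. It assumes an API key is stored in the IPUMS_API_KEY environment variable and that the `metadata_type` and `dataset` argument names match your installed version of ipumsr. The requests are skipped when no key is set.

```r
library(ipumsr)

# Metadata requests go to the IPUMS API, so only run them when a key is set.
# (Assumption: the key lives in the IPUMS_API_KEY environment variable.)
has_key <- nzchar(Sys.getenv("IPUMS_API_KEY"))

if (has_key) {
  # Summary of all NHGIS datasets
  nhgis_datasets <- get_metadata_catalog("nhgis", metadata_type = "datasets")
  print(head(nhgis_datasets))

  # Detailed metadata for a single dataset
  stf3_meta <- get_metadata("nhgis", dataset = "1990_STF3")
  print(names(stf3_meta))
}
```

The names found this way (datasets, data tables, geographic levels) are the values passed to ds_spec() and tst_spec() in an extract definition.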
define_extract_agg(
  collection,
  description = "",
  datasets = NULL,
  time_series_tables = NULL,
  shapefiles = NULL,
  geographic_extents = NULL,
  breakdown_and_data_type_layout = NULL,
  tst_layout = NULL,
  data_format = NULL
)
| collection | Code for the IPUMS collection represented by this extract request. Currently, "nhgis" and "ihgis" are supported. | 
| description | Description of the extract. | 
| datasets | List of dataset specifications for any datasets to include in the extract request. Use ds_spec() to create a dataset specification. See examples. | 
| time_series_tables | For NHGIS extracts, list of time series table specifications for any time series tables to include in the extract request. Use tst_spec() to create a time series table specification. See examples. | 
| shapefiles | For NHGIS extracts, names of any shapefiles to include in the extract request. | 
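| geographic_extents | For NHGIS extracts, vector of geographic extents to use for all of the datasets and time series tables in the extract definition. Use get_metadata() to identify the extents available for a given data source. | 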
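| breakdown_and_data_type_layout | For NHGIS extracts, the desired layout of any datasets that have multiple data types or breakdown values. Required if any datasets in the extract definition have multiple data types or multiple breakdown values. | 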
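| tst_layout | For NHGIS extracts, the desired layout of all time series tables in the extract definition (for example, "time_by_row_layout"). Required when an extract definition includes any time series tables. | 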
| data_format | For NHGIS extracts, the desired format of the extract data file. Required when an extract definition includes any datasets or time series tables. | 
An NHGIS extract definition (collection = "nhgis") must include at
least one dataset, time series table, or shapefile specification.
Create a dataset specification with ds_spec(). Each dataset
must be associated with a selection of data_tables and geog_levels. Some
datasets also support the selection of years and breakdown_values.
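For a dataset that supports year selection, a specification might look like the sketch below. The dataset name, table name, and year values are illustrative assumptions and are not validated until the request is submitted; check get_metadata() for the values a given dataset actually supports.

```r
library(ipumsr)

# Illustrative values only: "1988_1997_CBPa", "NT001", and the years are
# assumptions, not values confirmed by this page.
cbp_spec <- ds_spec(
  "1988_1997_CBPa",
  data_tables = "NT001",
  geog_levels = "county",
  years = c("1988", "1989")
)

cbp_extract <- define_extract_agg(
  "nhgis",
  description = "Dataset specification with a year selection",
  datasets = cbp_spec
)

# The year selection is stored with the dataset specification
cbp_extract$datasets[["1988_1997_CBPa"]]$years
```

breakdown_values is supplied to ds_spec() in the same way for datasets that support breakdowns.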
Create an NHGIS time series table specification with tst_spec(). Each time
series table must be associated with a selection of geog_levels and
may optionally be associated with a selection of years.
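A time series table specification with an optional year selection could be written as in the sketch below. The table name "CL8" comes from the examples on this page; the year values are assumptions chosen for illustration and should be checked against get_metadata().

```r
library(ipumsr)

# The years are illustrative assumptions; omit `years` to request all
# years available for the table.
cl8_spec <- tst_spec(
  "CL8",
  geog_levels = "county",
  years = c("1990", "2000")
)

tst_extract <- define_extract_agg(
  "nhgis",
  description = "Time series table with a year selection",
  time_series_tables = cl8_spec,
  tst_layout = "time_by_row_layout"
)

# Specifications are stored by name in the extract definition
names(tst_extract$time_series_tables)
```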
An IHGIS extract definition (collection = "ihgis") must include a dataset
specification. IHGIS does not support time series table or shapefile
specifications.
Create a dataset specification with ds_spec(). Each dataset must be
associated with a selection of data_tables and tabulation_geographies.
See examples or vignette("ipums-api-agg") for more details about
specifying datasets and time series tables in an aggregate data extract
definition.
An object of class agg_extract containing
the extract definition.
get_metadata_catalog() and get_metadata() to find data to include in
an extract definition.
submit_extract() to submit an extract request for processing.
save_extract_as_json() and define_extract_from_json() to share an
extract definition.
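To illustrate sharing a definition, the sketch below writes an extract definition to a JSON file and reads it back. The temporary file path is an arbitrary choice for the example; no API key is needed because nothing is submitted.

```r
library(ipumsr)

shared_extract <- define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)

# Write the definition to a temporary JSON file (the path is arbitrary)
json_path <- tempfile(fileext = ".json")
save_extract_as_json(shared_extract, file = json_path)

# Recreate the definition from the file, as a collaborator would
restored_extract <- define_extract_from_json(json_path)
names(restored_extract$datasets)
```

The restored definition can then be passed to submit_extract() like any other extract definition.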
# Extract definition for tables from an NHGIS dataset
# Use `ds_spec()` to create an NHGIS dataset specification
nhgis_extract <- define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)
nhgis_extract
# Extract definition for tables from an IHGIS dataset
define_extract_agg(
  "ihgis",
  description = "Example IHGIS extract",
  datasets = ds_spec(
    "KZ2009pop",
    data_tables = c("KZ2009pop.AAA", "KZ2009pop.AAB"),
    tabulation_geographies = c("KZ2009pop.g0", "KZ2009pop.g1")
  )
)
# Use `tst_spec()` to create an NHGIS time series table specification
define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  time_series_tables = tst_spec("CL8", geog_levels = "county"),
  tst_layout = "time_by_row_layout"
)
# To request multiple datasets, provide a list of `ds_spec` objects
define_extract_agg(
  "nhgis",
  description = "Extract definition with multiple datasets",
  datasets = list(
    ds_spec("2014_2018_ACS5a", "B01001", c("state", "county")),
    ds_spec("2015_2019_ACS5a", "B01001", c("state", "county"))
  )
)
# If you need to specify the same table or geographic level for
# many datasets, you may want to make a set of datasets before defining
# your extract request:
dataset_names <- c("2014_2018_ACS5a", "2015_2019_ACS5a")
dataset_spec <- purrr::map(
  dataset_names,
  ~ ds_spec(
    .x,
    data_tables = "B01001",
    geog_levels = c("state", "county")
  )
)
define_extract_agg(
  "nhgis",
  description = "Extract definition with multiple datasets",
  datasets = dataset_spec
)
# You can request datasets, time series tables, and shapefiles in the same
# definition:
define_extract_agg(
  "nhgis",
  description = "Extract with datasets and time series tables",
  datasets = ds_spec("1990_STF1", c("NP1", "NP2"), "county"),
  time_series_tables = tst_spec("CL6", "state"),
  shapefiles = "us_county_1990_tl2008"
)
# Geographic extents are applied to all datasets/time series tables in the
# definition
define_extract_agg(
  "nhgis",
  description = "Extent selection",
  datasets = list(
    ds_spec("2018_2022_ACS5a", "B01001", "blck_grp"),
    ds_spec("2017_2021_ACS5a", "B01001", "blck_grp")
  ),
  geographic_extents = c("010", "050")
)
# Extract specifications can be indexed by name
names(nhgis_extract$datasets)
nhgis_extract$datasets[["1990_STF3"]]
## Not run: 
# Use the extract definition to submit an extract request to the API
submit_extract(nhgis_extract)
## End(Not run)