View source: R/api_define_extract.R
define_extract_agg | R Documentation |
Define the parameters of an IPUMS aggregate data extract request to be submitted via the IPUMS API.
The IPUMS API currently supports the following aggregate data collections:
Note that not all extract request parameters and options apply to all collections. For a summary of supported features by collection, see the details below and the IPUMS API documentation.
Use get_metadata_catalog() and get_metadata() to browse and identify data sources for use in an extract definition.
Learn more about the IPUMS API in vignette("ipums-api") and aggregate data extract definitions in vignette("ipums-api-agg").
define_extract_agg(
collection,
description = "",
datasets = NULL,
time_series_tables = NULL,
shapefiles = NULL,
geographic_extents = NULL,
breakdown_and_data_type_layout = NULL,
tst_layout = NULL,
data_format = NULL
)
collection |
Code for the IPUMS collection represented by this extract request. Currently, |
description |
Description of the extract. |
datasets |
List of dataset specifications for any datasets to include in the extract request. Use |
time_series_tables |
For NHGIS extracts, list of time series table specifications for any time series tables to include in the extract request. Use |
shapefiles |
For NHGIS extracts, names of any shapefiles to include in the extract request. |
geographic_extents |
For NHGIS extracts, vector of geographic extents to use for all of the Use |
breakdown_and_data_type_layout |
For NHGIS extracts, the desired
layout of any
Required if any |
tst_layout |
For NHGIS extracts, the desired layout of all
Required when an extract definition includes any |
data_format |
For NHGIS extracts, the desired format of the extract data file.
Note that by default, Required when an extract definition includes any |
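As a rough illustration of how the layout and format arguments fit together, the sketch below combines a dataset and a time series table in one NHGIS definition. The dataset, table, and `tst_layout` values are reused from the examples on this page; the `"csv_header"` value for `data_format` is an assumption based on the file formats described in the NHGIS API documentation, so confirm it there before relying on it.

```r
library(ipumsr)

# Dataset and time series table names are reused from the examples on this page.
layout_extract <- define_extract_agg(
  "nhgis",
  description = "Layout and format sketch",
  datasets = ds_spec("1990_STF1", c("NP1", "NP2"), "county"),
  time_series_tables = tst_spec("CL6", "state"),
  tst_layout = "time_by_row_layout",
  # Assumed value: "csv_header" does not appear elsewhere on this page
  data_format = "csv_header"
)

# The chosen options are stored on the extract definition
layout_extract$tst_layout
layout_extract$data_format
```

Defining an extract does not contact the API, so this sketch only confirms that the definition is accepted locally.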
An NHGIS extract definition (collection = "nhgis") must include at least one dataset, time series table, or shapefile specification.
Create a dataset specification with ds_spec(). Each dataset must be associated with a selection of data_tables and geog_levels. Some datasets also support the selection of years and breakdown_values.
Create an NHGIS time series table specification with tst_spec(). Each time series table must be associated with a selection of geog_levels and may optionally be associated with a selection of years.
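The optional years selection for a time series table might look like the following sketch. The table name "CL8" and the "county" level come from the examples on this page; the specific years are illustrative assumptions and may not match the years the table actually covers, so check the table's metadata first.

```r
library(ipumsr)

# "CL8" and "county" are reused from this page's examples.
# The years below are assumed for illustration only.
tst_extract <- define_extract_agg(
  "nhgis",
  description = "Time series table limited to selected years",
  time_series_tables = tst_spec(
    "CL8",
    geog_levels = "county",
    years = c("1990", "2000")
  ),
  tst_layout = "time_by_row_layout"
)

# Specifications are stored by name on the extract definition
names(tst_extract$time_series_tables)
```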
An IHGIS extract definition (collection = "ihgis") must include a dataset specification. IHGIS does not support time series table or shapefile specifications.
Create a dataset specification with ds_spec(). Each dataset must be associated with a selection of data_tables and tabulation_geographies.
See examples or vignette("ipums-api-agg") for more details about specifying datasets and time series tables in an aggregate data extract definition.
An object of class agg_extract containing the extract definition.
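A quick way to check what was returned is to inspect the object directly. The sketch below rebuilds the first NHGIS definition from the examples and looks at a few of its elements; the `collection` and `description` element names are assumed to mirror the argument names.

```r
library(ipumsr)

nhgis_extract <- define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)

# Confirm the class of the returned object
inherits(nhgis_extract, "agg_extract")

# Element names assumed to match the arguments passed above
nhgis_extract$collection
nhgis_extract$description
names(nhgis_extract$datasets)
```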
get_metadata_catalog() and get_metadata() to find data to include in an extract definition.
submit_extract() to submit an extract request for processing.
save_extract_as_json() and define_extract_from_json() to share an extract definition.
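Sharing a definition through JSON could be sketched as a simple round trip: write the definition to a file, then rebuild it from that file. The file name below is hypothetical, and the file is written to a temporary directory so nothing persists after the session.

```r
library(ipumsr)

nhgis_extract <- define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)

# Hypothetical file name inside the session's temporary directory
json_path <- file.path(tempdir(), "nhgis_extract.json")

# Write the definition to disk, then rebuild it from the file
save_extract_as_json(nhgis_extract, file = json_path, overwrite = TRUE)
restored_extract <- define_extract_from_json(json_path)

names(restored_extract$datasets)
```

A collaborator who receives the JSON file can rebuild the same definition and submit it under their own API key.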
# Extract definition for tables from an NHGIS dataset
# Use `ds_spec()` to create an NHGIS dataset specification
nhgis_extract <- define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  datasets = ds_spec(
    "1990_STF3",
    data_tables = "NP57",
    geog_levels = c("county", "tract")
  )
)
nhgis_extract
# Extract definition for tables from an IHGIS dataset
define_extract_agg(
  "ihgis",
  description = "Example IHGIS extract",
  datasets = ds_spec(
    "KZ2009pop",
    data_tables = c("KZ2009pop.AAA", "KZ2009pop.AAB"),
    tabulation_geographies = c("KZ2009pop.g0", "KZ2009pop.g1")
  )
)
# Use `tst_spec()` to create an NHGIS time series table specification
define_extract_agg(
  "nhgis",
  description = "Example NHGIS extract",
  time_series_tables = tst_spec("CL8", geog_levels = "county"),
  tst_layout = "time_by_row_layout"
)
# To request multiple datasets, provide a list of `ds_spec` objects
define_extract_agg(
  "nhgis",
  description = "Extract definition with multiple datasets",
  datasets = list(
    ds_spec("2014_2018_ACS5a", "B01001", c("state", "county")),
    ds_spec("2015_2019_ACS5a", "B01001", c("state", "county"))
  )
)
# If you need to specify the same table or geographic level for
# many datasets, you may want to make a set of datasets before defining
# your extract request:
dataset_names <- c("2014_2018_ACS5a", "2015_2019_ACS5a")
dataset_spec <- purrr::map(
  dataset_names,
  ~ ds_spec(
    .x,
    data_tables = "B01001",
    geog_levels = c("state", "county")
  )
)
define_extract_agg(
  "nhgis",
  description = "Extract definition with multiple datasets",
  datasets = dataset_spec
)
# You can request datasets, time series tables, and shapefiles in the same
# definition:
define_extract_agg(
  "nhgis",
  description = "Extract with datasets and time series tables",
  datasets = ds_spec("1990_STF1", c("NP1", "NP2"), "county"),
  time_series_tables = tst_spec("CL6", "state"),
  shapefiles = "us_county_1990_tl2008"
)
# Geographic extents are applied to all datasets/time series tables in the
# definition
define_extract_agg(
  "nhgis",
  description = "Extent selection",
  datasets = list(
    ds_spec("2018_2022_ACS5a", "B01001", "blck_grp"),
    ds_spec("2017_2021_ACS5a", "B01001", "blck_grp")
  ),
  geographic_extents = c("010", "050")
)
# Extract specifications can be indexed by name
names(nhgis_extract$datasets)
nhgis_extract$datasets[["1990_STF3"]]
## Not run:
# Use the extract definition to submit an extract request to the API
submit_extract(nhgis_extract)
## End(Not run)