read_ihgis_codebook | R Documentation |
Read the variable metadata contained in an IHGIS extract into an ipums_ddi object.
Because IHGIS variable metadata do not adhere to all the standards of microdata DDI files, some of the ipums_ddi fields will not be populated.
This function is marked as experimental while we determine whether there may be a more robust way to standardize codebook reading across IPUMS aggregate data collections.
read_ihgis_codebook(cb_file, tbls_file = NULL, raw = FALSE)
cb_file |
Path to a .zip archive containing an IHGIS extract, an IHGIS
data dictionary ( |
tbls_file |
If |
raw |
If If |
IHGIS extracts store variable and geographic metadata in multiple files:
_datadict.csv contains the data dictionary with metadata about the variables included across all files in the extract.
_tables.csv contains metadata about all IHGIS tables included in the extract.
_geog.csv contains metadata about the tabulation geographies included for any tables in the extract.
_codebook.txt contains table and variable metadata in human readable form and contains citation information for IHGIS data.
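As a rough way to check which of these metadata files an extract actually contains, the file names can be matched against the suffixes listed above. This is a sketch in base R, not part of ipumsr: the helper name find_ihgis_metadata() and the example file names are hypothetical.

```r
# Hypothetical helper (not part of ipumsr): report which IHGIS metadata
# files appear in a vector of file names, based on the suffixes listed above.
find_ihgis_metadata <- function(file_names) {
  suffixes <- c(
    datadict = "_datadict\\.csv$",
    tables   = "_tables\\.csv$",
    geog     = "_geog\\.csv$",
    codebook = "_codebook\\.txt$"
  )
  # TRUE for each metadata type with at least one matching file name
  vapply(suffixes, function(p) any(grepl(p, file_names)), logical(1))
}

# Illustrative file names only; a real extract may be laid out differently
example_files <- c(
  "ihgis0014_datadict.csv",
  "ihgis0014_tables.csv",
  "ihgis0014_codebook.txt"
)
find_ihgis_metadata(example_files)
```

For a .zip extract, unzip(path, list = TRUE)$Name returns the file names without extracting the archive. For an unzipped directory, list.files(path, recursive = TRUE) serves the same purpose.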
By default, read_ihgis_codebook() uses information from all these files and assumes they exist in the provided extract (.zip) file or directory.
If you have unzipped your IHGIS extract and moved the _tables.csv file, you will need to provide the path to that file in the tbls_file argument.
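The moved-file case might look like the sketch below. It assumes that the ipumsr example extract unzips to files ending in _datadict.csv and _tables.csv (one of each) and that the data dictionary path can be passed as cb_file. The directory names are made up for illustration.

```r
library(ipumsr)

# Unzip the example extract to a temporary directory
extract_dir <- file.path(tempdir(), "ihgis_extract")
unzip(ipums_example("ihgis0014.zip"), exdir = extract_dir)

# Locate the data dictionary and tables files by their suffixes
# (assumes the extract contains exactly one of each)
datadict <- list.files(
  extract_dir,
  pattern = "_datadict\\.csv$",
  recursive = TRUE,
  full.names = TRUE
)[1]
tables <- list.files(
  extract_dir,
  pattern = "_tables\\.csv$",
  recursive = TRUE,
  full.names = TRUE
)[1]

# Simulate moving the _tables.csv file to a different directory
moved_dir <- file.path(tempdir(), "ihgis_moved")
dir.create(moved_dir, showWarnings = FALSE)
moved_tables <- file.path(moved_dir, basename(tables))
file.rename(tables, moved_tables)

# Point tbls_file at the new location of the tables file
ihgis_cb <- read_ihgis_codebook(datadict, tbls_file = moved_tables)
```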
Certain variable metadata can still be loaded without the _geog.csv or _codebook.txt files. However, if raw = TRUE, the _codebook.txt file must be present in the .zip archive or provided to cb_file.
If you no longer have access to these files, consider resubmitting the extract request that produced the data.
Note that IHGIS codebooks contain metadata for all the datasets contained in a given extract. Individual data files from the extract may not contain all of the variables shown in the output of read_ihgis_codebook().
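One way to see this difference is to compare the variable names in the codebook with the columns of a single data file. The sketch below reuses the example extract and the file_select pattern from the examples on this page, and assumes the var_info table has a var_name column, as ipums_ddi objects generally do.

```r
library(ipumsr)

ihgis_file <- ipums_example("ihgis0014.zip")
ihgis_cb <- read_ihgis_codebook(ihgis_file)

# Load a single data file from the extract
ihgis_data <- read_ipums_agg(
  ihgis_file,
  file_select = matches("AAA_g0"),
  verbose = FALSE
)

# Variable names described in the codebook (all files in the extract)
cb_vars <- ihgis_cb$var_info$var_name

# Variables described in the codebook but absent from this data file
setdiff(cb_vars, colnames(ihgis_data))

# Variables present in both the codebook and this data file
intersect(cb_vars, colnames(ihgis_data))
```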
If raw = FALSE, an ipums_ddi object with metadata about the variables contained in the data for the extract associated with the given cb_file.
If raw = TRUE, a character vector with one element for each line of the given cb_file.
ihgis_file <- ipums_example("ihgis0014.zip")
ihgis_cb <- read_ihgis_codebook(ihgis_file)
# Variable labels and descriptions
ihgis_cb$var_info
# Citation information
ihgis_cb$conditions
# If variable metadata have been lost from a data source, reattach from
# the corresponding `ipums_ddi` object:
ihgis_data <- read_ipums_agg(
  ihgis_file,
  file_select = matches("AAA_g0"),
  verbose = FALSE
)
ihgis_data <- zap_ipums_attributes(ihgis_data)
ipums_var_label(ihgis_data$AAA001)
ihgis_data <- set_ipums_var_attributes(ihgis_data, ihgis_cb)
ipums_var_label(ihgis_data$AAA001)
# Load in raw format
ihgis_cb_raw <- read_ihgis_codebook(ihgis_file, raw = TRUE)
# Use `cat()` to display in the R console in human readable format
cat(ihgis_cb_raw[1:21], sep = "\n")