read_nhgis | R Documentation |
Read a csv or fixed-width (.dat) file downloaded from the NHGIS extract system.
To read spatial data from an NHGIS extract, use read_ipums_sf().
read_nhgis(
  data_file,
  file_select = NULL,
  vars = NULL,
  col_types = NULL,
  n_max = Inf,
  guess_max = min(n_max, 1000),
  do_file = NULL,
  var_attrs = c("val_labels", "var_label", "var_desc"),
  remove_extra_header = TRUE,
  verbose = TRUE,
  data_layer = deprecated()
)
data_file |
Path to a .zip archive containing an NHGIS extract or a single file from an NHGIS extract. |
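file_select |
If data_file contains multiple files, an expression identifying the single file to load. Accepts a tidyselect selection or an index position. |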
vars |
Names of variables to include in the output. Accepts a vector of names or a tidyselect selection. If NULL, includes all variables in the file. |
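col_types |
One of NULL, a readr cols() specification, or a string. See readr::read_csv() or readr::read_fwf() for more details. |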
n_max |
Maximum number of lines to read. |
guess_max |
For .csv files, maximum number of lines to use for guessing column types. Will never use more than the number of lines read. |
do_file |
For fixed-width files, path to the .do file associated with the provided data_file. By default, looks in the same path as data_file for a .do file with the same name. |
var_attrs |
Variable attributes to add from the codebook (.txt) file included in the extract. Defaults to all available attributes. See set_ipums_var_attributes() for more details. |
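remove_extra_header |
If TRUE, remove the additional descriptive header row included in some NHGIS .csv files. This header row is not usually needed as it contains similar information to that included in the "label" attribute of each data column (if var_attrs includes "var_label"). |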
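verbose |
Logical controlling whether to display output when loading data. If TRUE, displays IPUMS conditions, a progress bar, and column types. Otherwise, all are suppressed. Will be overridden by the readr.show_progress and readr.show_col_types options, if they are set. |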
data_layer |
Deprecated. |
The .do file that is included when downloading an NHGIS fixed-width extract contains the necessary metadata (e.g. column positions and implicit decimals) to correctly parse the data file. read_nhgis() uses this information to parse and recode the fixed-width data appropriately.
If you no longer have access to the .do file, consider resubmitting the extract that produced the data. You can also change the desired data format to produce a .csv file, which does not require additional metadata files to be loaded.
For more about resubmitting an existing extract via the IPUMS API, see vignette("ipums-api", package = "ipumsr").
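As a rough illustration of how the .do file pairs with a fixed-width data file, the sketch below unzips the bundled example extract and passes the .do path explicitly through do_file. It assumes each .dat file in the archive sits next to a .do file with the same base name; the exdir, dat_path, and do_path names are only illustrative.

```r
library(ipumsr)

# Unzip the example fixed-width extract to a temporary directory
fw_zip <- ipums_example("nhgis0730_fixed.zip")
exdir <- tempfile("nhgis_fw_")
utils::unzip(fw_zip, exdir = exdir)

# Locate the .dat files, wherever they sit inside the archive
dat_files <- list.files(
  exdir,
  pattern = "\\.dat$",
  recursive = TRUE,
  full.names = TRUE
)

# Assumption: the matching .do file shares the .dat file's base name
dat_path <- dat_files[1]
do_path <- sub("\\.dat$", ".do", dat_path)

# Only attempt the read if the assumed .do file is actually present
if (file.exists(do_path)) {
  fw_data <- read_nhgis(dat_path, do_file = do_path, verbose = FALSE)
}
```

When the .do file is left in place beside the data file, supplying do_file should not be necessary; the explicit argument is mainly useful if the two files have been moved apart.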
A tibble containing the data found in data_file.
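To see what the returned tibble carries beyond the raw values, the following sketch loads the example .csv extract and inspects the variable metadata attached from the codebook. It assumes the default var_attrs are in effect and uses the D6Z002 column that appears in the examples for this function; nhgis_data is an illustrative name.

```r
library(ipumsr)

# Load the example .csv extract with default variable attributes
nhgis_data <- read_nhgis(ipums_example("nhgis0972_csv.zip"), verbose = FALSE)

# Summarize the variable-level metadata attached to the data
ipums_var_info(nhgis_data)

# Look at the label attribute stored on a single column
attr(nhgis_data$D6Z002, "label")
```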
read_ipums_sf() to read spatial data from an IPUMS extract.
read_nhgis_codebook() to read metadata about an IPUMS NHGIS extract.
ipums_list_files() to list files in an IPUMS extract.
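The related functions can be combined with read_nhgis() when exploring an unfamiliar extract. The sketch below is one possible workflow, assuming an ipumsr version that still provides ipums_list_files(): list the contents of a multi-file archive before choosing a file_select value, then read the codebook of a single-file extract on its own. The files and cb names are illustrative.

```r
library(ipumsr)

# List the files inside a multi-file extract before choosing one to load
fw_file <- ipums_example("nhgis0730_fixed.zip")
files <- ipums_list_files(fw_file)
files

# Read only the codebook metadata for a single-file extract
csv_file <- ipums_example("nhgis0972_csv.zip")
cb <- read_nhgis_codebook(csv_file)

# Variable-level metadata recorded in the codebook
ipums_var_info(cb)
```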
# Example files
csv_file <- ipums_example("nhgis0972_csv.zip")
fw_file <- ipums_example("nhgis0730_fixed.zip")
# Provide the .zip archive directly to load the data inside:
read_nhgis(csv_file)
# For extracts that contain multiple files, use `file_select` to specify
# a single file to load. This accepts a tidyselect expression:
read_nhgis(fw_file, file_select = matches("ds239"), verbose = FALSE)
# Or an index position:
read_nhgis(fw_file, file_select = 2, verbose = FALSE)
# For CSV files, column types are inferred from the data. You can
# manually specify column types with `col_types`. This may be useful for
# geographic codes, which should typically be interpreted as character values:
read_nhgis(csv_file, col_types = list(MSA_CMSAA = "c"), verbose = FALSE)
# Fixed-width files are parsed with the correct column positions
# and column types automatically:
read_nhgis(fw_file, file_select = contains("ts"), verbose = FALSE)
# You can also read in a subset of the data file:
read_nhgis(
  csv_file,
  n_max = 15,
  vars = c(GISJOIN, YEAR, D6Z002),
  verbose = FALSE
)