remap_data_names: Use a data map to select, rename, adjust and align columns

View source: R/remap_data_names.R

remap_data_names    R Documentation

Use a data map to select, rename, adjust and align columns

Description

Useful for preparing data from several different data sources into a common structure that can be read collectively via arrow::open_dataset().
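As a rough illustration of what "a common structure" means here, the base R sketch below selects and renames columns from two toy sources using a small lookup table. Everything in it is hypothetical: toy_map, remap_sketch() and the column names are invented for illustration, the map layout (one row per source, one column per target name) is an assumption, and none of it is the package's implementation or the real envImport::data_map.

```r
# Hypothetical map: one row per data source; each remaining column is a
# target name whose value is that source's original column name (NA = absent).
# This layout is an assumption for illustration, not envImport::data_map itself.
toy_map <- data.frame(
  data_name = c("src_a", "src_b"),
  lat = c("latitude", "LAT"),
  long = c("longitude", "LON"),
  taxa = c("species", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: select and rename the mapped columns of one source
remap_sketch <- function(this_name, df, map) {
  row <- map[map$data_name == this_name, setdiff(names(map), "data_name")]
  targets <- names(row)
  out <- lapply(targets, function(target) {
    src_col <- row[[target]]
    # Targets that this source lacks are filled with NA so all sources align
    if (is.na(src_col) || !src_col %in% names(df)) {
      rep(NA, nrow(df))
    } else {
      df[[src_col]]
    }
  })
  names(out) <- targets
  cbind(data_name = this_name, as.data.frame(out, stringsAsFactors = FALSE))
}

# Two toy sources with different column names (and one unmapped column)
src_a <- data.frame(
  latitude = -35.1, longitude = 138.6,
  species = "Acanthiza apicalis", extra = 1,
  stringsAsFactors = FALSE
)
src_b <- data.frame(LAT = -34.9, LON = 138.5)

# After remapping, both sources share the same columns and can be stacked
common <- rbind(
  remap_sketch("src_a", src_a, toy_map),
  remap_sketch("src_b", src_b, toy_map)
)
```

In this sketch the unmapped column extra is dropped and the missing taxa column in src_b is filled with NA, so both sources end up with identical column names.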

Usage

remap_data_names(
  this_name,
  df_to_remap,
  data_map = NULL,
  out_file = NULL,
  exclude_cols = c("order", "epsg", "desc", "data_name_use", "url"),
  add_month = !is.null(data_map),
  add_year = !is.null(data_map),
  add_occ = !is.null(data_map),
  occ_cols = c("occ_derivation", "quantity"),
  absences = c("0", "none detected", "none observed", "None detected", "ABSENT"),
  previous = c("delete", "move"),
  compare_previous = TRUE,
  compare_cols = c("data_name", "survey"),
  ...
)

Arguments

this_name

Character. Name of the data source.

df_to_remap

Dataframe containing the columns to select and (potentially) rename.

data_map

Dataframe or NULL. Mapping of fields to retrieve. See envImport::data_map for an example.

out_file

Character. Name of file to save. If NULL, this will be here::here("ds", this_name, "this_name.parquet")

add_month, add_year

Logical. Add a year and/or month column to the returned data frame (requires a date field to be specified by data_map).

add_occ

Logical. Make an occ (occurrence) column where 1 = detected and 0 = not detected? Because original data sets record numbers and absences in many different ways, this should not be considered 100% reliable.

absences

Character. If add_occ is TRUE, which values are considered absences?

previous

Character. What to do with any previous out_file. Default is 'delete'. The alternative, 'move', renames the previous file, keeping it in the same location, to gsub("\\.parquet", paste0("moved__", format(now(), "%Y%m%d_%H%M%S"), ".parquet"), out_file)

compare_previous

Logical. If TRUE, a comparison of records per compare_cols is made between the new and previous out_file. Ignored unless previous == "move".

compare_cols

Character. If compare_previous is TRUE, which columns to compare. Default is c("data_name", "survey").

...

Not used

exclude_cols

Character. Column names in data_map to exclude from the combined data
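The interaction between add_occ and absences described above can be pictured with a small base R sketch. The rule shown (an exact match against absences gives 0, anything else gives 1) is an assumption about the general idea, not the package's code, and the obs data frame is hypothetical.

```r
# Hypothetical input: 'quantity' is recorded inconsistently across sources
obs <- data.frame(
  taxa = c("a", "b", "c", "d"),
  quantity = c("3", "0", "none detected", "present"),
  stringsAsFactors = FALSE
)

# Default absence values, copied from the Usage section
absences <- c("0", "none detected", "none observed", "None detected", "ABSENT")

# Assumed rule: a value matching 'absences' gives occ = 0, anything else occ = 1
obs$occ <- ifelse(obs$quantity %in% absences, 0, 1)
```

Because %in% matches exactly, a spelling such as "Absent" would be scored as a detection in this sketch, which shows why the resulting column should be checked against each source's conventions.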

Details

Includes code from the Stack Exchange network post by Dan.
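For the previous = 'move' option, the renamed path can be previewed with the gsub() expression given in the argument description. The sketch below is an approximation only: out_file is a hypothetical path, and base R's Sys.time() stands in for now().

```r
# Hypothetical previous output file
out_file <- file.path("ds", "example", "example.parquet")

# Timestamp in the documented format; Sys.time() is a stand-in for now()
stamp <- format(Sys.time(), "%Y%m%d_%H%M%S")

# Replace the ".parquet" extension with the timestamped suffix
moved_file <- gsub("\\.parquet", paste0("moved__", stamp, ".parquet"), out_file)
```

With this expression the suffix is appended directly to the file stem, giving a path of the form ds/example/examplemoved__YYYYmmdd_HHMMSS.parquet.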

Value

Tibble with selected, renamed, adjusted and aligned columns

See Also

Other Help with combining data sources: get_data()


Acanthiza/envImport documentation built on April 14, 2025, 6:17 a.m.