data_frame_stack_new: Verify equivalent structure of two datasets

View source: R/data-frame-stack-new.R

data_frame_stack_new    R Documentation

Verify equivalent structure of two datasets

Description

These functions compare two data frames of metadata and determine whether new rows should be added.

Usage

data_frame_stack_new(
  d_original,
  d_current,
  keys,
  datestamp_update = FALSE,
  datestamp_value = Sys.Date(),
  stat_columns = character(0)
)

metadata_update_file(
  path,
  d_current,
  keys,
  datestamp_update = FALSE,
  datestamp_value = Sys.Date(),
  stat_columns = character(0)
)

Arguments

d_original

A data.frame that serves as the existing metadata file that potentially needs to be updated. Required.

d_current

A data.frame that contains records potentially missing from d_original. Required.

keys

Column names that together identify a unique combination of values. A character vector. Optional.

datestamp_update

A logical value indicating whether to ignore a column called datestamp. Defaults to FALSE.

datestamp_value

A Date value assigned to the datestamp column for the records in d_current not present in d_original when datestamp_update is TRUE. Defaults to today.

stat_columns

The name(s) of columns containing values to update. These values in d_current will overwrite the values in d_original.

path

Location of the metadata file that potentially needs to be updated. Required character vector.
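
To illustrate the datestamp behavior described for datestamp_value, here is a minimal base R sketch. It is not the package's implementation. The single key column id and the fixed date are assumptions chosen for the illustration.

```r
d_original <- data.frame(
  id        = c("a", "b"),
  datestamp = as.Date("2020-01-07")
)
d_current <- data.frame(
  id        = c("a", "b", "c"),
  datestamp = as.Date(NA)
)

# Assumption: a fixed date stands in for Sys.Date() so the result is reproducible.
datestamp_value <- as.Date("2024-12-05")

# Rows of d_current whose key is not yet present in d_original.
d_new <- d_current[!(d_current$id %in% d_original$id), , drop = FALSE]

# Stamp only the new rows; existing rows keep their earlier datestamp.
d_new$datestamp <- datestamp_value

d_stamped <- rbind(d_original, d_new)
rownames(d_stamped) <- NULL
d_stamped
```

Only the row with id "c" receives the new date here, because it is the only record absent from d_original.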

Value

A tibble::tibble that combines d_original with the new records from d_current.
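
The stacking described above can be approximated in base R. This is a minimal sketch rather than the package's implementation. The helper name stack_new_sketch is hypothetical, and it assumes both data frames contain the same columns.

```r
# Hypothetical helper (not part of OuhscMunge): append the rows of `d_current`
# whose key combination is absent from `d_original`.
stack_new_sketch <- function(d_original, d_current, keys) {
  # Collapse each row's key values into one comparable string.
  key_original <- do.call(paste, c(unname(as.list(d_original[keys])), sep = "\r"))
  key_current  <- do.call(paste, c(unname(as.list(d_current[keys])),  sep = "\r"))

  # Keep only the rows of d_current that are new.
  d_new <- d_current[!(key_current %in% key_original), , drop = FALSE]

  # Stack the new rows under the original rows, matching the column order.
  d_stacked <- rbind(d_original, d_new[names(d_original)])
  rownames(d_stacked) <- NULL
  d_stacked
}

d_original <- data.frame(x1 = c(1, 3),    x2 = c("a", "c"),      x3 = c(11, 13))
d_current  <- data.frame(x1 = c(1, 2, 3), x2 = c("a", "b", "c"), x3 = c(11, 12, 13))

d_stacked <- stack_new_sketch(d_original, d_current, keys = c("x1", "x2"))
d_stacked
```

The original rows come first and the previously unseen rows are appended after them. The sketch returns a data.frame, whereas the documented function returns a tibble.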

Note

Each dataset is verified to have no more than one row with the same combination of values in the keys columns.
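
A uniqueness check of this kind can be sketched with base R's anyDuplicated(). The helper name keys_are_unique is hypothetical, and the package may perform its check differently.

```r
# Hypothetical helper: TRUE when no two rows share the same key combination.
keys_are_unique <- function(d, keys) {
  # anyDuplicated() returns 0 when every row of the key columns is distinct.
  anyDuplicated(d[keys]) == 0L
}

d_ok  <- data.frame(x1 = c(1, 1, 2), x2 = c("a", "b", "a"))
d_bad <- data.frame(x1 = c(1, 1, 2), x2 = c("a", "a", "b"))

keys_are_unique(d_ok,  keys = c("x1", "x2"))  # TRUE: every (x1, x2) pair is distinct
keys_are_unique(d_bad, keys = c("x1", "x2"))  # FALSE: the pair (1, "a") appears twice
```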

The stat_columns typically contain metrics like 'count' or 'mean' which may become obsolete in d_original. These values are dropped from d_original and replaced by the columns in d_current, after joining on the keys column(s).
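
The drop-and-replace step for stat_columns can be sketched in base R with merge(). This is an illustration under stated assumptions, not the package's code. The helper name refresh_stats_sketch and the column names id, n, and label are hypothetical.

```r
# Hypothetical helper: replace the stat columns in `d_original` with the
# values from `d_current`, matching rows on `keys`.
refresh_stats_sketch <- function(d_original, d_current, keys, stat_columns) {
  # Drop the possibly obsolete stat columns from the original.
  d_kept <- d_original[setdiff(names(d_original), stat_columns)]

  # Left join the current values back in; unmatched rows receive NA.
  # Note that merge() sorts the result by the key columns.
  d_joined <- merge(
    d_kept,
    d_current[c(keys, stat_columns)],
    by    = keys,
    all.x = TRUE
  )

  # Restore the original column order.
  d_joined[names(d_original)]
}

d_original <- data.frame(id = c("a", "b"),      n = c(10, 20),     label = c("A", "B"))
d_current  <- data.frame(id = c("a", "b", "c"), n = c(11, 22, 33), label = c("A", "B", "C"))

d_refreshed <- refresh_stats_sketch(
  d_original,
  d_current,
  keys         = "id",
  stat_columns = "n"
)
d_refreshed
```

In this sketch the non-stat column label keeps its value from d_original, while n takes the value from d_current.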

Author(s)

Will Beasley

See Also

data_frame_compare_structure()

Examples

library("magrittr")
ds_original <- tibble::tibble(
  x1         = c(1, 3, 4),
  x2         = letters[c(1, 3, 4)],
  x3         = c(11, 13, 14),
  x4         = c(111, 113, 114),
  x5         = c(-11, -13, -14),
  datestamp  = as.Date("2020-01-07")
)

ds_current <- tibble::tibble(
  x1   = c(1:5, 1, 5),
  x2   = c(letters[1:5], "x", "y"),
  x3   = c(11, 12, 13, 14, 15, 11, 15),
  x4   = c(211, 212, 213, 214, 215, 211, 215),
  x5   = c(311, 312, 313, 314, 315, 311, 315),
  datestamp = as.Date(NA)
)

# Basic: append the new records.
data_frame_stack_new(
  d_original       = ds_original,
  d_current        = ds_current,
  keys             = c("x1", "x2")
)

# Wrinkle 1: datestamp the new records.
data_frame_stack_new(
  d_original       = ds_original,
  d_current        = ds_current,
  keys             = c("x1", "x2"),
  datestamp_update = TRUE
)

# Wrinkle 2a: datestamp the new records; update x4.
data_frame_stack_new(
  d_original       = ds_original,
  d_current        = ds_current,
  keys             = c("x1", "x2"),
  datestamp_update = TRUE,
  stat_columns     = c("x4")
)

# Wrinkle 2b: datestamp the new records; update x4 & x5.
data_frame_stack_new(
  d_original       = ds_original,
  d_current        = ds_current,
  keys             = c("x1", "x2"),
  datestamp_update = TRUE,
  stat_columns     = c("x4", "x5")
)

# For comparison: the records in ds_current that are absent from ds_original.
ds_current %>%
  dplyr::anti_join(ds_original, by = c("x1", "x2"))

# Update a file
## Not run: 
{
  path_temp <- tempfile(fileext = ".csv")
  on.exit(unlink(path_temp))
  file.copy(
    system.file("test-data/metadata-original.csv", package = "OuhscMunge"),
    path_temp
  )
}

# Displays 3 rows.
readr::read_csv(path_temp)

metadata_update_file(
  path_temp,
  dplyr::mutate(ds_current, x1 = as.character(x1), x3 = as.character(x3)),
  c("x1", "x2")
)

# Displays 7 rows.
readr::read_csv(path_temp)

## End(Not run)


OuhscBbmc/OuhscMunge documentation built on Dec. 5, 2024, 4:34 a.m.