vardist)


Typically you have a data set whose integrity is unknown, and you want to compare it with a data set whose reliability has already been established by other means. This function compares the uncertain data set (the "challenger") with the trusted one (the "baseline") and reports whether their distributions are similar enough.
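To make "similar enough" concrete, here is a minimal base R sketch of one plausible check, a two-sample Kolmogorov-Smirnov test. The helper `similar_by_ks`, its `p_threshold` argument, and the simulated data are hypothetical and not part of vardist. How the package actually applies `ks_test_threshold` is not stated on this page, so the decision rule below is an assumption.

```r
# Simulated columns for illustration only.
set.seed(42)
baseline_x   <- rnorm(500, mean = 0, sd = 1)
challenger_x <- rnorm(500, mean = 0, sd = 1)  # drawn from the same distribution
shifted_x    <- rnorm(500, mean = 3, sd = 1)  # clearly different distribution

# Hypothetical helper (not vardist's implementation).
# Assumed rule: call the two samples similar when the KS p-value
# is at least `p_threshold`.
similar_by_ks <- function(challenger, baseline, p_threshold = 0.05) {
  stats::ks.test(challenger, baseline)$p.value >= p_threshold
}

similar_by_ks(challenger_x, baseline_x)
similar_by_ks(shifted_x, baseline_x)
```

The second call compares samples whose means differ by three standard deviations, so the test rejects similarity and the helper returns `FALSE`.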

compare_numeric_distributions(challenger, baseline,
  summaries = default_numeric_summaries,
  tests = default_numeric_tests(
    tolerance = getOption("vardist.numeric_summary_tolerance", 0.1),
    ks_test_threshold = getOption("vardist.ks_test_threshold", 0.5)),
  parallel = FALSE,
  mc.cores = parallel::detectCores())

`challenger`	data.frame.
`baseline`	data.frame.
`summaries`	list. A named list of summary functions. Each function must take as an input one numeric vector, and output a numeric vector of length 1.
`tests`	list. A named list of functions that return `TRUE` or `FALSE`, and take in columns or summary statistics from the `challenger` and the `baseline`.
`parallel`	logical. Should we use `mclapply` instead of `lapply`?
`mc.cores`	numeric. The number of cores, passed to `mclapply`.
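As an illustration of the `summaries` contract, the following base R sketch builds a named list of functions that each take one numeric vector and return a numeric vector of length 1, then applies them column by column. The names `my_summaries` and `summarise_columns` and the sample data are hypothetical. This is not how vardist's own `calculate_summaries` is implemented, which may differ.

```r
# Hypothetical named list of summary functions: each takes one numeric
# vector and returns a numeric vector of length 1.
my_summaries <- list(
  mean   = function(x) mean(x, na.rm = TRUE),
  median = function(x) stats::median(x, na.rm = TRUE),
  sd     = function(x) stats::sd(x, na.rm = TRUE)
)

# Hypothetical helper (not part of vardist): apply every summary to every
# numeric column and return one row per column, one column per summary.
summarise_columns <- function(df, summaries) {
  numeric_cols <- df[vapply(df, is.numeric, logical(1))]
  rows <- lapply(numeric_cols, function(col) {
    # vapply enforces the "numeric vector of length 1" requirement
    vapply(summaries, function(f) f(col), numeric(1))
  })
  as.data.frame(do.call(rbind, rows))
}

# Sample data for illustration; the non-numeric column is skipped.
baseline <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(10, 20, 30, 40),
  label = c("w", "x", "y", "z")
)
baseline_summary <- summarise_columns(baseline, my_summaries)
baseline_summary
```

A list shaped like `my_summaries` should be suitable for the `summaries` argument, assuming the package imposes no requirements beyond those described above.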

A list with three data frames: the columnwise summaries for the challenger, the columnwise summaries for the baseline, and a report with the results of the tests.

calculate_summaries, generate_report

avantoss/vardist documentation built on May 24, 2019, 3:03 a.m.