tbl_check: Check that the rows and columns of two tables are the same

View source: R/tbl_check.R

tbl_check    R Documentation

Description

Checks for differences between object and expected in the following order:

  1. Check table class with tbl_check_class()

  2. Check column names with tbl_check_names()

  3. Check number of rows and columns with tbl_check_dimensions()

  4. Check groups with tbl_check_groups()

  5. Check that each column is the same with tbl_check_column()

If the tables differ:

  • tbl_check() returns a list describing the problem

  • tbl_grade() returns a failing grade and informative message with gradethis::fail()
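
A minimal sketch of the two entry points, assuming the tblcheck and tibble packages are installed. The tables submitted and reference are hypothetical names used only for illustration; they are passed explicitly instead of relying on the .result and .solution defaults.

```r
library(tblcheck)

# Hypothetical tables: same shape, but one value in column b differs
submitted <- tibble::tibble(a = 1:3, b = c(4L, 5L, 6L))
reference <- tibble::tibble(a = 1:3, b = c(4L, 5L, 7L))

# tbl_check() describes the difference as a list
problem <- tbl_check(object = submitted, expected = reference)
str(problem)

# tbl_grade() reports the same difference as a failing grade
tbl_grade(object = submitted, expected = reference)
```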

Usage

tbl_check(
  object = .result,
  expected = .solution,
  cols = NULL,
  check_class = TRUE,
  ignore_class = NULL,
  check_names = TRUE,
  check_column_order = FALSE,
  check_dimensions = TRUE,
  check_groups = TRUE,
  check_columns = TRUE,
  check_column_class = check_columns,
  check_column_levels = check_columns,
  check_column_values = check_columns,
  tolerance = sqrt(.Machine$double.eps),
  check_row_order = check_columns,
  env = parent.frame()
)

tbl_grade(
  object = .result,
  expected = .solution,
  cols = NULL,
  max_diffs = 3,
  check_class = TRUE,
  ignore_class = NULL,
  check_names = TRUE,
  check_column_order = FALSE,
  check_dimensions = TRUE,
  check_groups = TRUE,
  check_columns = TRUE,
  check_column_class = check_columns,
  check_column_levels = check_columns,
  check_column_values = check_columns,
  tolerance = sqrt(.Machine$double.eps),
  check_row_order = check_columns,
  env = parent.frame(),
  ...
)

Arguments

object

A data frame to be compared to expected.

expected

A data frame containing the expected result.

cols

[tidy-select]
A selection of columns to compare between object and expected. Differences in other columns will be ignored. If NULL, the default, all columns will be checked.

check_class

[logical(1)]
Whether to check that object and expected have the same classes with tbl_check_class().

ignore_class

[character()]
A vector of classes to ignore when finding differences between object and expected.

If an element is named, differences will only be ignored between the pair of the element and its name. For example, ignore_class = c("integer" = "numeric") will ignore class differences only if object has class integer and expected has class numeric, or vice versa.

If all the classes of expected are included in ignore_class, a class problem will never be returned.
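
The named form of ignore_class can be sketched as follows, assuming tblcheck and tibble are installed. The tables submitted and reference are hypothetical; no particular output is claimed for either call.

```r
library(tblcheck)

# Hypothetical tables: column a is double in one and integer in the other
submitted <- tibble::tibble(a = c(1, 2, 3))
reference <- tibble::tibble(a = 1:3)

class(submitted$a)  # "numeric"
class(reference$a)  # "integer"

# Default: class differences are included in the comparison
default_result <- tbl_check(object = submitted, expected = reference)

# Named element: ignore class differences only for the integer/numeric pair
paired_result <- tbl_check(
  object = submitted,
  expected = reference,
  ignore_class = c("integer" = "numeric")
)
```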

check_names

[logical(1)]
Whether to check that object and expected have the same column names with tbl_check_names().

check_column_order

[logical(1)]
Whether to check that the columns of object are in the same order as expected with tbl_check_names(). Defaults to FALSE.
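
A short sketch of opting in to the column order check, assuming tblcheck and tibble are installed. The tables are hypothetical and hold the same columns in a different order.

```r
library(tblcheck)

# Hypothetical tables with the same columns in a different order
submitted <- tibble::tibble(b = 4:6, a = 1:3)
reference <- tibble::tibble(a = 1:3, b = 4:6)

# Default (check_column_order = FALSE): column order is not compared
unordered <- tbl_check(object = submitted, expected = reference)

# Opt in to comparing column order as well
ordered <- tbl_check(
  object = submitted,
  expected = reference,
  check_column_order = TRUE
)
str(ordered)
```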

check_dimensions

[logical(1)]
Whether to check that object and expected have the same number of rows and columns with tbl_check_dimensions().

check_groups

[logical(1)]
Whether to check that object and expected have the same groups with dplyr::group_vars().

check_columns

[logical(1)]
Whether to check that all columns have the same contents with tbl_check_column().

check_column_class

[logical(1)]
Whether to check that each column has the same class in object and expected.

check_column_levels

[logical(1)]
Whether to check that each column has the same factor levels in object and expected.

check_column_values

[logical(1)]
Whether to check that each column has the same values in object and expected.

tolerance

[numeric(1) ≥ 0]
Differences between values that are smaller than tolerance are ignored. The default value is close to 1.5e-8.
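
The size of the default tolerance can be illustrated in base R. This sketch uses all.equal() only to show what a threshold of this size means; whether tbl_check() compares values in exactly this way is an assumption.

```r
# Default tolerance from the Usage section
tolerance <- sqrt(.Machine$double.eps)
tolerance  # about 1.49e-08

x <- 1
tiny_diff  <- x + 1e-10  # differs from x by less than the tolerance
large_diff <- x + 1e-6   # differs from x by more than the tolerance

# all.equal() is base R, used here only to illustrate the threshold
isTRUE(all.equal(x, tiny_diff,  tolerance = tolerance))  # TRUE
isTRUE(all.equal(x, large_diff, tolerance = tolerance))  # FALSE
```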

check_row_order

[logical(1)]
Whether to check that the values in each column are in the same order in object and expected.

env

The environment in which to find .result and .solution.

max_diffs

[numeric(1)]
The maximum number of mismatched values to display in an informative failure message. Passed to tbl_check_names() to set the number of mismatched column names to display, and to the n_values argument of tbl_check_column() to set the number of mismatched column values to display. Defaults to 3.

...

Arguments passed on to gradethis::fail

hint

Include a code feedback hint with the failing message? This argument only applies to fail() and fail_if_equal() and the message is added using the default options of give_code_feedback() and maybe_code_feedback(). The default value of hint can be set using gradethis_setup() or the gradethis.fail.hint option.

encourage

Include a random encouraging phrase with random_encouragement()? The default value of encourage can be set using gradethis_setup() or the gradethis.fail.encourage option.

Value

If the tables differ, tbl_check() returns a list describing the problem and tbl_grade() returns a gradethis::fail() message. Otherwise, both invisibly return NULL.
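
One way to branch on the return value, sketched under the assumption that tblcheck and tibble are installed. The helper describe_check() is hypothetical and not part of the package.

```r
library(tblcheck)

reference <- tibble::tibble(a = 1:3, b = 4:6)

# Identical tables: tbl_check() returns NULL invisibly
same <- tbl_check(object = reference, expected = reference)
is.null(same)  # TRUE

# Hypothetical helper that reduces the result to a short label
describe_check <- function(object, expected) {
  problem <- tbl_check(object = object, expected = expected)
  if (is.null(problem)) "tables match" else "tables differ"
}

describe_check(reference, reference)
describe_check(reference[, "a"], reference)  # column b is missing
```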

Problems

  1. class: The table does not have the expected classes.

  2. not_table: object does not inherit the data.frame class.

  3. names: The table has column names that are not expected, or is missing names that are expected.

  4. names_order: The table has the same column names as expected, but in a different order.

  5. ncol: The table doesn't have the expected number of columns.

  6. nrow: The table doesn't have the expected number of rows.

  7. groups: The table has groups that are not expected, or is missing groups that are expected.

Additional problems may be produced by tbl_check_column().

Examples

library(tblcheck)

# object is a data.frame, expected is a tibble
.result <- data.frame(a = 1:10, b = 11:20)
.solution <- tibble::tibble(a = 1:10, b = 11:20)
tbl_check()
tbl_grade()

# None of the column names match; max_diffs sets how many are displayed
.result <- tibble::tibble(a = 1:10, b = a, c = a, d = a, e = a, f = a)
.solution <- tibble::tibble(z = 1:10, y = z, x = z, w = z, v = z, u = z)
tbl_check()
tbl_grade()
tbl_grade(max_diffs = 5)
tbl_grade(max_diffs = Inf)

# object has 10 rows, expected has 11
.result <- tibble::tibble(a = 1:10, b = 11:20)
.solution <- tibble::tibble(a = 1:11, b = 12:22)
tbl_check()
tbl_grade()

# Columns are integer in object and character in expected
.result <- tibble::tibble(a = 1:10, b = 11:20)
.solution <- tibble::tibble(a = letters[1:10], b = letters[11:20])
tbl_check()
tbl_grade()

# object has an extra column; cols limits the comparison to the columns of expected
.result <- tibble::tibble(a = 1:10, intermediate = 6:15, b = 11:20)
.solution <- tibble::tibble(a = 1:10, b = 11:20)
tbl_check(cols = any_of(names(.solution)))
tbl_grade(cols = any_of(names(.solution)))

# Same column names and classes, but the values of a and b are swapped
.result <- tibble::tibble(a = 1:10, b = 11:20)
.solution <- tibble::tibble(a = 11:20, b = 1:10)
tbl_check()
tbl_grade()
tbl_grade(max_diffs = 5)
tbl_grade(max_diffs = Inf)

# expected is grouped by b, object is ungrouped
.result <- tibble::tibble(a = 1:10, b = rep(1:2, 5))
.solution <- dplyr::group_by(tibble::tibble(a = 1:10, b = rep(1:2, 5)), b)
tbl_check()
tbl_grade()
tbl_grade(check_groups = FALSE)

rstudio/tblcheck documentation built on March 11, 2023, 5:42 p.m.