knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
pkgload::load_all()
library(scrutiny)

Use scrutiny to implement new consistency tests in R. Consistency tests, such as GRIM, are procedures that check whether two or more summary values can describe the same data.
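
To give a sense of what such a check looks like, here is a deliberately simplified, base-R-only sketch of a GRIM-style test. The function name `grim_sketch_scalar` is hypothetical and not part of scrutiny, and the rounding logic is an assumption made for brevity. Use scrutiny's own GRIM functions for real work.

```r
# Simplified GRIM-style check (illustration only, not scrutiny's implementation).
# `x` is the reported mean as a string, so that trailing zeros are preserved.
# `n` is the reported sample size. Assumes integer-valued raw data.
grim_sketch_scalar <- function(x, n) {
  # Count the decimal places of the reported mean
  digits <- nchar(sub("^[^.]*\\.?", "", x))
  x_num <- as.numeric(x)
  # The true sum of integer data must be a whole number near x * n
  total <- x_num * n
  candidate_means <- c(floor(total), ceiling(total)) / n
  # Consistent if any candidate mean rounds to the reported mean
  any(abs(round(candidate_means, digits) - x_num) < 1e-8)
}

grim_sketch_scalar(x = "5.18", n = 28)
grim_sketch_scalar(x = "5.19", n = 28)
```

The first call tests a mean that 28 integer values can produce (a sum of 145), while the second tests one that no whole-number sum reproduces at two decimal places.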

This vignette shows you the minimal steps required to tap into scrutiny's framework for implementing consistency tests. The key idea is to focus on the core logic of your test and let scrutiny's functions take care of iteration. For an in-depth treatment, see vignette("consistency-tests-in-depth").

1. Single-case

Encode the logic of your test in a simple function that takes single values. It should return TRUE if they are consistent and FALSE if they are not. Its name should end in _scalar, which marks it as the single-case version. Here, I use a mock test without real meaning, called SCHLIM:

schlim_scalar <- function(y, n) {
  y <- as.numeric(y)
  n <- as.numeric(n)
  all(y / 3 > n)
}

schlim_scalar(y = 30, n = 4)
schlim_scalar(y = 2, n = 7)

2. Vectorized

This step is optional and rarely matters in practice, but it is included for completeness. Vectorize() from base R turns the single-case function into a vectorized one, so that the new function's arguments can have a length greater than 1:

schlim <- Vectorize(schlim_scalar)

schlim(y = 10:15, n = 4)
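
The difference between the two versions can be seen in a short base R sketch. It repeats the SCHLIM definition from above so that it runs on its own. The comparison with mapply() is only meant to illustrate what Vectorize() wraps; it is not part of scrutiny.

```r
# Mock test, copied from the single-case step
schlim_scalar <- function(y, n) {
  y <- as.numeric(y)
  n <- as.numeric(n)
  all(y / 3 > n)
}

# The scalar version collapses everything into one value because of all()
scalar_result <- schlim_scalar(y = 10:15, n = 4)

# The vectorized version returns one result per element of `y`,
# recycling the length-1 `n`
schlim <- Vectorize(schlim_scalar)
vectorized_result <- schlim(y = 10:15, n = 4)

# Vectorize() is a wrapper around mapply(), so this gives the same values
mapply_result <- mapply(schlim_scalar, y = 10:15, n = 4)

scalar_result
vectorized_result
identical(unname(vectorized_result), unname(mapply_result))
```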

3. Basic mapper

Next, create a function that tests many values in a data frame, like grim_map() does. Its name should end in _map. Use function_map() to get this function without much effort:

schlim_map <- function_map(
  .fun = schlim_scalar,
  .reported = c("y", "n"),
  .name_test = "SCHLIM"
)

# Example data:
df1 <- tibble::tibble(y = 16:25, n = 3:12)

schlim_map(df1)
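
Conceptually, a mapper applies the single-case function to each row and stores the result next to the inputs. The base R sketch below shows only that idea. It is not how function_map() is implemented, the helper name `schlim_map_sketch` is hypothetical, and the real mapper's output carries extra structure that audit() relies on.

```r
# Mock test, copied from the single-case step
schlim_scalar <- function(y, n) {
  y <- as.numeric(y)
  n <- as.numeric(n)
  all(y / 3 > n)
}

# Hypothetical stand-in for a mapper: test each row, add a result column
schlim_map_sketch <- function(data) {
  data$consistency <- mapply(schlim_scalar, y = data$y, n = data$n)
  data
}

# Same values as df1, but as a plain data frame to avoid dependencies
df_sketch <- data.frame(y = 16:25, n = 3:12)
out_sketch <- schlim_map_sketch(df_sketch)
out_sketch
```

With these values, only the first four rows pass the mock test, because `y / 3` exceeds `n` only there.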

4. audit() method

Use scrutiny's audit() generic to get summary statistics. Write a new function named audit.scr_name_map(), where name is the name of your test in lower-case (here, schlim).

Within the function body, call audit_cols_minimal(). This enables you to use audit() following the mapper function:

audit.scr_schlim_map <- function(data) {
  audit_cols_minimal(data, name_test = "SCHLIM")
}

df1 %>% 
  schlim_map() %>% 
  audit()

audit_cols_minimal() only provides the most basic summaries. If you like, you can still add summary statistics that are more specific to your test. See, e.g., the Summaries with audit() section in grim_map()'s documentation.
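
The naming rule for the method follows from how S3 dispatch works in R: a generic looks for a method whose suffix matches a class of its input. The base R sketch below imitates this with a hypothetical generic, `audit_sketch()`, and a hand-built mock object. The class name `scr_schlim_map` and the summary column names are assumptions chosen to resemble the mapper output, not a statement about scrutiny's internals.

```r
# Hypothetical stand-in for a generic like audit()
audit_sketch <- function(data) {
  UseMethod("audit_sketch")
}

# Method for objects that carry the (assumed) class "scr_schlim_map"
audit_sketch.scr_schlim_map <- function(data) {
  incons <- sum(!data$consistency)
  data.frame(
    incons_cases = incons,
    all_cases = nrow(data),
    incons_rate = incons / nrow(data)
  )
}

# Mock mapper output: inputs, a logical result column, and the class tag
mock <- data.frame(y = 16:25, n = 3:12)
mock$consistency <- mock$y / 3 > mock$n
class(mock) <- c("scr_schlim_map", class(mock))

# Dispatch picks the method above because of the class tag
summary_sketch <- audit_sketch(mock)
summary_sketch
```

Test-specific statistics could be added as further columns in the same data frame.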

5. Sequence mapper

This kind of mapper function tests hypothetical values around the reported ones, like grim_map_seq() does. Create a sequence mapper by simply calling function_map_seq():

schlim_map_seq <- function_map_seq(
  .fun = schlim_map,
  .reported = c("y", "n"),
  .name_test = "SCHLIM"
)

df1 %>% 
  schlim_map_seq()

Get summary statistics with audit_seq():

df1 %>% 
  schlim_map_seq() %>% 
  audit_seq()
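
The idea behind testing hypothetical values can be shown for a single inconsistent case in base R. This is a sketch under stated assumptions: the range of 5 steps in each direction is picked for illustration, only `y` is varied, and none of this reflects how function_map_seq() is implemented.

```r
# Mock test, copied from the single-case step
schlim_scalar <- function(y, n) {
  y <- as.numeric(y)
  n <- as.numeric(n)
  all(y / 3 > n)
}

# One reported case that fails the mock test
y_reported <- 20
n_reported <- 7

# Hypothetical neighbours of the reported `y` (assumed range: 5 steps each way)
y_candidates <- (y_reported - 5):(y_reported + 5)

# Test each neighbour against the reported `n`
is_consistent <- vapply(y_candidates, schlim_scalar, logical(1), n = n_reported)

# Neighbours that would pass, and the smallest distance to one of them
y_consistent <- y_candidates[is_consistent]
min_distance <- min(abs(y_consistent - y_reported))

y_consistent
min_distance
```

A small distance to the nearest consistent value suggests that a minor reporting slip could explain the inconsistency.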

6. Total-n mapper

Suppose you have grouped data but no group sizes are known, only a total sample size:

df2 <- tibble::tribble(
  ~y1, ~y2, ~n,
   84,  37,  29,
   61,  55,  26
)

To tackle this, create a total-n mapper that varies hypothetical group sizes:

schlim_map_total_n <- function_map_total_n(
  .fun = schlim_map,
  .reported = "y",
  .name_test = "SCHLIM"
)

df2 %>% 
  schlim_map_total_n()

Get summary statistics with audit_total_n():

df2 %>% 
  schlim_map_total_n() %>% 
  audit_total_n()
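
The base R sketch below illustrates the underlying idea for the first row of df2: split the total into two hypothetical group sizes around the halfway point, then test each reported `y` against each size. The number of splits (three) and the way values and sizes are paired are assumptions for illustration. Consult the documentation of function_map_total_n() for the actual procedure.

```r
# Mock test, copied from the single-case step
schlim_scalar <- function(y, n) {
  y <- as.numeric(y)
  n <- as.numeric(n)
  all(y / 3 > n)
}

# First row of df2: two reported values and one total sample size
y1 <- 84
y2 <- 37
n_total <- 29

# Hypothetical group sizes: start near half the total, then move apart
n_small <- n_total %/% 2 - 0:2   # 14, 13, 12
n_large <- n_total - n_small     # 15, 16, 17

# Pairing A: y1 belongs to the smaller group, y2 to the larger one
pairing_a <- mapply(schlim_scalar, y = y1, n = n_small) &
  mapply(schlim_scalar, y = y2, n = n_large)

# Pairing B: the reverse assignment
pairing_b <- mapply(schlim_scalar, y = y1, n = n_large) &
  mapply(schlim_scalar, y = y2, n = n_small)

data.frame(n_small, n_large, pairing_a, pairing_b)
```

Here, both values pass the mock test together only when `y2` is assigned a group of 12 and `y1` a group of 17.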


lhdjung/scrutiny documentation built on Sept. 28, 2024, 12:14 a.m.