#' Visualize GRIM test results
#'
#' @description `grim_plot()` visualizes summary data and their mutual GRIM
#' consistency. Call this function only on a data frame that resulted from a
#' call to [`grim_map()`].
#'
#' Consistent and inconsistent value pairs from the input data frame are shown
#' in distinctive colors. By default, consistent value pairs are blue and
#' inconsistent ones are red. These and other parameters of the underlying
#' geoms can be controlled via arguments.
#'
#' The background raster follows the `rounding` argument from the `grim_map()`
#' call (unless any of the plotted mean or proportion values has more than 2
#' decimal places, in which case a gradient is shown, not a raster).
#'
#' @section Background raster: The background raster shows the probability of
#' GRIM-inconsistency for random means or proportions, from 0 (all
#' inconsistent) to the greatest number on the x-axis (all consistent). If the
#' number of decimal places in the inputs -- means or percentages -- is 3 or
#' greater, individual points would be too small to display. In these cases,
#' there will not be a raster but a gradient, showing the overall trend.
#'
#' As any raster only makes sense with respect to one specific number of
#' decimal places, the function will throw an error if these numbers differ
#' among input `x` values (and `show_raster` is `TRUE`). You can avoid the
#' error and force plotting by specifying `digits` as the number of decimal
#' places for which the raster or gradient should be displayed.
#'
#' For 1 or 2 decimal places, the raster will be specific to the rounding
#' procedure. As the raster varies by rounding procedure, it will
#' automatically correspond to the `rounding` argument specified in the
#' preceding [`grim_map()`] call. This works fast because the raster is based
#' on data saved in the package itself, so these data don't need to be
#' generated anew every time the function is called. Inconsistent value sets
#' are marked with dark boxes. All other places in the raster denote
#' consistent value sets. The raster is independent of the data -- it only
#' follows the `rounding` specification in the [`grim_map()`] call and the
#' `digits` argument in `grim_plot()`.
#'
#' Display an "empty" plot, one without empirical test results, by setting
#' `show_data` to `FALSE`. You can then control key parameters of the plot
#' with `digits` and `rounding`.
#'
#' With [`grim_map()`]'s default for `rounding`, `"up_or_down"`, strikingly
#' few values are flagged as inconsistent for sample sizes 40 and 80 (or 4 and
#' 8). This effect disappears if `rounding` is set to any other value (see
#' `vignette("rounding-options")`).
#'
#' The 4/8 leniency effect arises because accepting values rounded either up
#' or down is more careful and conservative than any other rounding procedure.
#' In any case, `grim_plot()` doesn't cause this effect --- it only reveals
#' it.
#'
#' @param data Data frame. Result of a call to [`grim_map()`].
#' @param show_data Logical. If set to `FALSE`, test results from the data are
#' not displayed. Choose this if you only want to show the background raster.
#' You can then control plot parameters directly via the `n`, `digits`, and
#' `rounding` arguments. Default is `TRUE`.
#' @param show_gradient Logical. If the number of decimal places is 3 or
#' greater, should a gradient be shown to signal the overall probability of
#' GRIM inconsistency? Default is `TRUE`.
#' @param digits Integer. Number of decimal places for which the background
#' raster will be generated. Default is `NULL`, in which case the greatest
#' number of decimal places from the means or proportions is used.
#' @param n Integer. Maximal value on the x-axis. Default is `NULL`, in which
#' case `n` becomes `10 ^ digits` (e.g., `100` if `digits` is `2`).
#' @param digits Integer. Only relevant if `show_data` is set to `FALSE`. The
#' plot will then be constructed as it would be for data where all `x` values
#' have this many decimal places. Default is `2`.
#' @param rounding String. Only relevant if `show_data` is set to `FALSE`. The
#' plot will then be constructed as it would be for data rounded in this
#' particular way. Default is `"up_or_down"`.
#' @param show_raster Logical. If `TRUE` (the default), the plot has a
#' background raster.
#' @param color_cons,color_incons Strings. Fill colors of the consistent and
#' inconsistent scatter points. Defaults are `"royalblue1"` (consistent) and
#' `"red"` (inconsistent).
#' @param tile_alpha,tile_size Numeric. Further parameters of the scatter
#' points: opacity and, indirectly, size. Defaults are `1` and `1.5`.
#' @param raster_alpha,raster_color Numeric and string, respectively. Parameters
#' of the background raster: opacity and fill color. Defaults are `1` and
#' `"grey75"`.
#'
#' @include utils.R seq-decimal.R
#'
#' @references Brown, N. J. L., & Heathers, J. A. J. (2017). The GRIM Test: A
#' Simple Technique Detects Numerous Anomalies in the Reporting of Results in
#' Psychology. *Social Psychological and Personality Science*, 8(4), 363–369.
#' https://journals.sagepub.com/doi/10.1177/1948550616673876
#' @return A ggplot object.
#' @export
#'
#' @examples
#' # Call `grim_plot()` following `grim_map()`:
#' pigs1 %>%
#' grim_map() %>%
#' grim_plot()
#'
#' # If you change the rounding procedure
#' # in `grim_map()`, the plot will
#' # follow automatically if there is
#' # a difference:
#' pigs1 %>%
#' grim_map(rounding = "ceiling") %>%
#' grim_plot()
#'
#' # For percentages, the y-axis
#' # label also changes automatically:
#' pigs2 %>%
#' grim_map(percent = TRUE) %>%
#' grim_plot()
grim_plot <- function(data = NULL,
show_data = TRUE,
show_raster = TRUE,
show_gradient = TRUE,
n = NULL,
digits = NULL,
rounding = "up_or_down",
color_cons = "royalblue1",
color_incons = "red",
tile_alpha = 1,
tile_size = 1.5,
raster_alpha = 1,
raster_color = "grey75") {
# Checks ----
inherits_grim <- inherits(data, "scr_grim_map")
inherits_grimmer <- inherits(data, "scr_grimmer_map")
if (!inherits_grim) {
# Issue an alert if any GRIMMER inconsistencies were found in `data`:
if (inherits_grimmer) {
reason <- data$reason[!is.na(data$reason)]
n_grimmer_cases <- stringr::str_detect(reason, "GRIMMER")
n_grimmer_cases <- length(which(n_grimmer_cases))
if (n_grimmer_cases > 0L) {
if (n_grimmer_cases == 1L) {
msg_case_s <- "case was"
msg_incons <- "inconsistency"
} else {
msg_case_s <- "cases were"
msg_incons <- "inconsistencies"
}
if (n_grimmer_cases < length(reason)) {
msg_viz <- "Also visualizing"
} else {
msg_viz <- "Visualizing"
}
cli::cli_alert("{msg_viz} {n_grimmer_cases} GRIMMER {msg_incons}.")
}
} else if (show_data) {
cli::cli_abort(c(
"!" = "`grim_plot()` needs GRIM or GRIMMER test results.",
"x" = "`data` is not the output of `grim_map()`, `grim_map_seq()`, \\
or `grim_map_total_n()`; or of the respective `grimmer_*()` functions.",
"i" = "The only exception is an \"empty\" plot that shows the \\
background raster but no empirical test results. Create such a plot \\
with `show_data = FALSE`."
))
}
}
# Warn the user who passed suitable data to `grim_plot()` but also set
# `show_data` to `FALSE`, thereby defeating the data's purpose here:
if (!show_data && any(inherits_grim, inherits_grimmer)) {
msg_grimmer <- if (inherits_grimmer) {
" and GRIMMER"
} else {
""
}
cli::cli_warn(c(
"Test results are not visualized.",
"!" = "You set `show_data` to `FALSE`, but still passed \\
GRIM{msg_grimmer} test results to `grim_plot()`.",
"!" = "Only the background raster or gradient will be shown, not the \\
tested data."
))
}
# The `digits` argument, if specified, must be a single integer-like
# number because it controls the number of decimal places for which the plot
# will be constructed:
if (!is.null(digits)) {
if (length(digits) != 1L) {
cli::cli_abort(c(
"!" = "`digits` must have length 1 (i.e., be a single number).",
"x" = "It has length {length(digits)}."
))
} else if (!is_whole_number(digits)) {
cli::cli_abort(c(
"!" = "`digits` must be a whole number.",
"x" = "It is {digits}."
))
}
}
# Transformations ----
# In case the user set `show_data` to `FALSE`, a plot without empirical test
# results (blue and/or red dots) will be shown. To this end, the function
# must completely bypass the `data` argument. It does so via creating a
# dummy object by that name:
if (!show_data) {
data <- tibble::tibble(x = "0.00", n = 1, items = 1, consistency = TRUE) %>%
add_class(paste0("scr_rounding_", rounding))
}
if (is.null(digits)) {
digits_x <- decimal_places(data$x) # used to be wrapped in `max()`
if (show_raster) {
if (!all(digits_x[1L] == digits_x)) {
means_percentages <- dplyr::if_else(
inherits(data, "scr_percent_true"),
"Percentages",
"Means"
)
dp_unique <- unique(digits_x)
if (length(dp_unique) <= 3L) {
dp_unique_presented <- sort(dp_unique)
msg_starting_with <- ":"
} else {
dp_unique_presented <- sort(dp_unique)[1:3]
msg_starting_with <- ", starting with"
}
cli::cli_abort(c(
"{means_percentages} must have the same number of decimal places.",
"x" = "There are {length(dp_unique)} unique numbers of decimal \\
places in `x`{msg_starting_with} {dp_unique_presented}.",
"i" = "The background raster is only informative if the number of \\
decimal places is consistent across the \\
{tolower(means_percentages)}.",
"i" = "Avoid this error by plotting {tolower(means_percentages)} \\
separately for each number of decimal places. (Alternatively, you \\
can specify `digits` as the number of decimal places for which \\
the plot should be shown. Be aware that this will not be sensible \\
with regard to all {tolower(means_percentages)}.)"
))
}
}
# The call will only pass the above test if all `x` values have the same
# number of decimal places. Therefore, `digits` can now be determined
# simply by taking the first element; or indeed any other element there
# might be:
digits <- digits_x[1L]
}
data$x <- as.numeric(data$x)
# Preparations ----
p10 <- 10 ^ digits
if (is.null(n)) {
n <- p10
}
frac_unit <- 1 / p10
# By default, a background raster is displayed in the plot:
if (show_raster) {
# For 1 or 2 decimal places, the function selects the appropriate raster
# from among those saved within the package itself:
if (!(digits > 2L)) {
# Check the way `x` values were rounded in the preceding `grim_map()` call
# to prepare selecting the plot background raster:
dc <- class(data)
rounding_id <- dc[stringr::str_detect(dc, "^scr_rounding_")]
rounding_id <- stringr::str_remove(rounding_id, "^scr_rounding_")
# Throw error if the specified rounding option is one of the few for which
# no raster is available:
if (any(
rounding_id == c("up_from", "down_from", "up_from_or_down_from")
)) {
cli::cli_abort(c(
"No background raster available for `rounding = {rounding_id}`",
"i" = "Use a different `rounding` specification within the \\
`grim_map()` call or set `show_raster` to `FALSE` within the \\
`grim_plot()` call."
))
}
# Assemble the names of the appropriate raster components for sample size
# (`raster_n`; x-axis) and the fractional portion of the mean or
# proportion (`raster_frac`, y-axis) from the information provided by the
# number of decimal places and the previously specified rounding
# procedure. First, create two strings by those names...
raster_n <- glue::glue("grim_raster_{digits}_{rounding_id}_n")
raster_frac <- glue::glue("grim_raster_{digits}_{rounding_id}_frac")
# ...and second, parse and evaluate these strings, so that R will identify
# the raster component object by the respective name from among the files
# saved within R/sysdata.R:
raster_n <- eval(rlang::parse_expr(raster_n))
raster_frac <- eval(rlang::parse_expr(raster_frac))
} else {
# For any number of decimal places greater than 2, these two objects are
# assigned the value of zero. This is only pro forma. We still need
# `raster_n` and `raster_frac` because they are referenced several times
# further down to build the plot:
raster_n <- 0
raster_frac <- 0
}
# This data frame will be used for the raster when building the plot:
df_plot <- tibble::tibble(
raster_n = as.numeric(raster_n),
raster_frac = as.numeric(raster_frac)
)
}
# Reduce `x` to its fractional portion:
data_emp <- data %>%
dplyr::mutate(x = x - trunc(x)) %>%
dplyr::rename(frac = x)
if (!show_data) {
data_emp <- data_emp %>%
dplyr::mutate(dplyr::across(everything(), function(x) 0L))
}
# If `percent = TRUE` in the underlying `grim_map()` call, the y-axis label is
# automatically adjusted to reflect the fact that the fractional values are
# percentages (converted to decimal numbers), not means:
mean_percent_label <- if (inherits(data, "scr_percent_true")) {
"% (as decimal)"
} else {
"mean"
}
# Automatically color the boxes of value pairs by whether they are
# GRIM-consistent or not:
color_by_consistency <- dplyr::if_else(
data$consistency,
color_cons,
color_incons
)
# The plot itself ----
# Background raster / gradient:
if (show_raster) {
# With 1 or 2 digits, the function provides a background raster...
p <- ggplot2::ggplot(data = df_plot) +
ggplot2::geom_tile(mapping = ggplot2::aes(
x = .data$raster_n,
y = .data$raster_frac
), alpha = raster_alpha, fill = raster_color) +
ggplot2::theme(
panel.border = ggplot2::element_rect(fill = NA, colour = "grey50"),
panel.background = ggplot2::element_rect(fill = "white", colour = NA),
panel.grid = ggplot2::element_blank()
)
# ... but with more decimal places, individual boxes would be too small to
# display, so we need a gradient instead to simply show the overall trend.
# Boxes are still added pro forma; the call to `geom_tile()` is the same
# as above (except for the `alpha` and `fill` specifications):
if (digits > 2L) {
if (show_gradient) {
gradient <-
grDevices::colorRampPalette(c(raster_color, "white"))(10000)
p <- p +
ggplot2::geom_tile(data = df_plot, mapping = ggplot2::aes(
x = .data$raster_n,
y = .data$raster_frac
)) +
ggplot2::annotation_custom(grid::rasterGrob(
t(gradient),
width = grid::unit(1, "npc"),
height = grid::unit(1, "npc")
))
}
# Keep the y-axis ranging from 0 to 1, even with the gradient in place:
p <- p +
ggplot2::scale_y_continuous(
expand = ggplot2::expansion(add = c(0, 0.01)),
limits = c(0, 1)
)
}
}
if (show_data) {
if (utils::packageVersion("ggplot2") >= "3.4") {
p <- p +
ggplot2::geom_tile(
data = data_emp,
mapping = ggplot2::aes(
x = .data$n,
y = .data$frac
),
alpha = tile_alpha,
# Replaced here:
linewidth = 1,
color = color_by_consistency,
fill = color_by_consistency,
width = tile_size / 2,
height = (frac_unit * tile_size) / 2
)
} else {
p <- p +
ggplot2::geom_tile(
data = data_emp,
mapping = ggplot2::aes(
x = .data$n,
y = .data$frac
),
alpha = tile_alpha,
# Still here:
size = 1,
color = color_by_consistency,
fill = color_by_consistency,
width = tile_size / 2,
height = (frac_unit * tile_size) / 2
)
}
}
if (!(digits > 2L)) {
# Further specifications:
p <- p +
ggplot2::theme(
panel.grid = ggplot2::element_blank()
) +
ggplot2::scale_y_continuous(
breaks = seq(from = 0, to = 1, by = max(0.2, frac_unit)),
expand = ggplot2::expansion(add = c(0.01, 0)),
limits = c(0, 1)
)
# Make the exact x-axis scale specification dependent on whether the plot
# will (by default) show a raster...
p <- p +
ggplot2::scale_x_continuous(
breaks = seq(from = 0, to = n, by = (n / 5)),
expand = ggplot2::expansion(mult = c(0, 0.01))
)
} else {
# ...or a gradient, in which case we need the x-axis must be forced to
# run from 0 to 1 using `limits = c(0, n)`, which is omitted above to remove
# the space between the raster and the y-axis:
p <- p +
ggplot2::scale_x_continuous(
breaks = seq(from = 0, to = n, by = (n / 5)),
expand = ggplot2::expansion(mult = c(0, 0.01)),
limits = c(0, n)
)
}
# Finally, return the plot with axis labels, suppressing unnecessary ggplot2
# warnings:
suppressWarnings(print(
p +
ggplot2::labs(
x = "Sample size",
y = paste("Fractional portion of", mean_percent_label)
)
))
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.