R/inspect_data_.R


#' @title Validate dichotomous data
#'
#' @description `inspect_data_dichotomous` checks if an object contains data
#' that is eligible to have been generated by a series of Bernoulli trials. This
#' can be useful to validate inputs in user-defined functions.
#'
#' @param data,success  Arbitrary objects. `success` is meant to indicate the
#' value of `data` that corresponds to a success.
#' @param allow_nas Logical value. If `TRUE` then `NA` and `NaN` values in
#' `data` are allowed. If `FALSE`, execution is stopped and an error message is
#' thrown in case there are `NA` or `NaN` values in `data`.
#' @param warning_nas Logical value. If `TRUE` then the presence of `NA` or
#' `NaN` values in `data` generates a warning message. `NA` and `NaN` values
#' pass silently otherwise (if `allow_nas` is set to `TRUE`).
#'
#' @details `inspect_data_dichotomous` conducts a series of tests to check if
#' `data` is eligible to have been generated by a series of Bernoulli trials.
#' Namely, `inspect_data_dichotomous` checks if:
#' * `data` and `success` are `NULL` or empty.
#' * `data` and `success` are atomic and have an eligible data type (logical,
#' integer, double, character).
#' * `data` and `success` have `NA` or `NaN` values.
#' * The number of unique values in `data` and `success` is adequate.
#' * `success` has \code{\link[base]{length}} 1.
#' * `success` is observed in `data`.
#'
#' @return `inspect_data_dichotomous` does not return any output. There are
#' three possible outcomes:
#' * The call is silent if:
#'   * `data` is eligible to have been generated by a series of Bernoulli trials
#'   and there are no `NA` or `NaN` values in `data`.
#'   * `data` is eligible to have been generated by a series of Bernoulli
#'   trials, there are some `NA` or `NaN` values in `data`, `allow_nas` is set
#'   to `TRUE` and `warning_nas` is set to `FALSE`.
#' * An informative warning message is thrown if:
#'   * `data` is eligible to have been generated by a series of Bernoulli trials
#'   and `success` is not observed in `data`.
#'   * `data` is eligible to have been generated by a series of Bernoulli
#'   trials, there are `NA` or `NaN` values in `data` and both `allow_nas` and
#'   `warning_nas` are set to `TRUE`.
#' * An informative error message is thrown and the execution is stopped if:
#'   * `data` is not eligible to have been generated by a series of Bernoulli
#'   trials.
#'   * `data` is eligible to have been generated by a series of Bernoulli
#'   trials, there are some `NA` or `NaN` values in `data` and `allow_nas` is
#'   set to `FALSE`.
#'
#' @seealso
#' * \code{\link[inspector]{inspect_par_bernoulli}} to validate
#' Bernoulli/Binomial proportions.
#' * \code{\link[inspector]{inspect_data_categorical}} and
#' \code{\link[inspector]{inspect_data_cat_as_dichotom}} to validate categorical
#' data.
#' * \code{\link[inspector]{inspect_par_multinomial}} to validate vectors of
#' Multinomial proportions.
#'
#' @examples
#' # Calls that pass silently:
#' x1 <- c(1, 0, 0, 1, 0)
#' x2 <- c(FALSE, FALSE, TRUE)
#' x3 <- c("yes", "no", "yes")
#' x4 <- factor(c("yes", "no", "yes"))
#' x5 <- c(1, 0, 0, 1, 0, NA)
#' inspect_data_dichotomous(x1, success = 1)
#' inspect_data_dichotomous(x2, success = TRUE)
#' inspect_data_dichotomous(x3, success = "yes")
#' inspect_data_dichotomous(x4, success = "yes")
#' inspect_data_dichotomous(x5, success = 1)
#'
#' # Calls that throw an informative warning message:
#' y1 <- c(1, 1, NA, 0, 0)
#' y2 <- c(0, 0)
#' success <- 1
#' try(inspect_data_dichotomous(y1, success = 1, warning_nas = TRUE))
#' try(inspect_data_dichotomous(y2, success = success))
#'
#' # Calls that throw an informative error message:
#' try(inspect_data_dichotomous(NULL, 1))
#' try(inspect_data_dichotomous(c(1, 0), NULL))
#' try(inspect_data_dichotomous(list(1, 0), 1))
#' try(inspect_data_dichotomous(c(1, 0), list(1)))
#' try(inspect_data_dichotomous(numeric(0), 0))
#' try(inspect_data_dichotomous(1, numeric(0)))
#' try(inspect_data_dichotomous(NaN, 1))
#' try(inspect_data_dichotomous(NA, 1))
#' try(inspect_data_dichotomous(c(1, 0), NA))
#' try(inspect_data_dichotomous(c(1, 0), NaN))
#' try(inspect_data_dichotomous(c(1, 0), 2))
#' @export

inspect_data_dichotomous <-
  function(data,
           success,
           allow_nas = TRUE,
           warning_nas = FALSE) {
    inspect_true_or_false(allow_nas)
    inspect_true_or_false(warning_nas)

    data_output_name <- deparse(substitute(data))
    s_output_name <- deparse(substitute(success))

    if (is.null(data)) {
      stop(paste("Invalid argument:", data_output_name, "is NULL."))
    }
    if (is.null(success)) {
      stop(paste("Invalid argument:", s_output_name, "is NULL."))
    }
    if (isFALSE(is.atomic(data))) {
      stop(paste("Invalid argument:", data_output_name, "must be atomic."))
    }
    if (length(data) == 0) {
      stop(paste("Invalid argument:", data_output_name, "is empty."))
    }
    if (any(isFALSE(is.atomic(success)), isFALSE(length(success) == 1))) {
      stop(paste(
        "Invalid argument:",
        s_output_name,
        "must be atomic and have length 1."
      ))
    }
    if (isFALSE(typeof(data) %in% c("logical",
                                    "integer",
                                    "double",
                                    "character"))) {
      stop(
        paste(
          "Invalid argument: the type of",
          data_output_name,
          "must be 'logical', 'integer', 'double' or 'character'."
        )
      )
    }
    if (isFALSE(typeof(success) %in% c("logical",
                                       "integer",
                                       "double",
                                       "character"))) {
      stop(
        paste(
          "Invalid argument: the type of",
          s_output_name,
          "must be 'logical', 'integer', 'double' or 'character'."
        )
      )
    }
    if (is.na(success)) {
      stop(paste("Invalid argument:", s_output_name, "is NA or NaN."))
    }

    data_factor <-
      factor(data, levels = unique(c(levels(factor(
        success
      )), levels(factor(
        unique(data)
      )))))

    if (isTRUE(nlevels(data_factor) > 2)) {
      stop("Invalid argument: there are more than two levels.")
    }
    if (all(is.na(data))) {
      stop(paste(
        "Invalid argument: all elements of",
        data_output_name,
        "are NA or NaN."
      ))
    }
    if (any(is.na(data))) {
      if (isFALSE(allow_nas)) {
        stop(paste(
          "Invalid argument: there are NA or NaN values in",
          paste0(data_output_name, ".")
        ))
      } else {
        if (isTRUE(warning_nas)) {
          warning(paste(
            "There are NA or NaN values in",
            paste0(data_output_name, ".")
          ))
        }
      }
    }
    if (isFALSE(success %in% unique(data))) {
      warning(paste(
        s_output_name,
        "not observed in",
        paste0(data_output_name, ".")
      ))
    }
  }
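
# Illustrative sketch, not part of the package API: the helper below isolates
# the level-counting idea used in `inspect_data_dichotomous` above, where the
# levels of `success` and `data` are pooled and more than two distinct levels
# means the data cannot be dichotomous. The name `count_pooled_levels` is
# hypothetical and introduced only for this example.
count_pooled_levels <- function(data, success) {
  # `factor()` drops NA by default, so NA values do not count as a level.
  pooled <- unique(c(levels(factor(success)), levels(factor(unique(data)))))
  length(pooled)
}
# Usage, under these assumptions:
# count_pooled_levels(c(1, 0, 0, 1), success = 1)  # two levels: acceptable
# count_pooled_levels(c(1, 0), success = 2)        # three levels: too many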

#' @title Validate categorical data
#'
#' @description `inspect_data_categorical` checks if an object contains data
#' that is eligible to have been generated by a Multinomial distribution. This
#' can be useful to validate inputs in user-defined functions.
#'
#' @param data  An arbitrary object.
#' @param allow_nas Logical value. If `TRUE` then `NA` and `NaN` values in
#' `data` are allowed. If `FALSE`, execution is stopped and an error message is
#' thrown in case there are `NA` or `NaN` values in `data`.
#' @param warning_nas Logical value. If `TRUE` then the presence of `NA` or
#' `NaN` values in `data` generates a warning message. `NA` and `NaN` values
#' pass silently otherwise (if `allow_nas` is set to `TRUE`).
#'
#' @details `inspect_data_categorical` conducts a series of tests to check if
#' `data` is eligible to have been generated by a Multinomial distribution.
#' Namely, `inspect_data_categorical` checks if:
#' * `data` is `NULL` or empty.
#' * `data` is atomic and has an eligible data type (logical, integer, double,
#' character).
#' * `data` has `NA` or `NaN` values.
#'
#' @return `inspect_data_categorical` does not return any output. There are
#' three possible outcomes:
#' * The call is silent if:
#'   * `data` is eligible to have been generated by a Multinomial distribution
#'   and there are no `NA` or `NaN` values in `data`.
#'   * `data` is eligible to have been generated by a Multinomial distribution,
#'   there are some `NA` or `NaN` values in `data`, `allow_nas` is set to
#'   `TRUE` and `warning_nas` is set to `FALSE`.
#' * An informative warning message is thrown if: `data` is eligible to have
#' been generated by a Multinomial distribution, there are some `NA` or `NaN`
#' values in `data` and both `allow_nas` and `warning_nas` are set to `TRUE`.
#' * An informative error message is thrown and the execution is stopped if:
#'   * `data` is not eligible to have been generated by a Multinomial
#'   distribution.
#'   * `data` is eligible to have been generated by a Multinomial distribution,
#'   there are some `NA` or `NaN` values in `data` and `allow_nas` is set to
#'   `FALSE`.
#'
#' @seealso
#' * \code{\link[inspector]{inspect_data_cat_as_dichotom}} to validate
#' categorical data as dichotomous.
#' * \code{\link[inspector]{inspect_par_multinomial}} to validate vectors of
#' Multinomial proportions.
#' * \code{\link[inspector]{inspect_data_dichotomous}} to validate dichotomous
#' data.
#' * \code{\link[inspector]{inspect_par_bernoulli}} to validate
#' Bernoulli/Binomial proportions.
#'
#' @examples
#' # Calls that pass silently:
#' x1 <- c(1, 0, 0, 1, 2)
#' x2 <- c(FALSE, FALSE, TRUE, NA)
#' x3 <- c("yes", "no", "yes", "maybe")
#' x4 <- factor(c("yes", "no", "yes", "maybe"))
#' x5 <- c(1, 0, 0, 1, 0, NA, 2)
#' inspect_data_categorical(x1)
#' inspect_data_categorical(x2)
#' inspect_data_categorical(x3)
#' inspect_data_categorical(x4)
#' inspect_data_categorical(x5)
#'
#' # Call that throws an informative warning message:
#' y1 <- c(1, 1, NA, 0, 0, 2)
#' try(inspect_data_categorical(y1, warning_nas = TRUE))
#'
#' # Calls that throw an informative error message:
#' z <- c(1, 1, NA, 0, 0, 2)
#' try(inspect_data_categorical(z, allow_nas = FALSE))
#' try(inspect_data_categorical(NULL))
#' try(inspect_data_categorical(list(1, 0)))
#' try(inspect_data_categorical(numeric(0)))
#' try(inspect_data_categorical(NaN))
#' try(inspect_data_categorical(NA))
#' @export

inspect_data_categorical <-
  function(data,
           allow_nas = TRUE,
           warning_nas = FALSE) {
    inspect_true_or_false(allow_nas)
    inspect_true_or_false(warning_nas)

    data_output_name <- deparse(substitute(data))

    if (is.null(data)) {
      stop(paste("Invalid argument:", data_output_name, "is NULL."))
    }
    if (isFALSE(is.atomic(data))) {
      stop(paste("Invalid argument:", data_output_name, "must be atomic."))
    }
    if (length(data) == 0) {
      stop(paste("Invalid argument:", data_output_name, "is empty."))
    }
    if (isFALSE(typeof(data) %in% c("logical",
                                    "integer",
                                    "double",
                                    "character"))) {
      stop(
        paste(
          "Invalid argument: the type of",
          data_output_name,
          "must be 'logical', 'integer', 'double' or 'character'."
        )
      )
    }
    if (all(is.na(data))) {
      stop(paste(
        "Invalid argument: all elements of",
        data_output_name,
        "are NA or NaN."
      ))
    }
    if (any(is.na(data))) {
      if (isFALSE(allow_nas)) {
        stop(paste(
          "Invalid argument: there are NA or NaN values in",
          paste0(data_output_name, ".")
        ))
      } else {
        if (isTRUE(warning_nas)) {
          warning(paste(
            "There are NA or NaN values in",
            paste0(data_output_name, ".")
          ))
        }
      }
    }
  }
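
# Illustrative sketch, not part of the package: the handling of missing values
# shared by the validators in this file can be summarised as a small decision
# function. `na_policy_outcome` is a hypothetical name used only here, and the
# sketch assumes `data` has already passed the earlier checks (in particular,
# that not all of its elements are NA or NaN).
na_policy_outcome <- function(data, allow_nas = TRUE, warning_nas = FALSE) {
  # `anyNA()` is TRUE for both NA and NaN in atomic vectors.
  if (!anyNA(data)) {
    return("silent")
  }
  # Missing values are present: `allow_nas` takes precedence over `warning_nas`.
  if (!allow_nas) {
    return("error")
  }
  if (warning_nas) "warning" else "silent"
}
# Usage, under these assumptions:
# na_policy_outcome(c(1, NA, 0))                      # default flags
# na_policy_outcome(c(1, NA, 0), warning_nas = TRUE)  # warning requested
# na_policy_outcome(c(1, NA, 0), allow_nas = FALSE)   # missing values rejected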

#' @title Validate categorical data as dichotomous
#'
#' @description `inspect_data_cat_as_dichotom` checks if an object contains
#' valid categorical data that is eligible to be used as dichotomous data. This
#' can be useful to validate inputs in user-defined functions.
#'
#' @param data,success  Arbitrary objects. `success` is meant to indicate the
#' value of `data` that corresponds to a success.
#' @param allow_nas Logical value. If `TRUE` then `NA` and `NaN` values in
#' `data` are allowed. If `FALSE`, execution is stopped and an error message is
#' thrown in case there are `NA` or `NaN` values in `data`.
#' @param warning_nas Logical value. If `TRUE` then the presence of `NA` or
#' `NaN` values in `data` generates a warning message. `NA` and `NaN` values
#' pass silently otherwise (if `allow_nas` is set to `TRUE`).
#'
#' @details `inspect_data_cat_as_dichotom` conducts a series of tests to check
#' if `data` contains valid categorical data that is eligible to be used as
#' dichotomous data. Namely, `inspect_data_cat_as_dichotom` checks if:
#' * `data` and `success` are `NULL` or empty.
#' * `data` and `success` are atomic and have an eligible data type (logical,
#' integer, double, character).
#' * `data` and `success` have `NA` or `NaN` values.
#' * `success` has \code{\link[base]{length}} 1.
#' * `success` is observed in `data`.
#'
#' @return `inspect_data_cat_as_dichotom` does not return any output. There are
#' three possible outcomes:
#' * The call is silent if:
#'   * `data` contains valid categorical data that is eligible to be used as
#'   dichotomous data and there are no `NA` or `NaN` values in `data`.
#'   * `data` contains valid categorical data that is eligible to be used as
#'   dichotomous data, there are some `NA` or `NaN` values in `data`,
#'   `allow_nas` is set to `TRUE` and `warning_nas` is set to `FALSE`.
#' * An informative warning message is thrown if:
#'   * `data` contains valid categorical data that is eligible to be used as
#'   dichotomous data and `success` is not observed in `data`.
#'   * `data` contains valid categorical data that is eligible to be used as
#'   dichotomous data, there are `NA` or `NaN` values in `data` and both
#'   `allow_nas` and `warning_nas` are set to `TRUE`.
#' * An informative error message is thrown and the execution is stopped if:
#'   * `data` does not contain valid categorical data that is eligible to be
#'   used as dichotomous data.
#'   * `data` contains valid categorical data that is eligible to be used as
#'   dichotomous data, there are some `NA` or `NaN` values in `data` and
#'   `allow_nas` is set to `FALSE`.
#'
#' @seealso
#' * \code{\link[inspector]{inspect_data_categorical}} to validate categorical
#' data.
#' * \code{\link[inspector]{inspect_par_multinomial}} to validate vectors of
#' Multinomial proportions.
#' * \code{\link[inspector]{inspect_data_dichotomous}} to validate dichotomous
#' data.
#' * \code{\link[inspector]{inspect_par_bernoulli}} to validate
#' Bernoulli/Binomial proportions.
#'
#' @examples
#' # Calls that pass silently:
#' x1 <- c(1, 0, 0, 1, 0)
#' x2 <- c(FALSE, FALSE, TRUE)
#' x3 <- c("yes", "no", "yes")
#' x4 <- factor(c("yes", "no", "yes"))
#' x5 <- c(1, 0, 0, 1, 0, NA)
#' inspect_data_cat_as_dichotom(x1, success = 1)
#' inspect_data_cat_as_dichotom(x2, success = TRUE)
#' inspect_data_cat_as_dichotom(x3, success = "yes")
#' inspect_data_cat_as_dichotom(x4, success = "yes")
#' inspect_data_cat_as_dichotom(x5, success = 1)
#'
#' # Calls that throw an informative warning message:
#' y1 <- c(1, 1, NA, 0, 0)
#' y2 <- c(0, 0)
#' success <- 1
#' try(inspect_data_cat_as_dichotom(y1, success = 1, warning_nas = TRUE))
#' try(inspect_data_cat_as_dichotom(y2, success = success))
#'
#' # Calls that throw an informative error message:
#' try(inspect_data_cat_as_dichotom(y1, 1, allow_nas = FALSE))
#' try(inspect_data_cat_as_dichotom(NULL, 1))
#' try(inspect_data_cat_as_dichotom(c(1, 0), NULL))
#' try(inspect_data_cat_as_dichotom(list(1, 0), 1))
#' try(inspect_data_cat_as_dichotom(c(1, 0), list(1)))
#' try(inspect_data_cat_as_dichotom(numeric(0), 0))
#' try(inspect_data_cat_as_dichotom(1, numeric(0)))
#' try(inspect_data_cat_as_dichotom(NaN, 1))
#' try(inspect_data_cat_as_dichotom(NA, 1))
#' try(inspect_data_cat_as_dichotom(c(1, 0), NA))
#' try(inspect_data_cat_as_dichotom(c(1, 0), NaN))
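#'
#' # Call that throws a warning rather than an error, since `2` is not observed
#' # in `data` and the number of levels is not checked:
#' try(inspect_data_cat_as_dichotom(c(1, 0), 2))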
#' @export

inspect_data_cat_as_dichotom <-
  function(data,
           success,
           allow_nas = TRUE,
           warning_nas = FALSE) {
    inspect_true_or_false(allow_nas)
    inspect_true_or_false(warning_nas)

    data_output_name <- deparse(substitute(data))
    s_output_name <- deparse(substitute(success))

    if (is.null(data)) {
      stop(paste("Invalid argument:", data_output_name, "is NULL."))
    }
    if (is.null(success)) {
      stop(paste("Invalid argument:", s_output_name, "is NULL."))
    }
    if (isFALSE(is.atomic(data))) {
      stop(paste("Invalid argument:", data_output_name, "must be atomic."))
    }
    if (length(data) == 0) {
      stop(paste("Invalid argument:", data_output_name, "is empty."))
    }
    if (any(isFALSE(is.atomic(success)), isFALSE(length(success) == 1))) {
      stop(paste(
        "Invalid argument:",
        s_output_name,
        "must be atomic and have length 1."
      ))
    }
    if (isFALSE(typeof(data) %in% c("logical",
                                    "integer",
                                    "double",
                                    "character"))) {
      stop(
        paste(
          "Invalid argument: the type of",
          data_output_name,
          "must be 'logical', 'integer', 'double' or 'character'."
        )
      )
    }
    if (isFALSE(typeof(success) %in% c("logical",
                                       "integer",
                                       "double",
                                       "character"))) {
      stop(
        paste(
          "Invalid argument: the type of",
          s_output_name,
          "must be 'logical', 'integer', 'double' or 'character'."
        )
      )
    }
    if (all(is.na(data))) {
      stop(paste(
        "Invalid argument: all elements of",
        data_output_name,
        "are NA or NaN."
      ))
    }
    if (is.na(success)) {
      stop(paste("Invalid argument:", s_output_name, "is NA or NaN."))
    }
    if (any(is.na(data))) {
      if (isFALSE(allow_nas)) {
        stop(paste(
          "Invalid argument: there are NA or NaN values in",
          paste0(data_output_name, ".")
        ))
      } else {
        if (isTRUE(warning_nas)) {
          warning(paste(
            "There are NA or NaN values in",
            paste0(data_output_name, ".")
          ))
        }
      }
    }
    if (isFALSE(success %in% unique(data))) {
      warning(paste(
        s_output_name,
        "not observed in",
        paste0(data_output_name, ".")
      ))
    }
  }
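
# Illustrative sketch, not part of the package: the validators above label
# their messages with the expression the caller supplied, captured through
# `deparse(substitute(...))`. The helper name `describe_null_argument` is
# hypothetical; it returns the message as a string instead of calling `stop()`
# so that the wording can be inspected.
describe_null_argument <- function(x) {
  # Capture the caller's expression as text before `x` is evaluated.
  x_name <- deparse(substitute(x))
  if (is.null(x)) {
    return(paste("Invalid argument:", x_name, "is NULL."))
  }
  paste(x_name, "is not NULL.")
}
# Usage, under these assumptions:
# my_data <- NULL
# describe_null_argument(my_data)  # the message mentions `my_data` by name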


inspector documentation built on June 18, 2021, 1:06 a.m.