R/raincloud.R
In vecmatch: Generalized Propensity Score Estimation and Matching for Multiple Groups

Documented in raincloud

#' @title Examine the imbalance of continuous covariates
#'
#' @description The `raincloud()` function allows to generate distribution plots
#'   for continuous data in an easy and uncomplicated way. The function is based
#'   on the `ggplot2` package, which must already be preinstalled Raincloud
#'   plots consist of three main elements:
#'
#' * Distribution plots, specifically  violin plots with the mean values and
#'   standard deviations of respective groups,
#' * Jittered point plots depicting the underlying distribution of the data in
#'   the rawest form,
#' * Boxplots, summarizing the most important statistics of the underlying
#'   distribution.
#'
#' @param data A non-empty `data.frame` containing at least one numeric column,
#'   as specified by the `y` argument. This argument must be provided and does
#'   not have a default value.
#' @param y A single string or unquoted symbol representing the name of a
#'   numeric column in the `data`. In the vector matching workflow, it is
#'   typically a numeric covariate that requires balancing.
#' @param group A single string or unquoted symbol representing the name of a
#'   factor or character column in `data`. In `raincloud()` plots, the groups
#'   specified by `group` argument will be distinguished by separate `fill` and
#'   `color` aesthetics. For clarity, it is recommended to plot fewer than 10
#'   groups, though there is no formal limit.
#' @param facet A single string or unquoted symbol representing the name of a
#'   variable in `data` to facet by. This argument is used in a call to
#'   [ggplot2::facet_wrap()], creating separate distribution plots for each
#'   unique group in the `facet` variable.
#' @param ncol A single integer. The value should be less than or equal to the
#'   number of unique categories in the `facet` variable. This argument is used
#'   only when `facet` is not NULL, specifying the number of columns in the
#'   [ggplot2::facet_wrap()] call. The distribution plots will be arranged into
#'   the number of columns defined by `ncol`.
#' @param significance A single string specifying the method for calculating
#'   p-values in multiple comparisons between groups defined by the `group`
#'   argument. Significant comparisons are represented by bars connecting the
#'   compared groups on the left side of the boxplots. Note that if there are
#'   many significant tests, the plot size may adjust accordingly. For available
#'   methods refer to the *Details* section. If the `significance` argument is
#'   not `NULL`, standardized mean differences (SMDs) are also calculated and
#'   displayed on the right side of the jittered point plots.
#' @param sig_label_size An integer specifying the size of the significance and
#'   SMD (standardized mean difference) labels displayed on the bars on the
#'   right side of the plot.
#' @param sig_label_color Logical flag. If `FALSE` (default), significance and
#'   SMD bars and text are displayed in the default color (black). If `TRUE`,
#'   colors are applied dynamically based on value: nonsignificant tests and SMD
#'   values below 0.10 are displayed in green, while significant tests and SMD
#'   values of 0.10 or higher are displayed in red.
#' @param smd_type A single string indicating the type of effect size to
#'   calculate and display on the left side of the jittered point plots:
#' * `mean` - Cohen's d is calculated,
#' * `median` - the Wilcoxon effect size (r) is calculated based on the Z
#'   statistic extracted from the Wilcoxon test.
#' @param limits A numeric atomic vector of length two, specifying the `y` axis
#'   limits in the distribution plots. The first element sets the minimum value,
#'   and the second sets the maximum. This vector is passed to the
#'   [ggplot2::xlim()] function to adjust the axis scale.
#' @param jitter A single numeric value between 0 and 1 that controls the amount
#'   of jitter applied to points in the [ggplot2::geom_jitter()] plots. Higher
#'   values of the `jitter` argument produce more jittered plot. It's
#'   recommended to keep this value low, as higher jitter can make the plot
#'   difficult to interpret.
#' @param alpha A single numeric value between 0 and 1 that controls the
#'   transparency of the density plots, boxplots, and jittered point plots.
#'   Lower values result in higher transparency. It is recommended to keep this
#'   value relatively high to maintain the interpretability of the plots when
#'   using the `group` argument, as excessive transparency may cause overlap
#'   between groups, making it difficult to distinguish them visually.
#' @param plot_name A string specifying a valid file name or path for the plot.
#'   If set to `NULL`, the plot is displayed to the current graphical device but
#'   not saved locally. If a valid name with `.png` or `.pdf` extension is
#'   provided, the plot is saved locally. Users can also include a subdirectory
#'   in `plot_name`. Ensure the file path follows the correct syntax for your
#'   operating system.
#' @param overwrite A logical flag (default `FALSE`) that is evaluated only if
#'   the `save.name` argument is provided. If `TRUE`, the function checks
#'   whether a plot with the same name already exists. If it does, the existing
#'   plot will be overwritten. If `FALSE` and a plot with the same name exists,
#'   an error is thrown. If no such plot exists, the plot is saved normally.
#' @param ... Additional arguments passed to the function for calculating
#'   p-values when the `significance` argument is specified. For available
#'   functions associated with different `significance` methods, please refer to
#'   the *Details* section and consult the documentation for the relevant
#'   functions in the `rstatix` package.
#'
#' @details  Available methods for the argument `significance` are:
#'* `"t_test"` - Performs a pairwise comparison using the two-sample t-test,
#'  with the default Holm adjustment for multiple comparisons. This test assumes
#'  normally distributed data and equal variances. The adjustment can be
#'  modified via the `p.adjust.method` argument. The test is implemented via
#'  [rstatix::pairwise_t_test()]
#'* `"dunn_test"` - Executes Dunn's test for pairwise comparisons following a
#'  Kruskal-Wallis test. It is a non-parametric alternative to the t-test when
#'  assumptions of normality or homogeneity of variances are violated.
#'  Implemented via [rstatix::dunn_test()].
#'* `"tukeyHSD_test"` - Uses Tukey's Honest Significant Difference (HSD) test
#'  for pairwise comparisons between group means. Suitable for comparing all
#'  pairs when the overall ANOVA is significant. The method assumes equal
#'  variance between groups and is implemented via [rstatix::tukey_hsd()].
#'* `"games_howell_test"` - A post-hoc test used after ANOVA, which does not
#'  assume equal variances or equal sample sizes. It’s particularly robust for
#'  data that violate homogeneity of variance assumptions. Implemented via
#'  [rstatix::games_howell_test()].
#'* `"wilcoxon_test"` - Performs the Wilcoxon rank-sum test (also known as the
#'  Mann-Whitney U test) for non-parametric pairwise comparisons. Useful when
#'  data are not normally distributed. Implemented via
#'  [rstatix::pairwise_wilcox_test()].
#'
#' @return A `ggplot` object representing the distribution of the `y` variable
#'   across the levels of the `group` and `facet` variables in `data`.
#'
#' @seealso [mosaic()] which summarizes the distribution of discrete data
#'
#' @examples
#' ## Example: Creating a raincloud plot for the ToothGrowth dataset.
#' ## This plot visualizes the distribution of the `len` variable by
#' ## `dose` (using different colors) and facets by `supp`. Group
#' ## differences by `dose` are calculated using a `t_test`, and standardized
#' ## mean differences (SMDs) are displayed through jittered points.
#' library(ggplot2)
#' library(ggpubr)
#'
#' p <- raincloud(ToothGrowth, len, dose, supp,
#'   significance = "t_test",
#'   jitter = 0.15, alpha = 0.4
#' )
#'
#' ## As `p` is a valid `ggplot` object, we can manipulate its
#' ## characteristics usingthe `ggplot2` or `ggpubr` packages
#' ## to create publication grade plot:
#' p <- p +
#'   theme_classic2() +
#'   theme(
#'     axis.line.y = element_blank(),
#'     axis.ticks.y = element_blank()
#'   ) +
#'   guides(fill = guide_legend("Dose [mg]")) +
#'   ylab("Length [cm]")
#'
#' p
#'
#' @export

raincloud <- function(data = NULL,
                      y = NULL,
                      group = NULL,
                      facet = NULL,
                      ncol = 1,
                      significance = NULL,
                      sig_label_size = 2L,
                      sig_label_color = FALSE,
                      smd_type = "mean",
                      limits = NULL,
                      jitter = 0.1,
                      alpha = 0.4,
                      plot_name = NULL,
                      overwrite = FALSE,
                      ...) {
  ############################ INPUT CHECKING###################################

  args_signif <- list(...)

  #--check data frame-----------------------------------------------------------
  if ("matched" %in% class(data)) {
    class(data) <- "data.frame"
  }

  ## must be an object of class data frame
  .check_df(data)

  ## with at least one numeric column
  .chk_cond(
    length(data) == 1 && !is.numeric(data[, 1]),
    "The provided data is not numeric"
  )

  #--check y, group and facet---------------------------------------------------
  ## check if the provided names are valid names + convert to characters
  symlist <- list(
    y = substitute(y),
    group = substitute(group),
    facet = substitute(facet)
  )
  symlist <- .conv_nam(symlist)

  ## check if y exists
  .chk_cond(
    is.null(symlist[[1]]),
    "The argument `y` is missing with no default!"
  )

  ## check if the names are in the provided data.frame
  nonames <- .check_name(data, symlist)
  .chk_cond(
    length(nonames) != 0,
    sprintf("The following names are not present in the
                    provided data frame: %s", word_list(add_quotes(nonames)))
  )

  ## check if limits is a numeric vector of length 2
  .chk_cond(
    !is.null(limits) && !.check_vecl(limits, leng = 2),
    "The `limits` argument should be a numeric vector of
            length 2: c(min, max)"
  )

  ## check range for jitter
  chk::chk_range(jitter, range = c(0, 1))

  ## check range for alpha
  chk::chk_range(alpha, range = c(0, 1))

  ## check logicals
  chk::chk_all(c(overwrite, sig_label_color), chk::chk_flag)

  ## check character and valid name for plot_name
  if (!is.null(plot_name)) {
    chk::chk_character(plot_name)
    .check_extension(plot_name,
      x_name = "plot_name",
      ext_vec = c(".png", ".PNG", ".pdf", ".PDF")
    )
  }

  # check if sig_label_size is integer
  if (!is.null(sig_label_size)) {
    suppressWarnings(sig_label_size <- try(as.integer(sig_label_size)))

    .chk_cond(
      is.na(sig_label_size) || inherits(sig_label_size, "try-error"),
      "`sig_label_size` must be an integer."
    )

    chk::chk_integer(sig_label_size, x_name = "sig_label_size")
  }

  # check smd_type
  chk::chk_character(smd_type)
  chk::chk_length(smd_type, length = 1)
  .chk_cond(
    smd_type %nin% c("mean", "median"),
    'The `smd_type` argument can only take one of the
            following values: "mean", "median".'
  )

  ####################### DATA PROCESSING ######################################
  # assure y is numeric and convert facet, group to factors
  mapply(.conv_data,
    type = list("numeric", "factor", "factor"),
    varname = symlist,
    MoreArgs = list(
      data = data,
      env = environment() ## allows replacing the data without reassignment
    )
  )

  ## use only complete.cases of variables in the function call
  which_use <- unlist(symlist[!vapply(symlist, is.null, logical(1L))])
  complete_sub <- stats::complete.cases(data[, colnames(data) %in% which_use])
  data <- as.data.frame(subset(data, complete_sub))

  # defining levels of facet for the test
  facet_levels <- length(unique(data[, symlist[["facet"]]]))
  facet_levels <- max(facet_levels, 1) # if facet_levels = 0

  # check and process the significance argument
  use_signif <- FALSE
  if (!is.null(significance)) {
    rlang::check_installed(c("rstatix", "ggpubr"))
    use_signif <- TRUE

    # check if more than 2 groups in the data
    .chk_cond(
      length(unique(data[, symlist[["group"]]])) <= 1,
      "It is impossible to compute statistical significance
              tests for only one group. Check your `group` argument."
    )

    # check if significance is a valid implemented method
    .chk_cond(
      !chk::vld_string(significance) || significance %nin%
        names(.sig_methods),
      sprintf(
        "The argument `significance` must be a single string
                      specifying one of the available methods: %s",
        word_list(add_quotes(names(.sig_methods)))
      )
    )

    # build formula for further calculations
    args_signif[["formula"]] <- stats::as.formula(paste0(
      symlist[["y"]],
      " ~ ",
      symlist[["group"]]
    ))

    # perform the tests for rstatix
    # define rstatix funciton used
    func_used <- switch(significance,
      "t_test" = rstatix::pairwise_t_test,
      "dunn_test" = rstatix::dunn_test,
      "tukeyHSD_test" = rstatix::tukey_hsd,
      "games_howell_test" = rstatix::games_howell_test,
      "wilcoxon_test" = rstatix::pairwise_wilcox_test,
      "sign_test" = rstatix::pairwise_sign_test
    )

    # matching args from ...
    args_signif <- match_add_args(
      arglist = args_signif,
      .sig_methods[[significance]]$args_check_fun
    )

    suppressWarnings(args_signif <- Filter(
      function(x) !all(is.na(x)),
      args_signif
    ))

    # correcting pool.sd to logical if default value for t_test
    if (!is.logical(args_signif[["pool.sd"]]) && significance == "t_test") {
      args_signif[["pool.sd"]] <- !args_signif[["paired"]]
    }

    # predefining output list
    test_results <- list()

    # fitting
    for (i in 1:facet_levels) {
      # Subsetting the data
      args_signif[["data"]] <- if (facet_levels == 1) {
        data
      } else {
        subset_cond <- data[, symlist[["facet"]]] ==
          levels(data[, symlist[["facet"]]])[i]
        data[subset_cond, ]
      }

      # modify name of data argument for tukeyHSD
      names(args_signif)[names(args_signif) == "data"] <- {
        ifelse(significance == "tukeyHSD_test", "x", "data")
      }

      ## call the rstatix func to calculate significance levels
      tryCatch(
        {
          test_results[[i]] <- do.call(
            func_used,
            args_signif
          )
        },
        error = function(e) {
          chk::abort_chk(
            strwrap(
              sprintf(
                "There was a problem in estimating the significance levels
            using %s method. It is probably a bug - consider reporting to
            the maintainer. \n
            Error message from `%s`: %s",
                significance,
                as.character(func_used),
                conditionMessage(e)
              ),
              prefix = " ", initial = ""
            ),
            tidy = FALSE
          )
        }
      )

      # calculating original add_xy_position to locate the pvalues
      test_results[[i]] <- rstatix::add_xy_position(test_results[[i]],
        fun = "max",
        stack = FALSE,
        x = symlist[["group"]],
        scales = "fixed"
      )

      # adding effsize
      names(args_signif)[names(args_signif) == "x"] <- "data"

      # dynamically choose the function based on smd_type
      effsize_function <- switch(smd_type,
        "mean" = rstatix::cohens_d,
        "median" = rstatix::wilcox_effsize
      )

      # calculate SMD using the selected function
      smd <- effsize_function(
        args_signif[["data"]],
        args_signif[["formula"]]
      )

      smd <- smd[, c("group1", "group2", "effsize")]

      colnames(smd)[colnames(smd) == "effsize"] <- "smd"

      smd$smd <- abs(round(smd$smd, 2))

      # binding the results together
      test_results[[i]] <- merge(test_results[[i]], smd,
        by = c("group1", "group2"),
        all = TRUE
      )

      # adding the facet var
      if (facet_levels > 1) {
        test_results[[i]] <- cbind(
          facet = rep(
            levels(data[, symlist[["facet"]]])[i],
            dim(test_results[[i]])[1]
          ),
          test_results[[i]]
        )
      }
    }

    # process the resulting list to df
    if (facet_levels == 1) {
      test_results <- as.data.frame(test_results)
    } else {
      test_results <- do.call(rbind, test_results)
    }

    # add color to the plot
    if (sig_label_color) {
      test_results <- cbind(test_results,
        color.pval = as.character(ifelse(test_results[, "p.adj"] < 0.05,
          "#de2d26",
          "#31a354"
        )),
        color.smd = as.character(ifelse(test_results[, "smd"] >= 0.1,
          "#de2d26",
          "#31a354"
        ))
      )
    } else {
      test_results <- cbind(test_results,
        color.pval = "#000000",
        color.smd = "#000000"
      )
    }
  }

  ####################### PLOTTING #############################################
  # define the replace function
  "%+replace%" <- ggplot2::"%+replace%"

  ## Unique values in grouping variables (necessary to define the palette)
  pal_len <- length(unique(data[, symlist[["group"]]]))
  pal_len <- max(pal_len, 1)

  ## redefining 'group' labels to include sample sizes--------------------------
  if (!is.null(symlist[["group"]])) {
    if (is.null(symlist[["facet"]])) {
      freq_table <- as.data.frame(table(data[, symlist[["group"]]]))

      for (i in seq_len(nrow(freq_table))) {
        message <- sprintf(
          "%s (n = %s)",
          freq_table[i, 1],
          freq_table[i, 2]
        )

        levels(data[, symlist[["group"]]])[
          levels(data[, symlist[["group"]]]) == freq_table[i, 1]
        ] <- message
      }
    } else {
      # define the counts
      freq_table <- as.data.frame(table(
        data[, symlist[["facet"]]],
        data[, symlist[["group"]]]
      ))

      tryCatch(
        {
          # pivoting that data.frame to wide based on sex
          freq_table <- stats::reshape(freq_table,
            timevar = "Var1",
            idvar = c("Var2"),
            direction = "wide"
          )
        },
        error = function(e) {
          chk::abort_chk(strwrap(
            "Counts could not be calculated from the provided data frame.
            Please ensure that your data frame contains observations.",
            prefix = " ", initial = ""
          ))
        }
      )


      for (i in seq_len(nrow(freq_table))) {
        message <- sprintf(
          "%s (n = %s, %s)",
          freq_table[i, 1],
          freq_table[i, 2],
          freq_table[i, 3]
        )

        levels(data[, symlist[["group"]]])[
          levels(data[, symlist[["group"]]]) == freq_table[i, 1]
        ] <- message
      }
    }
  } else {
    n_len <- length(data[, symlist[["y"]]])
  }

  #--defining height of jitter plots--------------------------------------------
  rain_height <- 0.1

  ## --defining the main ggplot formula-----------------------------------------
  colnames(data)[which(colnames(data) == symlist["facet"])] <- "facet"
  main <- ggplot2::ggplot(data, ggplot2::aes(
    x = "",
    y = data[, symlist[["y"]]]
  ))

  # defining theme for the subplots
  custom_theme <- ggplot2::theme_classic() %+replace%
    ggplot2::theme(
      axis.line.y = ggplot2::element_blank(),
      axis.text.y = ggplot2::element_blank(),
      axis.title.y = ggplot2::element_blank(),
      legend.title = ggplot2::element_text(face = "bold")
    )

  ## --defining the geom_jitter
  #### CASE 1 - only y, no group, no facet======================================
  main_geom_layers <- if (pal_len == 1 || is.null(symlist[["group"]])) {
    main +
      ## --defining the geom_boxplot
      ggplot2::geom_boxplot(
        width = 0.1, alpha = alpha, show.legend = FALSE, fill = "#005b99",
        position = ggplot2::position_nudge(x = -0.22)
      ) +
      ## --defining the datapoints
      ggplot2::geom_jitter(ggplot2::aes(color = "#005b99"),
        size = 2, alpha = alpha, show.legend = TRUE,
        position = ggplot2::position_jitter(width = jitter)
      ) +
      ## halfs of the violin plots
      geom_flat_violin(
        fill = "#005b99",
        trim = FALSE, alpha = alpha,
        position = ggplot2::position_nudge(x = rain_height + 0.05)
      ) +
      ## --defining the stat_summary
      ggplot2::stat_summary(
        fun.data = mean_ci, show.legend = FALSE,
        position = ggplot2::position_nudge(x = rain_height * 3)
      ) +
      ggplot2::scale_color_identity(
        guide = "legend",
        name = symlist[["y"]],
        labels = sprintf("%s (n = %s)", symlist[["y"]], n_len)
      )
  } else {
    #### CASE 2 - y + group, no facet===========================================
    main +
      ## --defining the geom_boxplot
      ggplot2::geom_boxplot(ggplot2::aes(fill = data[, symlist[["group"]]]),
        width = 0.1, alpha = alpha, show.legend = FALSE,
        position = ggpp::position_dodgenudge(width = 0.2, x = -0.22)
      ) +
      ## --defining the datapoints
      ggplot2::geom_jitter(ggplot2::aes(color = data[, symlist[["group"]]]),
        size = 2, alpha = alpha, show.legend = FALSE,
        position = ggplot2::position_jitterdodge(
          jitter.width = jitter,
          dodge.width = 0.25
        )
      ) +
      ## halfs of the violin plots
      geom_flat_violin(ggplot2::aes(fill = data[, symlist[["group"]]]),
        trim = FALSE, alpha = alpha,
        position = ggplot2::position_nudge(x = rain_height + 0.05)
      ) +
      ## --defining the stat_summary
      ggplot2::stat_summary(ggplot2::aes(color = data[, symlist[["group"]]]),
        fun.data = mean_ci, show.legend = FALSE,
        position = ggpp::position_dodgenudge(x = rain_height * 3, width = 0.1)
      ) +
      ## define the fill lab
      ggplot2::guides(fill = ggplot2::guide_legend(symlist[["group"]])) +

      scale_color_vecmatch(n = pal_len, type = "discrete") +
      scale_fill_vecmatch(n = pal_len, type = "discrete")
  }

  #--defining the ggplot object-------------------------------------------------
  p <- main_geom_layers +
    ## defining scales
    ggplot2::scale_x_discrete(
      name = "",
      expand = c(rain_height * 3.5, 0, 0, 0.62)
    ) +
    ## flipping coordinates
    ggplot2::coord_flip() +
    ## defining theme
    ggplot2::theme_classic() %+replace%
    ggplot2::theme(
      axis.ticks.y = ggplot2::element_blank(),
      axis.line.y = ggplot2::element_blank(),
      legend.title = ggplot2::element_text(face = "bold")
    ) +
    ## define ylabs
    ggplot2::ylab(symlist[["y"]])

  #### CASE 3 - use signif TRUE=================================================
  if (use_signif) {
    ## defining the lims and calculating the significance bars if necessary
    # test run to define the limits of the plot
    violin_test <- ggplot2::ggplot(data, ggplot2::aes(
      x = "",
      y = data[, symlist[["y"]]]
    )) +
      geom_flat_violin(
        trim = FALSE, position = ggplot2::position_nudge(x = -0.5)
      )

    # getting the xlim out of the violin plot
    x_limits <- ggplot2::ggplot_build(violin_test)
    x_limits <- as.vector(x_limits$layout$panel_scales_y[[1]]$range$range)
    range <- abs(x_limits[1] - x_limits[2])
    x_max <- max(data[, symlist[["y"]]]) + range * 0.03
    x_limits[2] <- x_limits[2] + range * 0.06

    # overwrite x_limits if limits defined
    x_limits <- limits %||% x_limits

    # getting the nudged xlim
    p_build <- ggplot2::ggplot_build(p)
    x_boxplot <- p_build$data[[1]][, c("group", "x")]
    x_jitter <- p_build$data[[2]][, c("group", "x")]

    # recalculating the y.positions
    # calculating approximate range, maximum y and y.positions for stat_pvalue
    y_pos_seq <- list()

    # looping along the levels of fcaet to calulacte yposition
    for (i in 1:facet_levels) {
      #### CASE 3.1 - use facet=================================================
      if (facet_levels > 1) {
        test_results_sub <- test_results[
          test_results$facet == unique(levels(data[, "facet"]))[i],
        ]
      } else {
        #### CASE 3.1 - no facet================================================
        test_results_sub <- test_results
      }

      number_comp <- dim(test_results_sub)[1]
      y_pos_seq[[i]] <- seq(
        from = x_max,
        to = x_limits[2],
        length.out = number_comp
      )
    }

    # overwriting y.position
    test_results$y.position <- unlist(y_pos_seq)

    # overwriting xmin and xmax
    test_results$xmin_box <- x_boxplot$x[match(
      test_results$xmin,
      x_boxplot$group
    )]
    test_results$xmax_box <- x_boxplot$x[match(
      test_results$xmax,
      x_boxplot$group
    )]
    test_results$xmin_jit <- x_jitter$x[match(
      test_results$xmin,
      x_jitter$group
    )]
    test_results$xmax_jit <- x_jitter$x[match(
      test_results$xmax,
      x_jitter$group
    )]

    ## adding stat_pvalue
    p <- p +
      ## adding custom pvalues
      ggpubr::stat_pvalue_manual(test_results,
        y.position = "y.position",
        xmin = "xmin_box",
        xmax = "xmax_box",
        label = "p.adj.signif",
        coord.flip = TRUE,
        tip.length = 0.01,
        size = sig_label_size,
        color = rep(test_results[, "color.pval"], each = 3)
      ) +
      ## adding smds
      ggpubr::stat_pvalue_manual(test_results,
        y.position = "y.position",
        xmin = "xmin_jit",
        xmax = "xmax_jit",
        label = "smd",
        coord.flip = TRUE,
        tip.length = 0.01,
        size = sig_label_size,
        color = rep(test_results[, "color.smd"], each = 3)
      )
  }

  ## add theme
  p <- p +
    custom_theme

  # define limits
  if (!is.null(limits)) {
    p <- p +
      ggplot2::ylim(limits)
  }

  #### CASE 4 - add facet=======================================================
  if (!is.null(symlist[["facet"]])) {
    ## custom labeller function to include the number of obs--------------------
    freq_table <- as.data.frame(table(data[, "facet"]))

    labeller_vec <- c()
    for (i in seq_len(nrow(freq_table))) {
      message <- sprintf(
        "%s (n = %s)",
        freq_table[i, 1],
        freq_table[i, 2]
      )

      ## defining named variable
      labeller_vec[i] <- message
      names(labeller_vec)[i] <- as.character(freq_table[i, 1])
    }

    p <- p +
      ggplot2::facet_wrap(. ~ facet,
        ncol = ncol,
        labeller = ggplot2::as_labeller(labeller_vec)
      )
  }

  ## Saving the plot
  if (!is.null(plot_name)) {
    fexist <- file.exists(plot_name)

    if (overwrite || (!fexist && !overwrite)) {
      suppressMessages(ggplot2::ggsave(plot_name,
        plot = p, dpi = 300, create.dir = TRUE
      ))
    } else if (fexist && !overwrite) {
      chk::abort_chk(strwrap(
        "The file specified in the plot_name argument already exists.
        Set overwrite = TRUE to replace the existing file.",
        prefix = " ", initial = ""
      ))
    }
  }

  ## Returning a ggplot object
  return(p)
}
Any scripts or data that you put into this service are public.
vecmatch documentation built on June 8, 2025, 9:36 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
vecmatch
Generalized Propensity Score Estimation and Matching for Multiple Groups

R/raincloud.R
In vecmatch: Generalized Propensity Score Estimation and Matching for Multiple Groups

Defines functions raincloud

Documented in raincloud

Try the vecmatch package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

vecmatch Generalized Propensity Score Estimation and Matching for Multiple Groups

R/raincloud.R In vecmatch: Generalized Propensity Score Estimation and Matching for Multiple Groups

Defines functions raincloud

Documented in raincloud

Try the vecmatch package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

vecmatch
Generalized Propensity Score Estimation and Matching for Multiple Groups

R/raincloud.R
In vecmatch: Generalized Propensity Score Estimation and Matching for Multiple Groups