compare_groups: Compare groups

View source: R/compare_groups.R

compare_groups    R Documentation

Compare groups


Compares groups by (1) creating a histogram by group; (2) summarizing descriptive statistics by group; and (3) conducting pairwise comparisons (t-tests and Mann-Whitney tests).


compare_groups(
  data = NULL,
  iv_name = NULL,
  dv_name = NULL,
  sigfigs = 3,
  stats = "basic",
  cohen_d = TRUE,
  cohen_d_w_ci = TRUE,
  adjust_p = "holm",
  bonferroni = NULL,
  mann_whitney = TRUE,
  t_test_stats = TRUE,
  t_test_df_decimals = 1,
  round_p = 3,
  save_as_png = FALSE,
  png_name = NULL,
  xlab = NULL,
  ylab = NULL,
  x_limits = NULL,
  x_breaks = NULL,
  x_labels = NULL,
  width = 5000,
  height = 3600,
  units = "px",
  res = 300,
  layout_matrix = NULL,
  col_names_nicer = TRUE,
  convert_dv_to_numeric = TRUE
)



data: a data object (a data frame or a data.table)


iv_name: name of the independent variable (grouping variable)


dv_name: name of the dependent variable (measure variable of interest)


sigfigs: number of significant digits to round to
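
Significant digits differ from decimal places. The following is a minimal base R sketch of that difference; it assumes nothing about how compare_groups rounds internally, and the example values are arbitrary.

```r
# Base R illustration only; not taken from the kim package internals.
x <- c(5.843333, 0.8280661, 123.456)

sig <- signif(x, digits = 3)  # keep 3 significant digits
dec <- round(x, digits = 3)   # keep 3 decimal places

print(sig)  # 5.840   0.828 123.000
print(dec)  # 5.843   0.828 123.456
```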


stats: statistics to calculate for each group. If stats = "basic", group size, mean, standard deviation, median, minimum, and maximum will be calculated. If stats = "all", in addition to the aforementioned statistics, standard error, 95% confidence and prediction intervals, skewness, and kurtosis will also be calculated. The stats argument can also be a character vector with types of statistics to calculate. For example, entering stats = c("mean", "median") will calculate mean and median. By default, stats = "basic".
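
To make the "basic" set concrete, the statistics listed above can be approximated in base R as sketched below. The helper basic_stats is hypothetical and is not part of the kim package; the actual output of compare_groups is a data.table and may differ in layout and column names.

```r
# Hypothetical helper (an assumption for illustration, not a kim function):
# computes the six statistics that stats = "basic" is described as reporting.
basic_stats <- function(x) {
  c(n = length(x), mean = mean(x), sd = sd(x),
    median = median(x), min = min(x), max = max(x))
}

# One row per group, one column per statistic
desc <- t(sapply(split(iris$Sepal.Length, iris$Species), basic_stats))
print(signif(desc, 3))
```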


cohen_d: if cohen_d = TRUE, Cohen's d statistics will be included in the pairwise comparison data.table.


cohen_d_w_ci: if cohen_d_w_ci = TRUE, Cohen's d with 95% CI will be included in the output data.table.
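
For readers unfamiliar with Cohen's d, one common definition divides the difference in group means by the pooled standard deviation. The sketch below uses that definition; whether compare_groups uses exactly this variant is an assumption, and cohen_d_pooled is a hypothetical helper, not a kim function.

```r
# Hypothetical helper: Cohen's d with a pooled standard deviation.
# Assumption: this is one standard variant, not necessarily kim's.
cohen_d_pooled <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  pooled_sd <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / pooled_sd
}

setosa <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

d <- cohen_d_pooled(versicolor, setosa)
print(d)
```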


adjust_p: the name of the method to use to adjust p-values. If adjust_p = "holm", the Holm method will be used; if adjust_p = "bonferroni", the Bonferroni method will be used. By default, adjust_p = "holm".
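
The two adjustment methods can be compared directly with base R's p.adjust. This is a sketch with made-up p-values, and it does not claim that compare_groups calls p.adjust internally.

```r
# Made-up raw p-values from three hypothetical pairwise comparisons
p_raw <- c(0.01, 0.02, 0.04)

p_holm <- p.adjust(p_raw, method = "holm")
p_bonf <- p.adjust(p_raw, method = "bonferroni")

print(p_holm)  # 0.03 0.04 0.04
print(p_bonf)  # 0.03 0.06 0.12
```

Holm-adjusted p-values are never larger than the Bonferroni-adjusted ones, which is why Holm is the less conservative of the two.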


bonferroni: This argument is deprecated; use the adjust_p argument instead. If bonferroni = TRUE, the Bonferroni correction will be applied to the t-tests or Mann-Whitney tests.


mann_whitney: if TRUE, Mann-Whitney test results will be included in the pairwise comparison data.table. If FALSE, Mann-Whitney tests will not be performed.
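
In base R, the Mann-Whitney test for two independent groups is available as wilcox.test. The sketch below runs it on one pair of iris groups; it illustrates the test itself and is not the code path used by compare_groups.

```r
setosa <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

# exact = FALSE requests the normal approximation, which avoids the
# warning that exact p-values cannot be computed when the data have ties.
mw <- wilcox.test(setosa, versicolor, exact = FALSE)

print(mw$statistic)  # the W statistic
print(mw$p.value)
```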


t_test_stats: if t_test_stats = FALSE, the t-test statistic and degrees of freedom will be excluded from the pairwise comparison data.table.


t_test_df_decimals: number of decimals for the degrees of freedom in t-tests (default = 1)
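
Degrees of freedom need decimals when the t-test is a Welch test, because the Welch approximation generally yields a non-integer value. The sketch below assumes a Welch test, which is the default in base R's t.test; whether compare_groups uses the same test is an assumption here.

```r
setosa <- iris$Sepal.Length[iris$Species == "setosa"]
versicolor <- iris$Sepal.Length[iris$Species == "versicolor"]

# t.test defaults to var.equal = FALSE, i.e., a Welch test
tt <- t.test(setosa, versicolor)

df_raw <- unname(tt$parameter)   # non-integer Welch degrees of freedom
df_rounded <- round(df_raw, 1)   # analogous to t_test_df_decimals = 1

print(df_raw)
print(df_rounded)
```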


round_p: number of decimal places to which to round p-values (default = 3)


save_as_png: if save_as_png = "all" or if save_as_png = TRUE, the histogram by group, descriptive statistics by group, and pairwise comparison results will be saved as a PNG file. By default, save_as_png = FALSE.


png_name: name of the PNG file to be saved. By default, the name will be "compare_groups_results_" followed by a timestamp of the current time. The timestamp will be in the format jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds past the minute.
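
A name in the format described above could be built in base R roughly as follows. The helper make_stamp is hypothetical and only mimics the documented format; it is not the function the kim package uses, and whether a file extension is appended is not covered here.

```r
# Hypothetical helper that mimics the documented timestamp format
# (e.g., jan_01_2021_1300_10_000001). Not a kim function.
make_stamp <- function(time = Sys.time()) {
  # month.abb is always English, so the result does not depend on locale
  month <- tolower(month.abb[as.integer(format(time, "%m"))])
  day_year <- format(time, "%d_%Y")
  hour_min <- format(time, "%H%M")
  # %OS6 prints seconds with six decimals; swap the decimal point for "_"
  secs <- sub(".", "_", format(time, "%OS6"), fixed = TRUE)
  paste(month, day_year, hour_min, secs, sep = "_")
}

png_name_sketch <- paste0("compare_groups_results_", make_stamp())
print(png_name_sketch)
```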


xlab: title of the x-axis for the histogram by group. If xlab = FALSE, the title will be removed. By default (i.e., if no input is given), dv_name will be used as the title.


ylab: title of the y-axis for the histogram by group. If ylab = FALSE, the title will be removed. By default (i.e., if no input is given), iv_name will be used as the title.


x_limits: a numeric vector with values of the endpoints of the x axis.


x_breaks: a numeric vector indicating the points at which to place tick marks on the x axis.


x_labels: a vector containing labels for the tick marks on the x axis.


width: width of the PNG file (default = 5000)


height: height of the PNG file (default = 3600)


units: the units for the width and height arguments. Can be "px" (pixels), "in" (inches), "cm", or "mm". By default, units = "px".


res: the nominal resolution in ppi which will be recorded in the PNG file, if a positive integer. Used for units other than the default. If not specified, taken as 300 ppi to set the size of text and line widths.
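
Pixel dimensions and resolution together determine the nominal physical size of the image: inches = pixels / ppi. The short calculation below applies that relationship to the default values shown in the usage; it is plain arithmetic, not code from the kim package.

```r
# Defaults from the usage: width = 5000, height = 3600, res = 300
width_px <- 5000
height_px <- 3600
res_ppi <- 300

width_in <- width_px / res_ppi    # about 16.7 inches
height_in <- height_px / res_ppi  # 12 inches

print(c(width_in = width_in, height_in = height_in))
```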


layout_matrix: the layout argument for arranging plots and tables using the grid.arrange function.


col_names_nicer: if col_names_nicer = TRUE, column names will be converted from snake_case to an easier-to-read format.


convert_dv_to_numeric: logical. Should the values in the dependent variable be converted to numeric for plotting the histograms? (default = TRUE)


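holm: if holm = TRUE, the relevant p-values will be adjusted using the Holm method (also known as the Holm-Bonferroni or Bonferroni-Holm method). This argument does not appear in the usage above; the Holm method is selected with adjust_p = "holm".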


the output will be a list of (1) a ggplot object (histogram by group); (2) a data.table with descriptive statistics by group; and (3) a data.table with pairwise comparison results. If save_as_png = TRUE, the plot and tables will also be saved to the local drive as a PNG file.


## Not run: 
compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length",
               x_breaks = 4:8)

## End(Not run)

kim documentation built on May 29, 2024, 5:14 a.m.