paste_tbl_grp: Pasting Together Information for Two Groups

View source: R/pretty_output_functions.R

paste_tbl_grp    R Documentation

Pasting Together Information for Two Groups

Description

Paste together information, often statistics, from two groups. There are two predefined combinations, mean(sd) and median[min, max], but the user may also paste together any single measure.

Usage

paste_tbl_grp(
  data,
  vars_to_paste = "all",
  first_name = "Group1",
  second_name = "Group2",
  sep_val = " vs. ",
  na_str_out = "---",
  alternative = c("two.sided", "less", "greater"),
  digits = 0,
  trailing_zeros = TRUE,
  keep_all = TRUE,
  verbose = FALSE
)

Arguments

data

input dataset. The user must use consistent naming throughout, with an underscore separating the group names from the measures (e.g. Group1_mean and Group2_mean). There must also be two columns whose names exactly match the inputs for first_name and second_name (e.g. 'Group1' and 'Group2'); these are used to form the Comparison variable.

vars_to_paste

vector of names of common measures to paste together. Can be the predefined 'median_min_max' or 'mean_sd', or any measure that has a matching column for each group (e.g. Group1_MyMeasure and Group2_MyMeasure). Multiple measures can be requested. Default: "all" will run 'median_min_max' and 'mean_sd', as well as any pairs of columns in the proper format.

first_name

name of first group (string before '_'). Default is 'Group1'.

second_name

name of second group (string before '_'). Default is 'Group2'.

sep_val

value to be pasted between the two measures. Default is ' vs. '.

na_str_out

the character string used to replace missing values. Default is '---'.

alternative

a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". Will be used to determine the character to be pasted between the group names (Comparison variable). Specifying "two.sided" will use the sep_val input.

digits

integer indicating the number of decimal places to round to before pasting for numeric variables. Default is 0.

trailing_zeros

logical indicating if trailing zeros should be included (e.g. 0.100 instead of 0.1). Note that if set to TRUE the output is a character vector. Default TRUE.

keep_all

logical indicating if all remaining, unpasted variables in data should be returned with the pasted variables. Default TRUE.

verbose

a logical variable indicating if warnings and messages should be displayed. Default FALSE.

Details

The user must use consistent naming throughout, with an underscore separating the group names from the measures (e.g. Group1_mean and Group2_mean). There must also be columns defining the group names (e.g. Group1 and Group2), which are used to form the Comparison variable.
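
As a rough illustration of this naming convention, the base R sketch below builds a toy input and lists the measures that have a column for both groups. The data frame toy_stats, its values, and the group labels are made up for illustration; the prefix matching only mimics the documented convention and is not taken from the package source.

```r
# Hypothetical toy input; the values and group labels are illustrative only.
toy_stats <- data.frame(
  Group1 = "Placebo",                       # column named exactly like first_name
  Group2 = "Vaccine",                       # column named exactly like second_name
  Group1_mean = 10.3, Group2_mean = 12.5,   # <group name>_<measure>
  Group1_sd = 1.5, Group2_sd = 2.25,
  stringsAsFactors = FALSE
)

# Strip each group prefix, then keep the measures present for both groups.
group1_measures <- sub("^Group1_", "", grep("^Group1_", names(toy_stats), value = TRUE))
group2_measures <- sub("^Group2_", "", grep("^Group2_", names(toy_stats), value = TRUE))
common_measures <- intersect(group1_measures, group2_measures)
common_measures
```

Under this convention, a measure available for only one group would be absent from common_measures and so would have no partner to be pasted with.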

alternative is included as a parameter so that the direction can easily be seen for one-sided tests. If "two.sided" is selected, the value pasted between the two group names will be sep_val, whereas "greater" will use " > " and "less" will use " < " as the pasting value.
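
The mapping from alternative to the pasted separator can be sketched in base R as follows. The helper comparison_sep is hypothetical and written only to illustrate the rule described above; it is not a function exported by the package.

```r
# Hypothetical helper (not part of VISCfunctions): pick the string that is
# pasted between the two group names for a given alternative.
comparison_sep <- function(alternative = c("two.sided", "less", "greater"),
                           sep_val = " vs. ") {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = sep_val,  # two-sided tests use the user-supplied separator
         less = " < ",
         greater = " > ")
}

paste0("Group1", comparison_sep("less"), "Group2")
paste0("Group1", comparison_sep("two.sided"), "Group2")
```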

Value

A data.frame with all the pasted values requested. The name of each pasted column ends in '_comparison' (e.g. mean_comparison, median_comparison, ...).
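
To give a feel for what a pasted value looks like, here is a minimal base R sketch that builds a mean(sd) style string for two groups while keeping trailing zeros. The helper fmt and the numbers are assumptions for illustration, and the exact spacing and rounding used by paste_tbl_grp may differ.

```r
# Hypothetical formatter: fixed number of decimals, trailing zeros kept.
fmt <- function(x, digits = 2) formatC(x, format = "f", digits = digits)

# Illustrative values only: Group1 mean/sd and Group2 mean/sd.
mean_comparison <- paste0(
  fmt(10.3), " (", fmt(1.5), ")",   # first group
  " vs. ",                          # sep_val
  fmt(12.5), " (", fmt(2.25), ")"   # second group
)
mean_comparison

# Without fixed formatting the trailing zero is dropped.
as.character(round(10.3, 2))
```

Because fixed-width formatting yields text, the pasted columns are character values rather than numerics.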

Examples


library(VISCfunctions) # provides paste_tbl_grp() and exampleData_BAMA
library(dplyr)
library(tidyr)
data(exampleData_BAMA)

descriptive_stats_by_group <- exampleData_BAMA %>%
  group_by(visitno, antigen) %>%
  summarise(
    Group1 = unique(group[group == 1]), Group2 = unique(group[group == 2]),
    Group1_n = length(magnitude[group == 1]), Group2_n = length(magnitude[group == 2]),
    Group1_mean = mean(magnitude[group == 1]), Group2_mean = mean(magnitude[group == 2]),
    Group1_sd = sd(magnitude[group == 1]), Group2_sd = sd(magnitude[group == 2]),
    Group1_median = median(magnitude[group == 1]), Group2_median = median(magnitude[group == 2]),
    Group1_min = min(magnitude[group == 1]), Group2_min = min(magnitude[group == 2]),
    Group1_max = max(magnitude[group == 1]), Group2_max = max(magnitude[group == 2]),
    .groups = 'drop'
  )

# Paste every available measure and keep the unpasted columns
paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = 'all', first_name = 'Group1',
              second_name = 'Group2', sep_val = " vs. ", digits = 0, keep_all = TRUE)

# Paste selected measures only; "less" puts " < " between the group names
paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = c("mean", "median_min_max"),
              alternative = "less", keep_all = FALSE)

# One-sided comparison for all measures, rounded to 5 decimal places
paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = 'all', first_name = 'Group1',
              second_name = 'Group2', sep_val = " vs. ",
              alternative = 'less', digits = 5, keep_all = FALSE)


# Same example with tidyverse in a single pipe


exampleData_BAMA %>%
 mutate(group = paste0("Group", group)) %>%
 group_by(group, visitno, antigen) %>%
 summarise(N = n(), mean = mean(magnitude), sd = sd(magnitude),
           median = median(magnitude), min = min(magnitude),
           max = max(magnitude), q95_fun = quantile(magnitude, 0.95),
           .groups = 'drop') %>%
 pivot_longer(-(group:antigen)) %>% # these three chains create a wide dataset
 unite(temp, group, name) %>%
 pivot_wider(names_from = temp, values_from = value) %>%
 mutate(Group1 = "Group 1", Group2 = "Group 2") %>%
 paste_tbl_grp()


FredHutch/VISCfunctions documentation built on Oct. 14, 2024, 11:33 p.m.