SRA functions

View source: R/pretty_output_functions.R

paste_tbl_grp

R Documentation

Pasting Together Information for Two Groups

Description

Paste together information, often statistics, from two groups. There are two predefined combinations, mean(sd) and median[min, max], but the user may also paste together any single measure.

Usage

paste_tbl_grp(
  data,
  vars_to_paste = "all",
  first_name = "Group1",
  second_name = "Group2",
  sep_val = " vs. ",
  na_str_out = "---",
  alternative = c("two.sided", "less", "greater"),
  digits = 0,
  trailing_zeros = TRUE,
  keep_all = TRUE,
  verbose = FALSE
)

Arguments

`data`	input dataset. The user must use consistent naming throughout, with an underscore separating the group names from the measures (e.g. `Group1_mean` and `Group2_mean`). There must also be two columns whose names exactly match the inputs for `first_name` and `second_name` (e.g. 'Group1' and 'Group2'); these are used to form the `Comparison` variable.
`vars_to_paste`	vector of names of common measures to paste together. Can be the predefined 'median_min_max' or 'mean_sd', or any measure that has a matching column for each group (e.g. Group1_MyMeasure and Group2_MyMeasure). Multiple measures can be requested. Default: "all" will run 'median_min_max' and 'mean_sd', as well as any pairs of columns in the proper format.
`first_name`	name of first group (string before '_'). Default is 'Group1'.
`second_name`	name of second group (string before '_'). Default is 'Group2'.
`sep_val`	value to be pasted between the two measures. Default is ' vs. '.
`na_str_out`	the string used to replace missing values. Default is '---'.
`alternative`	a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". Will be used to determine the character to be pasted between the group names (`Comparison` variable). Specifying "two.sided" will use the `sep_val` input.
`digits`	integer indicating the number of decimal places to round to before pasting for numeric variables. Default is 0.
`trailing_zeros`	logical indicating if trailing zeros should be included (e.g. 0.100 instead of 0.1). Note that if set to TRUE the output is a character vector. Default TRUE.
`keep_all`	logical indicating if all remaining, unpasted variables in `data` should be returned with the pasted variables. Default TRUE.
`verbose`	a logical variable indicating if warnings and messages should be displayed. Default FALSE.
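
The interaction between `digits` and `trailing_zeros` can be illustrated with base R alone. The sketch below is not the package's internal code; it only uses `round()` and `formatC()` to show why keeping trailing zeros implies a character result.

```r
# Base R illustration only; not how paste_tbl_grp is implemented internally.
x <- 0.1

# Rounding keeps a numeric value, so trailing zeros are dropped when printed.
rounded <- round(x, digits = 3)

# Fixed-format conversion keeps trailing zeros but returns a character string.
padded <- formatC(x, format = "f", digits = 3)

print(rounded)  # [1] 0.1
print(padded)   # [1] "0.100"
```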

Details

The user must use consistent naming throughout, with an underscore separating the group names from the measures (e.g. Group1_mean and Group2_mean). There must also be columns defining the group names (e.g. Group1 and Group2), which are used to form the Comparison variable.
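
As a rough illustration of this naming convention, the base R sketch below builds a one-row toy data frame and checks that the required columns exist before pasting. The data values, the `needed` vector, and the `paste()` stand-in are assumptions made for this example; the stand-in skips the rounding and missing-value handling that `paste_tbl_grp` performs.

```r
# Hypothetical toy data; column names follow the <group>_<measure> convention.
toy <- data.frame(
  Group1 = "Placebo",
  Group2 = "Vaccine",
  Group1_mean = 1.2,
  Group2_mean = 3.4,
  stringsAsFactors = FALSE
)

first_name <- "Group1"
second_name <- "Group2"
measure <- "mean"

# Both group-name columns and both measure columns must be present.
needed <- c(first_name, second_name,
            paste(first_name, measure, sep = "_"),
            paste(second_name, measure, sep = "_"))
stopifnot(all(needed %in% names(toy)))

# Base R stand-in for the pasting step (an assumption, not the package source).
mean_comparison <- paste(toy[[needed[3]]], toy[[needed[4]]], sep = " vs. ")
comparison <- paste(toy[[first_name]], toy[[second_name]], sep = " vs. ")

print(mean_comparison)  # [1] "1.2 vs. 3.4"
print(comparison)       # [1] "Placebo vs. Vaccine"
```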

alternative is included as a parameter so that the direction can easily be seen for one-sided tests. If "two.sided" is selected, the value pasted between the two group names is sep_val; "greater" uses " > " and "less" uses " < ".
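
The separator rule for the Comparison variable can be sketched as a small lookup. `comparison_separator` is a hypothetical helper written for this illustration only; it is not exported by the package and may differ from how the package implements the rule.

```r
# Hypothetical helper, written only to illustrate the separator rule.
comparison_separator <- function(alternative = c("two.sided", "less", "greater"),
                                 sep_val = " vs. ") {
  # Validate the choice; with no input, the first option ("two.sided") is used.
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = sep_val,
         less = " < ",
         greater = " > ")
}

paste0("Group1", comparison_separator("two.sided"), "Group2")  # "Group1 vs. Group2"
paste0("Group1", comparison_separator("less"), "Group2")       # "Group1 < Group2"
paste0("Group1", comparison_separator("greater"), "Group2")    # "Group1 > Group2"
```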

Value

data.frame with all the requested pasted values. The name of each pasted column ends in '_comparison' (e.g. mean_comparison, median_comparison, ...).

Examples


library(VISCfunctions) # provides paste_tbl_grp and exampleData_BAMA
library(dplyr)
library(tidyr)
data(exampleData_BAMA)

descriptive_stats_by_group <- exampleData_BAMA %>%
  group_by(visitno,antigen) %>%
  summarise(
    Group1 = unique(group[group == 1]), Group2 = unique(group[group == 2]),
    Group1_n = length(magnitude[group == 1]), Group2_n = length(magnitude[group == 2]),
    Group1_mean = mean(magnitude[group == 1]), Group2_mean = mean(magnitude[group == 2]),
    Group1_sd = sd(magnitude[group == 1]), Group2_sd = sd(magnitude[group == 2]),
    Group1_median = median(magnitude[group == 1]), Group2_median = median(magnitude[group == 2]),
    Group1_min = min(magnitude[group == 1]), Group2_min = min(magnitude[group == 2]),
    Group1_max = max(magnitude[group == 1]), Group2_max = max(magnitude[group == 2]),
    .groups = 'drop'
  )

paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = 'all', first_name = 'Group1',
              second_name = 'Group2', sep_val = " vs. ", digits = 0, keep_all = TRUE)

paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = c("mean", "median_min_max"),
              alternative = "less", keep_all = FALSE)

paste_tbl_grp(data = descriptive_stats_by_group, vars_to_paste = 'all', first_name = 'Group1',
              second_name = 'Group2', sep_val = " vs. ",
              alternative = 'less', digits = 5, keep_all = FALSE)


# Same example with tidyverse in a single pipe


exampleData_BAMA %>%
 mutate(group = paste0("Group", group)) %>%
 group_by(group, visitno, antigen) %>%
 summarise(N = n(), mean = mean(magnitude), sd = sd(magnitude),
           median = median(magnitude), min = min(magnitude),
           max = max(magnitude), q95_fun = quantile(magnitude, 0.95),
           .groups = 'drop') %>%
 pivot_longer(-(group:antigen)) %>% # these three chains create a wide dataset
 unite(temp, group, name) %>%
 pivot_wider(names_from = temp, values_from = value) %>%
 mutate(Group1 = "Group 1", Group2 = "Group 2") %>%
 paste_tbl_grp()

FredHutch/VISCfunctions documentation built on Oct. 14, 2024, 11:33 p.m.