get_2c_comp: Perform automated comparisons of continuous variables between...


get_2c_comp    R Documentation

Perform automated comparisons of continuous variables between two levels of a categorical variable

Description

get_2c_comp() performs automated comparisons between the two levels of a categorical variable on a set of continuous variables.

Usage

get_2c_comp(
  tibble,
  grp = NULL,
  dep_var,
  comp_nm,
  comp_lvl1,
  comp_lvl2,
  paired = FALSE,
  pairing_key = NULL,
  FDR = 0.2,
  grp_as_label = FALSE,
  base_size = 12,
  multi_diff = 1
)

Arguments

tibble

a tibble.

grp

a string indicating the column which contains the names of the continuous variables.

dep_var

a string indicating the column which contains the values of the continuous variables.

comp_nm

a string indicating the column which contains the values of the categorical variable.

comp_lvl1

a string providing the first level of the categorical variable.

comp_lvl2

a string providing the second level of the categorical variable.

paired

a logical indicating whether the samples are paired (TRUE) or independent (FALSE). Default is FALSE.

pairing_key

if paired = TRUE, a string indicating the column to use as pairing key. Default is NULL.

FDR

a numeric indicating the q-value threshold to use for FDR. Default is 0.2.

grp_as_label

a logical indicating if the names of the continuous variables should be used to label the y axis on the plots. If FALSE, the value of the dep_var argument is systematically used as label. Default is FALSE.

base_size

a numeric provided to the base_size argument of theme_pubr() for plotting. Default is 12.

multi_diff

a numeric provided to the nudge_y argument of geom_text(). The higher the multi_diff value, the greater the distance between the p value label and the data points on the graph. Default is 1 (no adjustment).

Details

For independent samples: normality is tested separately on each of the two levels of the categorical variable with the Shapiro–Wilk test (sample size must be >= 3). Homoscedasticity is tested with Bartlett's test if the data are normal, or with the modified Levene's test if they are not. Comparisons between the two levels of the categorical variable are performed with the independent Student's t-test for normal/homoscedastic data, the independent Welch's t-test for normal/heteroscedastic data, or the Mann-Whitney U-test for non-normal/homoscedastic data. No comparison is performed for non-normal/heteroscedastic data.
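The decision rules above can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper name compare_independent() is hypothetical, the 0.05 cut-off for the assumption checks is an assumption (the documentation does not state which level is used), and the modified Levene's test is written here as a one-way ANOVA on absolute deviations from the group medians (the Brown-Forsythe form).

```r
# Hypothetical helper, not part of banban: picks and runs a two-sample test
# following the decision rules described above.
# `alpha` (cut-off for the assumption checks) is an assumed value.
compare_independent <- function(x, y, alpha = 0.05) {
  stopifnot(length(x) >= 3, length(y) >= 3)  # Shapiro-Wilk needs n >= 3

  # Normality is assessed separately in each group.
  normal <- shapiro.test(x)$p.value > alpha &&
    shapiro.test(y)$p.value > alpha

  values <- c(x, y)
  group <- factor(rep(c("lvl1", "lvl2"), times = c(length(x), length(y))))

  if (normal) {
    p_var <- bartlett.test(values, group)$p.value
  } else {
    # Modified Levene's (Brown-Forsythe) test: one-way ANOVA on the
    # absolute deviations from each group's median.
    abs_dev <- abs(values - ave(values, group, FUN = median))
    p_var <- anova(lm(abs_dev ~ group))[["Pr(>F)"]][1]
  }
  homoscedastic <- p_var > alpha

  if (normal && homoscedastic) {
    list(test = "Student t-test",
         p_value = t.test(x, y, var.equal = TRUE)$p.value)
  } else if (normal) {
    list(test = "Welch t-test",
         p_value = t.test(x, y, var.equal = FALSE)$p.value)
  } else if (homoscedastic) {
    list(test = "Mann-Whitney U-test",
         p_value = wilcox.test(x, y)$p.value)
  } else {
    # Non-normal and heteroscedastic: no comparison is performed.
    list(test = NA_character_, p_value = NA_real_)
  }
}
```

Calling compare_independent(x, y) on two numeric vectors returns the name of the selected test and its p value, or NA for both when no comparison is performed.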

For paired samples: normality is tested on the differences between the two levels of the categorical variable with the Shapiro–Wilk test (sample size must be >= 3). Comparisons between the two levels of the categorical variable are performed with the paired Student's t-test for normal data, or the Wilcoxon signed-rank test for non-normal data.
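A comparable base R sketch for the paired case is shown here. The helper name compare_paired() is hypothetical, the 0.05 cut-off for the normality check is an assumption, and the two vectors are assumed to be already ordered by the pairing key so that x[i] and y[i] belong to the same pair.

```r
# Hypothetical helper, not part of banban: paired comparison following the
# rules described above. `alpha` is an assumed cut-off for the normality check.
compare_paired <- function(x, y, alpha = 0.05) {
  # x[i] and y[i] must come from the same pair.
  stopifnot(length(x) == length(y), length(x) >= 3)

  # Normality is assessed on the within-pair differences.
  diffs <- x - y
  normal <- shapiro.test(diffs)$p.value > alpha

  if (normal) {
    list(test = "paired Student t-test",
         p_value = t.test(x, y, paired = TRUE)$p.value)
  } else {
    list(test = "Wilcoxon signed-rank test",
         p_value = wilcox.test(x, y, paired = TRUE)$p.value)
  }
}
```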

P values are adjusted for multiple comparisons by controlling the false discovery rate (FDR) with the Benjamini-Hochberg procedure.
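As an illustration of the adjustment step, the Benjamini-Hochberg procedure is available in base R through p.adjust(). The p values below are made up for the example, and comparing the q-values to the threshold with <= is an assumption about how the FDR argument is applied.

```r
# Made-up p values, one per continuous variable.
p_values <- c(0.001, 0.008, 0.039, 0.041, 0.27, 0.6)

# Benjamini-Hochberg adjusted p values (q-values).
q_values <- p.adjust(p_values, method = "BH")

# Assumed rule: a comparison is flagged when its q-value is at or below
# the FDR threshold (0.2 is the default of the FDR argument).
fdr_threshold <- 0.2
significant <- q_values <= fdr_threshold
```

p.adjust() leaves NA entries as NA and excludes them from the count of tests, which suits variables for which no comparison was performed.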

Value

A tibble containing, for each continuous variable, a statistical comparison between the two levels of the categorical variable and the corresponding plot.
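A possible call is sketched below, assuming long-format input with one row per observation. The simulated data and the column names marker, condition and value are invented for the example, and the call is guarded so that it only runs when the banban and tibble packages are installed.

```r
set.seed(1)

# Simulated long-format data: two continuous variables ("marker_a",
# "marker_b") measured in two groups ("control", "treated").
long_data <- data.frame(
  marker = rep(c("marker_a", "marker_b"), each = 20),
  condition = rep(rep(c("control", "treated"), each = 10), times = 2),
  value = c(rnorm(10, 10), rnorm(10, 12), rnorm(10, 5), rnorm(10, 5))
)

# Run only when the required packages are available.
if (requireNamespace("banban", quietly = TRUE) &&
    requireNamespace("tibble", quietly = TRUE)) {
  res <- banban::get_2c_comp(
    tibble = tibble::as_tibble(long_data),
    grp = "marker",          # column holding the variable names
    dep_var = "value",       # column holding the measured values
    comp_nm = "condition",   # column holding the categorical variable
    comp_lvl1 = "control",
    comp_lvl2 = "treated",
    paired = FALSE,
    FDR = 0.2
  )
}
```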


benvallin/banban documentation built on Sept. 29, 2023, 5:46 a.m.