flex_table1 | R Documentation |
A convenience function that provides an easy wrapper around the two main engines of the function:
table1
provides a nice API that creates demographics tables from a formula. I extended
its p-value function so that it also runs for
more than two groups (ANOVA), added the option
to correct p-values with either Bonferroni or Šidák, and set some
sensible defaults to achieve a nice look.
flextable
gives all the power
to format the table as you please, e.g., conditional formatting
(bold for p-values below .05), italic headers, or notes
explaining what was done.
Really, all credit should go to these two packages and their developers. My function just provides an easy-to-use API or wrapper around their packages to get a beautiful, publication-ready bivariate comparison Table 1.
flex_table1(
  str_formula,
  data,
  correct = NA,
  num = NA,
  table_caption = NA,
  ref_correction = TRUE,
  include_teststat = TRUE,
  drop_unused_cats = TRUE,
  PCTexcludeNA = TRUE,
  overall = FALSE,
  ...
)
str_formula |
A string representing a formula, e.g.,
|
data |
The dataset containing the variables for the table1 call (all terms from the str_formula must be present) |
correct |
Character, default = NA (no correction). Currently available are "bonf" for Bonferroni correction and "sidark" for Šidák correction. If you want any other correction included, just open an issue <https://github.com/Buedenbender/datscience/issues> or contact me via mail. Please also see the references and the details on correction for multiple comparisons. |
num |
Integer, number of comparisons. If NA, it is determined automatically from the number of terms in the formula. |
table_caption |
Caption for the table, each element of the vector represents a new line. The first line will be bold face. All additional lines are in italic. |
ref_correction |
Boolean, default = TRUE. If TRUE, corrected p-values are referenced in the footnote. |
include_teststat |
Boolean, default = TRUE. If TRUE, two additional columns are included in the table: 1) the test statistic (either t, F or χ²) and 2) the degrees of freedom. |
drop_unused_cats |
Boolean, default = TRUE, if TRUE categories (i.e., factor levels) with 0 observations will be dropped. |
PCTexcludeNA |
Boolean, default = TRUE. Should missing values be excluded from the calculation of percentages? If PCTexcludeNA = TRUE, missing values are excluded. |
overall |
Character, default = FALSE. Should the final table also include a column for the totals of the sample? If a character string is provided, it gives the name of the new column (recommendation: "Overall"). |
... |
(Optional), Additional arguments that can be passed to
|
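The effect of the two data-handling flags, drop_unused_cats and PCTexcludeNA, can be illustrated with base R alone. This is only a sketch of the idea under the assumption that the flags behave as described above; it is not the code that flex_table1() or table1 run internally, and the vector x is made-up example data.

```r
# Made-up factor with one missing value and one unused level ("c")
x <- factor(c("a", "a", "b", NA), levels = c("a", "b", "c"))

# Idea behind drop_unused_cats = TRUE: levels with 0 observations are removed
x <- droplevels(x)
levels(x)

# Idea behind PCTexcludeNA = TRUE: percentages are based on non-missing values only
pct_excl <- 100 * prop.table(table(x))

# Idea behind PCTexcludeNA = FALSE: missing values count towards the denominator
pct_incl <- 100 * prop.table(table(x, useNA = "always"))

round(pct_excl, 1)
round(pct_incl, 1)
```

With missing values excluded, the percentages of the observed categories sum to 100; with missing values included, the NA category takes its own share.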
On Fisher's Exact Test (FET) vs Pearson's χ²-test
Newest feature (as of 07/22): following an excellent post on
Cross Validated \insertCite{Harrell_cross_11}{datscience},
the function refrains from using Fisher's exact test (FET) for
categorical variables and only applies FET in the rare case of cells
whose expected frequencies do not exceed 1. The reason is
that the FET can be extremely resource intensive (and slow) and can
have type I error rates below the nominal level
\insertCite{Crans2008}{datscience}. Contemporary evidence suggests that
Pearson's χ²-test, multiplied by the factor (N-1)/N, is nearly
always more accurate than FET and is generally recommended
\insertCite{Lydersen2009}{datscience}. Accordingly, we use the N-1
Pearson χ²-test proposed by (E.) Pearson and recommended as optimum test
policy by \insertCite{Campbell2007}{datscience}.
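The N-1 χ²-test described above can be sketched in a few lines of base R. The helper n1_chisq_test() is hypothetical and only illustrates the technique; it is not the implementation used inside flex_table1(), and the threshold for falling back to FET (an expected count that does not exceed 1) is an assumption taken from the text above. The counts in tab are made up.

```r
# Hypothetical helper: N-1 chi-squared test for a contingency table
n1_chisq_test <- function(tab) {
  tab <- as.matrix(tab)
  n <- sum(tab)
  # Pearson's chi-squared statistic without Yates' continuity correction
  pearson <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))
  # Scale the statistic by (N - 1) / N
  statistic <- unname(pearson$statistic) * (n - 1) / n
  df <- unname(pearson$parameter)
  list(
    statistic = statistic,
    df = df,
    p.value = stats::pchisq(statistic, df = df, lower.tail = FALSE),
    min_expected = min(pearson$expected)
  )
}

# Made-up 2 x 2 table of counts
tab <- matrix(c(12, 5, 7, 9), nrow = 2)
res <- n1_chisq_test(tab)

# Assumed fallback rule: use Fisher's exact test only for very small expected counts
if (res$min_expected <= 1) {
  res$p.value <- stats::fisher.test(tab)$p.value
}
res
```

Because the factor (N-1)/N is smaller than 1, the adjusted statistic is always slightly smaller than the uncorrected Pearson statistic, and the difference shrinks as the sample size grows.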
On Multiple Comparisons
Let me start with a direct quote:
"(..) researchers should not automatically (mindlessly)
assume that alpha adjustment is necessary during multiple testing."
\insertCite{Rubin2021}{datscience}
Whether, how and when to correct for multiple comparisons in inferential
statistics is still an area of ongoing debate. However, it was recently
argued that it is essential to differentiate between different forms of
multiple comparisons in order to decide for or against a correction
\insertCite{Rubin2021}{datscience}. The types of multiple testing are:
disjunction testing
conjunction testing
individual testing
Correction is primarily adequate in the case of disjunction testing.
Please refer to the very well written and laid out original publication
for more details. For the use case of this function, one can assume
a joint null hypothesis, namely that Group A <...> Group N do not differ.
If, for example, it is sufficient that the groups differ significantly
in one characteristic, this would be considered disjunction testing.
However, if we are only interested in the constituent (null) hypotheses
(e.g., the groups differ in their highest level of education vs. they
differ in their current employment status), it could be categorized as
individual testing. Please choose carefully for your individual case.
For the typical exploratory bivariate comparison in a
sociodemographic Table 1, I consider individual testing to be the
most frequent case; thus the flex_table1()
function defaults
to applying no correction.
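For readers who do decide to correct, the two available corrections follow simple textbook formulas: Bonferroni multiplies each p-value by the number of comparisons (capped at 1), and Šidák computes 1 - (1 - p)^num. The sketch below shows these formulas in base R. The helper adjust_p() is hypothetical and is not part of datscience; it merely reuses the argument names correct and num and the values "bonf" and "sidark" from the arguments above, and the p-values are made up.

```r
# Hypothetical helper illustrating the two corrections
adjust_p <- function(p, num = length(p), correct = NA) {
  # NA mirrors the documented default: no correction
  if (is.na(correct)) {
    return(p)
  }
  switch(correct,
    bonf = pmin(1, p * num),      # Bonferroni: p * num, capped at 1
    sidark = 1 - (1 - p)^num,     # Sidak: 1 - (1 - p)^num
    stop("Unknown correction: ", correct)
  )
}

# Made-up p-values from three comparisons
p <- c(0.004, 0.020, 0.300)
adjust_p(p)
adjust_p(p, correct = "bonf")
adjust_p(p, correct = "sidark")
```

The Šidák-adjusted values are never larger than the Bonferroni-adjusted ones, so Šidák is the slightly less conservative of the two.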
A flextable
object containing the APA-ready Table 1.
If a filepath is provided, it also creates the respective file
(e.g., a Word .docx file).
Bjoern Buedenbender
format_flextable, flextable, table1
## Not run:
# Comparison of just two groups
str_formula <- "~ Sepal.Length + Sepal.Width + test | Species"
data <- dplyr::filter(iris, Species %in% c("setosa", "versicolor"))
data$test <- factor(rep(c("Female", "Male"), 50))
table_caption <- c("Table 1", "A test on the Iris Data")
flex_table1(str_formula, data = data, table_caption = table_caption)

# Comparison of multiple groups (ANOVA), using all three species
str_formula <- "~ Sepal.Length + Sepal.Width + Gender_example | Species"
data <- iris
data$Gender_example <- factor(rep(c("Female", "Male"), nrow(data) / 2))
table_caption <- c("Table 1", "A test on the Iris Data")
flex_table1(str_formula, data = data, table_caption = table_caption)
## End(Not run)