View source: R/summaryfactorlist.R
summary_factorlist | R Documentation |
A function that takes a single dependent variable with a vector of explanatory variable names (continuous or categorical variables) to produce a summary table.
summary_factorlist(
  .data,
  dependent = NULL,
  explanatory = NULL,
  formula = NULL,
  cont = "mean",
  cont_nonpara = NULL,
  cont_cut = 5,
  cont_range = TRUE,
  p = FALSE,
  p_cont_para = "aov",
  p_cat = "chisq",
  column = TRUE,
  total_col = FALSE,
  orderbytotal = FALSE,
  digits = c(1, 1, 3, 1, 0),
  na_include = FALSE,
  na_include_dependent = FALSE,
  na_complete_cases = FALSE,
  na_to_p = FALSE,
  na_to_prop = TRUE,
  fit_id = FALSE,
  add_dependent_label = FALSE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  add_col_totals = FALSE,
  include_col_totals_percent = TRUE,
  col_totals_rowname = NULL,
  col_totals_prefix = "",
  add_row_totals = FALSE,
  include_row_totals_percent = TRUE,
  include_row_missing_col = TRUE,
  row_totals_colname = "Total N",
  row_missing_colname = "Missing N",
  catTest = NULL,
  weights = NULL
)
.data |
Dataframe. |
dependent |
Character vector of length 1: name of dependent variable (2 to 5 factor levels). |
explanatory |
Character vector of any length: name(s) of explanatory variables. |
formula |
an object of class "formula" (or one that can be coerced to that class). Optional instead of standard dependent/explanatory format. Do not include if using dependent/explanatory. |
cont |
Summary for continuous explanatory variables: "mean" (standard deviation) or "median" (interquartile range). If "median", a non-parametric hypothesis test is performed (see below). |
cont_nonpara |
Numeric vector of form e.g. |
cont_cut |
Numeric: number of unique values in continuous variable at which to consider it a factor. |
cont_range |
Logical. Median is shown with 1st and 3rd quartiles. |
p |
Logical: Include null hypothesis statistical test. |
p_cont_para |
Character. Continuous variable parametric test. One of either "aov" (analysis of variance) or "t.test" for Welch two sample t-test. Note the continuous non-parametric test is always Kruskal-Wallis (kruskal.test), which in the two-group setting is equivalent to the Mann-Whitney U / Wilcoxon rank sum test. For a continuous dependent and a continuous explanatory variable, the parametric test p-value returned is for the Pearson correlation coefficient. The non-parametric equivalent is the p-value for the Spearman correlation coefficient. |
p_cat |
Character. Categorical variable test. One of either "chisq" or "fisher". |
column |
Logical: Compute margins by column rather than row. |
total_col |
Logical: include a total column summing across factor levels. |
orderbytotal |
Logical: order final table by total column high to low. |
digits |
Number of digits to round to (1) mean/median, (2) standard deviation / interquartile range, (3) p-value, (4) count percentage, (5) weighted count. |
na_include |
Logical: make explanatory variables missing data explicit
( |
na_include_dependent |
Logical: make dependent variable missing data explicit. |
na_complete_cases |
Logical: include only rows with complete data. |
na_to_p |
Logical: include missing as group in statistical test. |
na_to_prop |
Logical: include missing in calculation of column proportions. |
fit_id |
Logical: allows merging via |
add_dependent_label |
Add the name of the dependent label to the top left of table. |
dependent_label_prefix |
Add text before dependent label. |
dependent_label_suffix |
Add text after dependent label. |
add_col_totals |
Logical. Include column total n. |
include_col_totals_percent |
Include column percentage of total. |
col_totals_rowname |
Character. Row name for column totals. |
col_totals_prefix |
Character. Prefix to column totals, e.g. "N=". |
add_row_totals |
Logical. Include row totals. Note this differs from
|
include_row_totals_percent |
Include row percentage of total. |
include_row_missing_col |
Logical. Include missing data total for each row. Only used when |
row_totals_colname |
Character. Column name for row totals. |
row_missing_colname |
Character. Column name for missing data totals for each row. |
catTest |
Deprecated. See |
weights |
Character vector of length 1: name of column to use for weights. Explanatory continuous variables are multiplied by weights. Explanatory categorical variables are counted with a frequency weight (sum(weights)). |
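The p_cont_para entry above notes that the Kruskal-Wallis test in a two-group setting is equivalent to the Mann-Whitney U / Wilcoxon rank sum test. A minimal base R sketch of that equivalence follows. It does not use finalfit; the built-in mtcars data and the choice of variables are illustrative assumptions. The p-values agree when wilcox.test() uses the normal approximation without continuity correction.

```r
# Two-group comparison of a continuous variable (mpg) by a binary
# grouping variable (am). mtcars ships with base R and is used here
# purely for illustration.
kw <- kruskal.test(mpg ~ am, data = mtcars)

# Normal approximation, no continuity correction: the setting under
# which the two tests coincide.
wx <- wilcox.test(mpg ~ am, data = mtcars, exact = FALSE, correct = FALSE)

p_kw <- kw$p.value
p_wx <- wx$p.value
print(c(kruskal = p_kw, wilcoxon = p_wx))
```

With the wilcox.test() defaults (continuity correction on, exact p-value where possible) the two p-values can differ slightly, so small discrepancies against a separately run Wilcoxon test are expected.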
This function aims to produce publication-ready summary tables for categorical or continuous dependent variables. It usually takes a categorical dependent variable to produce a cross table of counts and proportions expressed as percentages or summarised continuous explanatory variables. However, it will take a continuous dependent variable to produce mean (standard deviation) or median (interquartile range) for use with linear regression models.
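The cross table of counts and percentages described above can be pictured with base R alone. The sketch below is not how summary_factorlist is implemented; it only illustrates, on the built-in mtcars data (an illustrative choice), the difference between the column and row percentages that the column argument switches between.

```r
# Rows: levels of an explanatory variable (cyl); columns: levels of a
# dependent variable (am). Both come from the built-in mtcars data.
counts <- table(cyl = mtcars$cyl, am = mtcars$am)

# Column percentages: each column sums to 100.
col_pct <- 100 * prop.table(counts, margin = 2)

# Row percentages: each row sums to 100.
row_pct <- 100 * prop.table(counts, margin = 1)

# Combine counts and column percentages into "n (%)" strings.
cell_text <- sprintf("%d (%.1f)", as.vector(counts), as.vector(col_pct))
cells <- matrix(cell_text, nrow = nrow(counts), dimnames = dimnames(counts))
print(cells)
```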
Returns a factorlist dataframe.
See also: fit2df, ff_column_totals, ff_row_totals, ff_label, ff_glimpse, ff_percent_only. For lots of examples, see https://finalfit.org/
library(finalfit)
library(dplyr)

# Load example dataset, modified version of survival::colon
data(colon_s)

# Table 1 - Patient demographics ----
explanatory = c("age", "age.factor", "sex.factor", "obstruct.factor")
dependent = "perfor.factor"
colon_s %>%
  summary_factorlist(dependent, explanatory, p = TRUE)

# summary_factorlist() is also commonly used to summarise any number of
# variables by an outcome variable (say dead yes/no).

# Table 2 - 5 yr mortality ----
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = "mort_5yr"
colon_s %>%
  summary_factorlist(dependent, explanatory)
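A few further variations, sketched using only arguments listed in the usage above and variables that appear in the examples. This assumes the finalfit and dplyr packages are installed. The particular combination of options is illustrative, not a recommended default.

```r
library(finalfit)
library(dplyr)

# Example dataset shipped with finalfit
data(colon_s)

explanatory <- c("age", "age.factor", "sex.factor", "obstruct.factor")
dependent <- "mort_5yr"

# Median (interquartile range) for continuous variables, explicit
# missing-data rows, a labelled top-left cell and a column-totals row.
table_median <- colon_s %>%
  summary_factorlist(
    dependent, explanatory,
    p = TRUE,
    cont = "median",
    na_include = TRUE,
    add_dependent_label = TRUE,
    add_col_totals = TRUE
  )
table_median

# Continuous dependent variable (age), as described in the details:
# each explanatory level is summarised by mean (standard deviation).
table_cont <- colon_s %>%
  summary_factorlist("age", c("sex.factor", "obstruct.factor"))
table_cont
```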