tbl_summary | R Documentation |
The tbl_summary() function calculates descriptive statistics for
continuous, categorical, and dichotomous variables.
Review the tbl_summary vignette for detailed examples.
tbl_summary(
  data,
  by = NULL,
  label = NULL,
  statistic = list(
    all_continuous() ~ "{median} ({p25}, {p75})",
    all_categorical() ~ "{n} ({p}%)"
  ),
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("ifany", "no", "always"),
  missing_text = "Unknown",
  missing_stat = "{N_miss}",
  sort = all_categorical(FALSE) ~ "alphanumeric",
  percent = c("column", "row", "cell"),
  include = everything()
)
data |
( |
by |
( |
label |
( |
statistic |
( |
digits |
( |
type |
( |
value |
( |
missing , missing_text , missing_stat |
Arguments dictating how and if missing values are presented:
|
sort |
( |
percent |
( |
include |
( |
A gtsummary table of class c('tbl_summary', 'gtsummary')
The statistic argument specifies the summary statistics presented in the table.
For example, statistic = list(age ~ "{mean} ({sd})") would report the mean and
standard deviation for age; statistic = list(all_continuous() ~ "{mean} ({sd})")
would report the mean and standard deviation for all continuous variables.
The values are interpreted using glue::glue() syntax:
a name that appears between curly brackets will be interpreted as a function
name and the formatted result of that function will be placed in the table.
For categorical variables, the following statistics are available to display:
{n} (frequency), {N} (denominator), {p} (percent).
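As a brief sketch of the categorical placeholders, the call below asks for the count, the denominator, and the percentage together. It assumes the trial example data set and its grade column, as used in the examples on this page; the object name tbl_cat is only illustrative.

```r
library(gtsummary)

# Show "n / N (p%)" for every categorical variable (here only `grade`).
tbl_cat <- tbl_summary(
  trial,
  include = grade,
  statistic = all_categorical() ~ "{n} / {N} ({p}%)"
)

tbl_cat
```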
For continuous variables, any univariate function may be used.
The most commonly used functions are {median}, {mean}, {sd}, {min}, and {max}.
Additionally, {p##} is available for percentiles, where ## is an integer from 0 to 100.
For example, p25: quantile(probs=0.25, type=2).
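Because the percentile placeholders are described as using type = 2, they can differ from what quantile() returns with its default settings. The base R sketch below illustrates the difference on a small made-up vector; it does not call gtsummary, and the vector x is purely hypothetical.

```r
# Hypothetical data, chosen so the two definitions disagree.
x <- c(1, 2, 3, 4)

# Type 2 (the definition quoted above for {p25}): averages at discontinuities.
q_type2 <- quantile(x, probs = 0.25, type = 2)

# Type 7 is the default of stats::quantile() and interpolates linearly.
q_type7 <- quantile(x, probs = 0.25)

unname(q_type2)  # 1.5
unname(q_type7)  # 1.75
```

If a different percentile definition is needed, the text above notes that any univariate function may be used in place of {p##}.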
When the summary type is "continuous2", pass a vector of statistics.
Each element of the vector will result in a separate row in the summary table.
For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.
{N_obs}: total number of observations
{N_miss}: number of missing observations
{N_nonmiss}: number of non-missing observations
{p_miss}: percentage of observations missing
{p_nonmiss}: percentage of observations not missing
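To make the relationship between these five quantities concrete, the base R sketch below computes them by hand for a small hypothetical vector. It only illustrates the definitions listed above and is not how gtsummary computes or formats them internally.

```r
# Hypothetical measurements with two missing values.
x <- c(34, NA, 51, 47, NA, 62, 29, 40)

N_obs     <- length(x)             # total number of observations
N_miss    <- sum(is.na(x))         # number of missing observations
N_nonmiss <- sum(!is.na(x))        # number of non-missing observations
p_miss    <- 100 * N_miss / N_obs      # percentage missing
p_nonmiss <- 100 * N_nonmiss / N_obs   # percentage not missing

c(N_obs = N_obs, N_miss = N_miss, N_nonmiss = N_nonmiss,
  p_miss = p_miss, p_nonmiss = p_nonmiss)
```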
The digits argument specifies the number of digits (or formatting function) statistics are rounded to.
The values passed can either be a single integer, a vector of integers, a
function, or a list of functions. If a single integer or function is passed,
it is recycled to the length of the number of statistics presented.
For example, if the statistic is "{mean} ({sd})", it is equivalent to
pass 1, c(1, 1), label_style_number(digits=1), or
list(label_style_number(digits=1), label_style_number(digits=1)).
Named lists are also accepted to change the default formatting for a single
statistic, e.g. list(sd = label_style_number(digits=1)).
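The equivalence described above can be tried out with a sketch like the following, which builds the same one-variable table three ways. It assumes the trial example data set and its age column; the object names are illustrative only.

```r
library(gtsummary)

# A single integer, recycled across both statistics.
tbl_int <- tbl_summary(
  trial,
  include = age,
  statistic = age ~ "{mean} ({sd})",
  digits = age ~ 1
)

# One integer per statistic.
tbl_vec <- tbl_summary(
  trial,
  include = age,
  statistic = age ~ "{mean} ({sd})",
  digits = age ~ c(1, 1)
)

# A formatting function, recycled across both statistics.
tbl_fun <- tbl_summary(
  trial,
  include = age,
  statistic = age ~ "{mean} ({sd})",
  digits = age ~ label_style_number(digits = 1)
)
```

Printing the three objects side by side is a quick way to confirm that the formatted cells match.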
There are four summary types. Use the type argument to change the default summary types.
"continuous": summaries are shown on a single row. Most numeric
variables default to summary type continuous.
"continuous2": summaries are shown on 2 or more rows.
"categorical": multi-line summaries of nominal data. Character variables,
factor variables, and numeric variables with fewer than 10 unique levels default to
type categorical. To change a numeric variable to continuous that
defaulted to categorical, use type = list(varname ~ "continuous").
"dichotomous": categorical variables that are displayed on a single row,
rather than one row per level of the variable.
Variables coded as TRUE/FALSE, 0/1, or yes/no are assumed to be dichotomous,
and the TRUE, 1, and yes rows are displayed.
Otherwise, the value to display must be specified in the value
argument, e.g. value = list(varname ~ "level to show").
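The sketch below shows the two overrides mentioned above on a small made-up data frame: a numeric variable with few unique values is forced to "continuous", and a multi-level variable is shown as "dichotomous" with one chosen level. The data frame df and its columns visits and stage are hypothetical and exist only for this illustration.

```r
library(gtsummary)

# Hypothetical data: `visits` has only four unique values, so it would
# default to categorical; `stage` has three levels.
df <- data.frame(
  visits = c(1, 2, 3, 2, 1, 3, 2, 4),
  stage  = c("I", "II", "II", "III", "I", "III", "II", "I")
)

tbl_types <- tbl_summary(
  df,
  type = list(
    visits ~ "continuous",   # summarise on one row as a continuous variable
    stage ~ "dichotomous"    # single row instead of one row per level
  ),
  value = list(stage ~ "III")  # the level reported on that single row
)

tbl_types
```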
Daniel D. Sjoberg
See tbl_summary vignette for detailed tutorial
See table gallery for additional examples
Review list, formula, and selector syntax used throughout gtsummary
# Example 1 ----------------------------------
library(gtsummary)
library(dplyr) # provides select()

trial |>
  select(age, grade, response) |>
  tbl_summary()
# Example 2 ----------------------------------
trial |>
  select(age, grade, response, trt) |>
  tbl_summary(
    by = trt,
    label = list(age = "Patient Age"),
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    digits = list(age = c(0, 1))
  )
# Example 3 ----------------------------------
trial |>
  select(age, marker) |>
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min}, {max}"),
    missing = "no"
  )