tsummary: Summary of All Numeric Variables in a Data Frame
In stopsack/khsmisc: Miscellaneous functions for epidemiology research

tsummary

R Documentation

Summary of All Numeric Variables in a Data Frame

Description

Calculate descriptive summary statistics of all numeric variables in a given dataset. Optionally, this output can be stratified by one or more categorical variable(s).

Usage

tsummary(data, ..., by = NULL, na.rm = TRUE)

Arguments

`data`	Data frame (tibble).
`...`	Optional. Variables to summarize. If not provided, all numeric variables will be summarized. Supports tidy evaluation; see examples.
`by`	Optional. Categorical variable(s) to stratify results by.
`na.rm`	Optional. Drop missing values before calculating summary statistics? If set to `FALSE`, summary statistics may be missing in the presence of missing values. Defaults to `TRUE`.
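
The effect of `na.rm` can be illustrated with base R alone. This is a minimal sketch of the general behavior described above, not a call to `tsummary()` itself; the vector `x` is made up for illustration.

```r
# Base R illustration only; tsummary() is not called here.
# 'x' is a made-up vector with one missing value.
x <- c(1, 2, NA, 4)

# Missing value is dropped before averaging:
mean(x, na.rm = TRUE)

# The result is NA because one value is missing:
mean(x, na.rm = FALSE)
```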

Value

Tibble, possibly grouped, with the following columns:

`rows`	Row count
`obs`	Count of non-missing observations
`distin`	Count of distinct values
`min`	Minimum value
`q25`	25th percentile
`median`	Median (50th percentile)
`q75`	75th percentile
`max`	Maximum value
`mean`	Mean
`sd`	Standard deviation
`sum`	Sum of all values
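
To make the columns above concrete, the following base R sketch computes the same kinds of statistics for a single numeric vector. The helper `summarize_one()` is hypothetical and not part of khsmisc. It assumes that missing values are always dropped, that they do not count as a distinct value, and that quantiles use R's default method; `tsummary()` may differ on any of these points.

```r
# Hypothetical helper for illustration; not part of khsmisc.
# Assumptions: missing values are always dropped, NA is not counted
# as a distinct value, and quantile() uses its default method.
summarize_one <- function(x) {
  data.frame(
    rows   = length(x),                     # all elements, including NA
    obs    = sum(!is.na(x)),                # non-missing elements only
    distin = length(unique(x[!is.na(x)])),  # distinct non-missing values
    min    = min(x, na.rm = TRUE),
    q25    = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    median = median(x, na.rm = TRUE),
    q75    = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    max    = max(x, na.rm = TRUE),
    mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    sum    = sum(x, na.rm = TRUE)
  )
}

# One-row summary of a single variable from the built-in mtcars data:
summarize_one(mtcars$mpg)
```

Unlike this sketch, `tsummary()` handles many variables at once and supports stratification via `by`.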

Examples

library(khsmisc)
library(dplyr)  # provides the %>% pipe
data(mtcars)
mtcars %>%
  tsummary()

# Select specific variables and
# remove some summary statistics:
mtcars %>%
  tsummary(mpg, cyl, hp, am, gear, carb) %>%
  dplyr::select(-mean, -sd, -sum)

# Stratify by 'gear':
mtcars %>%
  tsummary(mpg, hp, carb, by = gear)

# Stratify by 'gear' and 'am':
mtcars %>%
  tsummary(mpg, hp, carb, by = c(am, gear))

stopsack/khsmisc documentation built on Sept. 22, 2023, 12:26 p.m.