tsummary: Summary of All Numeric Variables in a Data Frame

tsummary    R Documentation

Summary of All Numeric Variables in a Data Frame

Description

Calculate descriptive summary statistics of all numeric variables in a given dataset. Optionally, this output can be stratified by one or more categorical variable(s).

Usage

tsummary(data, ..., by = NULL, na.rm = TRUE)

Arguments

data

Data frame (tibble).

...

Optional. Variables to summarize. If not provided, all numeric variables will be summarized. Supports tidy evaluation; see examples.

by

Optional. Categorical variable(s) to stratify results by.

na.rm

Optional. Drop missing values before calculating summary statistics? If set to FALSE, summary statistics may be missing in the presence of missing values. Defaults to TRUE.
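The effect of na.rm can be illustrated with base R alone. This is a minimal sketch of the general idea, using a made-up vector x; it does not reproduce how tsummary() handles missing values internally.

```r
# Base R illustration of na.rm; not the tsummary() internals.
# 'x' is a made-up example vector with one missing value.
x <- c(1, 2, NA, 4)

mean_keep <- mean(x, na.rm = FALSE)  # the missing value propagates
mean_drop <- mean(x, na.rm = TRUE)   # mean of the non-missing values 1, 2, 4

print(mean_keep)
print(mean_drop)
```

With na.rm = FALSE, a single missing value is enough to make the statistic itself missing, which is the behavior the argument description warns about.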

Value

Tibble, possibly grouped, with the following columns:

  • rows Row count

  • obs Count of non-missing observations

  • distin Count of distinct values

  • min Minimum value

  • q25 25th percentile

  • median Median, 50th percentile

  • q75 75th percentile

  • max Maximum value

  • mean Mean

  • sd Standard deviation

  • sum Sum of all values
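As a rough guide to what these columns mean, the statistics for a single numeric vector could be computed in base R as sketched below. The helper summarize_one() is hypothetical and written for illustration only; it is not part of khsmisc. It assumes that missing values are always dropped, that distinct values are counted among non-missing values, and that R's default quantile type is used, none of which is confirmed by this page.

```r
# Hypothetical helper for illustration only; not part of khsmisc.
# Assumptions: missing values are always dropped, distinct values are
# counted among non-missing values, and quantile() uses its default type.
summarize_one <- function(x) {
  data.frame(
    rows   = length(x),                    # all elements, including NA
    obs    = sum(!is.na(x)),               # non-missing elements
    distin = length(unique(x[!is.na(x)])), # distinct non-missing values
    min    = min(x, na.rm = TRUE),
    q25    = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    median = median(x, na.rm = TRUE),
    q75    = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    max    = max(x, na.rm = TRUE),
    mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    sum    = sum(x, na.rm = TRUE)
  )
}

# Made-up example vector with one missing value.
res <- summarize_one(c(2, 4, 4, NA, 10))
print(res)
```

Note how rows and obs differ whenever the vector contains missing values: rows counts every element, while obs counts only the non-missing ones.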

Examples

library(khsmisc)  # provides tsummary()
library(dplyr)    # provides the %>% pipe
data(mtcars)
mtcars %>%
  tsummary()

# Select specific variables and
# remove some summary statistics:
mtcars %>%
  tsummary(mpg, cyl, hp, am, gear, carb) %>%
  dplyr::select(-mean, -sd, -sum)

# Stratify by 'gear':
mtcars %>%
  tsummary(mpg, hp, carb, by = gear)

# Stratify by 'gear' and 'am':
mtcars %>%
  tsummary(mpg, hp, carb, by = c(am, gear))
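
To clarify what stratifying by a variable means, the sketch below splits one variable by 'gear' in base R and summarizes each stratum separately. It is an illustration of the concept under that assumption, not the implementation of tsummary(), and it computes only a few of the statistics.

```r
# Base R sketch of stratification; not how tsummary() is implemented.
data(mtcars)

# One numeric vector of 'mpg' values per level of 'gear'.
mpg_by_gear <- split(mtcars$mpg, mtcars$gear)

# Summarize each stratum separately.
strata <- data.frame(
  gear   = names(mpg_by_gear),
  rows   = unname(sapply(mpg_by_gear, length)),
  median = unname(sapply(mpg_by_gear, median)),
  mean   = unname(sapply(mpg_by_gear, mean))
)
print(strata)
```

Each row of the result corresponds to one level of the stratification variable, and the row counts across strata add up to the number of rows in the data frame.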

stopsack/khsmisc documentation built on Sept. 22, 2023, 12:26 p.m.