mean_tbl: Summarize continuous variables

View source: R/mean_tbl.R

mean_tbl    R Documentation

Description

mean_tbl() calculates summary statistics (i.e., mean, standard deviation, minimum, maximum, and count of non-missing values) for continuous (i.e., interval and ratio-level) variables.

Usage

mean_tbl(
  data,
  var_stem,
  var_input = "stem",
  regex_stem = FALSE,
  ignore_stem_case = FALSE,
  na_removal = "listwise",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character vector with one or more elements, where each represents either a variable stem or the complete name of a variable present in data. A variable 'stem' refers to a common naming pattern shared among related variables, typically reflecting repeated measures of the same idea or a group of items assessing a single concept.

var_input

A character string specifying whether the values supplied to var_stem should be treated as variable stems ("stem") or as complete variable names ("name"). The default is "stem", so the function searches for variables whose names begin with each stem provided. Setting this argument to "name" directs the function to look for variables whose names exactly match the provided names.
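
The difference between the two var_input modes can be sketched in base R. This only illustrates prefix matching versus exact matching as described above; it is not the package's internal code, and the column names are made up for the example.

```r
# Hypothetical column names, for illustration only
cols <- c("dep_1", "dep_2", "dep_total", "anx_1")

# "stem": keep columns whose names begin with the supplied stem
stem_matches <- cols[startsWith(cols, "dep")]

# "name": keep columns whose names exactly match the supplied names
name_matches <- cols[cols %in% c("dep_1", "anx_1")]

print(stem_matches)
print(name_matches)
```

Note that a short stem such as "dep" also picks up "dep_total", so a more specific stem (or "name" mode) may be the safer choice when column names overlap.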

regex_stem

A logical value indicating whether to use Perl-compatible regular expressions when searching for variable stems. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.
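
As a rough picture of what regex_stem and ignore_stem_case control, the base R sketch below matches made-up column names with a Perl-compatible pattern, first case-sensitively and then case-insensitively. How mean_tbl() applies the pattern internally (for example, whether it anchors it) is not stated here, so treat this as an assumption-laden illustration only.

```r
# Hypothetical column names, for illustration only
cols <- c("DEP_1", "dep_2", "anx_1", "mood_dep")

# Perl-compatible pattern: names starting with "dep_" or "anx_" followed by digits
pattern <- "^(dep|anx)_\\d+$"

# Case-sensitive search (analogous to ignore_stem_case = FALSE)
regex_matches <- grep(pattern, cols, perl = TRUE, value = TRUE)

# Case-insensitive search (analogous to ignore_stem_case = TRUE)
ci_matches <- grep(pattern, cols, perl = TRUE, ignore.case = TRUE, value = TRUE)

print(regex_matches)
print(ci_matches)
```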

na_removal

A character string that specifies the method for handling missing values: "pairwise" or "listwise". Defaults to "listwise".
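
The two options can give different results on the same data. The base R sketch below assumes the conventional meanings of the terms (listwise drops any row with a missing value in the selected variables; pairwise lets each variable use all of its own non-missing values). The data frame is invented, and the package's handling may differ in detail.

```r
# Small invented data frame with one missing value per column
df <- data.frame(x = c(1, 2, NA, 4), y = c(10, NA, 30, 40))

# Listwise: drop every row that has a missing value in any column
complete_rows <- df[complete.cases(df), ]
listwise_means <- colMeans(complete_rows)

# Pairwise: each column is summarized using its own non-missing values
pairwise_means <- colMeans(df, na.rm = TRUE)

print(listwise_means)
print(pairwise_means)
```

Under listwise deletion only rows 1 and 4 remain, so both means are based on the same two rows; under pairwise deletion each mean is based on three values.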

only

A character string or vector of character strings specifying which summary statistics to return. Defaults to NULL, which returns all of them: mean ("mean"), standard deviation ("sd"), minimum ("min"), maximum ("max"), and count of non-missing values ("nobs").
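
For reference, the five statistics named above can be computed for a single vector in base R as follows. This is a sketch of what each keyword refers to, not the function's implementation; the vector and the name summary_stats are invented for the example.

```r
# Invented numeric vector with one missing value
x <- c(2, 4, NA, 6)

summary_stats <- c(
  mean = mean(x, na.rm = TRUE),
  sd   = sd(x, na.rm = TRUE),
  min  = min(x, na.rm = TRUE),
  max  = max(x, na.rm = TRUE),
  nobs = sum(!is.na(x))          # count of non-missing values
)

# Keeping a subset, analogous in spirit to only = c("mean", "nobs")
subset_stats <- summary_stats[c("mean", "nobs")]

print(summary_stats)
print(subset_stats)
```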

var_labels

An optional named character vector or list used to assign custom labels to variable names. Each element must be named, and each name must correspond to a variable included in the returned table. If var_input is set to "stem" and any element is either unnamed or refers to a variable not present in the table, all labels are ignored and the table is printed without them.

ignore

An optional named vector or list indicating values to exclude from variables matching specified stems (or names). Defaults to NULL, indicating that all values are retained. To specify exclusions for variables identified by var_stem, use the corresponding stems or variable names as names in the vector or list. To exclude multiple values from these variables, supply them as a named list.
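
One way to picture the effect of ignore is to recode the excluded values as missing before summarizing. The base R sketch below does that for an invented stem and an invented sentinel value of 99. Whether mean_tbl() treats excluded values exactly this way is an assumption, as are all names used here.

```r
# Invented data: 99 stands for a "not applicable" code
df <- data.frame(score_1 = c(3, 99, 5), score_2 = c(4, 2, 99))

# Hypothetical exclusion spec: the name is the stem, the values are what to drop
ignore_spec <- list(score = c(99))
stem <- names(ignore_spec)[1]

# Columns whose names begin with the stem
target <- startsWith(names(df), stem)

# Recode excluded values to NA in the targeted columns only
cleaned <- df
cleaned[target] <- lapply(df[target], function(v) {
  replace(v, v %in% ignore_spec[[stem]], NA)
})

cleaned_means <- colMeans(cleaned, na.rm = TRUE)
print(cleaned_means)
```

Without the exclusion, the value 99 would dominate both means; with it, each mean reflects only the two valid scores per column.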

Value

A tibble showing summary statistics for continuous variables.

Author(s)

Ama Nyame-Mensah

Examples

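library(summarytabl)

# Keep only the child age-group percentage columns from the sdoh example data
sdoh_child_ages <-
  dplyr::select(sdoh, c(ACS_PCT_AGE_0_4, ACS_PCT_AGE_5_9,
                        ACS_PCT_AGE_10_14, ACS_PCT_AGE_15_17))

# Summarize every variable whose name begins with the stem
# (listwise deletion is the default)
mean_tbl(data = sdoh_child_ages, var_stem = "ACS_PCT_AGE")

# Use pairwise deletion and attach a descriptive label to each variable
mean_tbl(data = sdoh_child_ages,
         var_stem = "ACS_PCT_AGE",
         na_removal = "pairwise",
         var_labels = c(
           ACS_PCT_AGE_0_4 = "% of population between ages 0-4",
           ACS_PCT_AGE_5_9 = "% of population between ages 5-9",
           ACS_PCT_AGE_10_14 = "% of population between ages 10-14",
           ACS_PCT_AGE_15_17 = "% of population between ages 15-17"))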

summarytabl documentation built on Nov. 6, 2025, 5:07 p.m.