set_format_strings: Set the format strings and associated summaries to be...

set_format_strings    R Documentation

Set the format strings and associated summaries to be performed in a layer


'Tplyr' gives you extensive control over how strings are presented. set_format_strings allows you to apply these string formats to your layer. Its behavior differs slightly between layer types.


set_format_strings(e, ...)

## S3 method for class 'desc_layer'
set_format_strings(e, ..., cap = getOption("tplyr.precision_cap"))

## S3 method for class 'count_layer'
set_format_strings(e, ...)



e       Layer on which to bind format strings


...     Named parameters containing calls to f_str to set the format strings


cap     A named numeric vector containing an 'int' element for the cap on integer precision, and a 'dec' element for the cap on decimal precision.
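
As a rough sketch of how cap might be used: the example below assumes the auto-precision syntax covered in the f_str documentation ('a' placeholders, which this page does not describe) and caps the precision collected from the data at 3 integer and 2 decimal places. The use of mtcars and the wt variable is purely illustrative.

```r
library(Tplyr)
library(magrittr)

# Illustrative only: wt in mtcars is recorded to three decimal places,
# so a 'dec' cap of 2 limits the decimal precision taken from the data.
capped <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(wt) %>%
      set_format_strings(
        # 'a' requests auto precision; '+1' and '+2' add decimals on top of it
        "Mean (SD)" = f_str("a.a+1 (a.a+2)", mean, sd),
        cap = c(int = 3, dec = 2)
      )
  ) %>%
  build()

capped
```

When cap is not supplied, the default comes from getOption("tplyr.precision_cap"), as shown in the usage above.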


Format strings are one of the most powerful components of 'Tplyr'. Traditionally, converting numeric values into strings for presentation can consume a good deal of time. Values and decimals need to align between rows, and rounding before trimming is sometimes forgotten. It can become a tedious mess that, in the grand scheme of things, is not an important part of the analysis being performed. 'Tplyr' makes this process as simple as we can, while still allowing flexibility to the user.

In a count layer, you can provide a single f_str object to specify how your n's, percentages, and denominators are formatted. If you are also supplying a statistic, such as a risk difference via add_risk_diff, specify the count formats using the name 'n_counts' and the risk difference formats using the name 'riskdiff'.

In a descriptive statistic layer, set_format_strings allows you to do a few more things:

  • By naming parameters with character strings, those character strings become row labels in the resulting data frame

  • The actual summaries that are performed come from the variable names used within the f_str calls

  • Using multiple summaries (declared by your f_str calls), multiple summary values can appear within the same line. This is how you present displays like "Mean (SD)".

  • Format strings in the desc layer also allow you to configure how empty values are presented. In the f_str call, use the empty parameter to specify how missing values should be presented. Provide a single-element character vector. If the vector is unnamed, that value is used within the format string and fills the space the same way the numbers would. That is, if your empty string is 'NA' and your format string is 'xx (xxx)', the empty values will populate as 'NA ( NA)'. If you name the element '.overall', as in empty = c(.overall=''), then that exact string replaces the whole formatted value instead. For example, providing c(.overall='NA') produces the formatted string 'NA' exactly.
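
The points in the list above can be sketched together in one descriptive statistics layer. This is a minimal illustration, not a canonical recipe: the data set dat is a hypothetical copy of mtcars in which mpg has been blanked out for one group so that the empty parameter has something to act on, and the exact strings produced are not shown here.

```r
library(Tplyr)
library(magrittr)

# Hypothetical data: remove mpg for one treatment group so the summaries
# for that group have nothing to compute on.
dat <- mtcars
dat$mpg[dat$gear == 5] <- NA

desc_tbl <- tplyr_table(dat, gear) %>%
  add_layer(
    group_desc(mpg) %>%
      set_format_strings(
        # Parameter names become the row labels
        "n"         = f_str("xx", n),
        # Two summaries (mean, sd) share one row; the unnamed empty value
        # is placed inside the format string
        "Mean (SD)" = f_str("xx.x (xx.xx)", mean, sd, empty = "NA"),
        # A '.overall' empty value replaces the entire formatted string
        "Q1, Q3"    = f_str("xx.x, xx.x", q1, q3, empty = c(.overall = "NA"))
      )
  ) %>%
  build()

# One row per named format string, one result column per treatment group
desc_tbl[, c("row_label1", "var1_3", "var1_4", "var1_5")]
```

Comparing the var1_5 column between the "Mean (SD)" and "Q1, Q3" rows is one way to see the difference between the unnamed and '.overall' forms of empty.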

See the f_str documentation for more details about how this implementation works.
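
For the count layer case with an added risk difference described earlier, a minimal sketch might look like the following. The comparison of gear 4 against gear 3 in mtcars is chosen only for illustration, and the dif, low, and high variable names are taken from the add_risk_diff documentation rather than from this page.

```r
library(Tplyr)
library(magrittr)

rd_tbl <- tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      # Compare gear == 4 against gear == 3 as the reference group
      add_risk_diff(c("4", "3")) %>%
      set_format_strings(
        # Formats for the counts themselves
        "n_counts" = f_str("xx (xx.x%)", n, pct),
        # Formats for the risk difference and its confidence limits
        "riskdiff" = f_str("xx.xxx (xx.xxx, xx.xxx)", dif, low, high)
      )
  ) %>%
  build()

rd_tbl
```

With counts as small as those in mtcars, the underlying proportion test may emit warnings about the approximation; these do not prevent the table from building.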


The layer environment with the format string binding added

tplyr_layer object with formats attached

Returns the modified layer object.


# Load in pipe
library(Tplyr)
library(magrittr)

# In a count layer
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_count(cyl) %>%
      set_format_strings(f_str('xx (xx%)', n, pct))
  ) %>%
  build()

# In a descriptive statistics layer
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg) %>%
      set_format_strings(
        "n"         = f_str("xx", n),
        "Mean (SD)" = f_str("xx.x", mean, empty='NA'),
        "SD"        = f_str("xx.xx", sd),
        "Median"    = f_str("xx.x", median),
        "Q1, Q3"    = f_str("xx, xx", q1, q3, empty=c(.overall='NA')),
        "Min, Max"  = f_str("xx, xx", min, max),
        "Missing"   = f_str("xx", missing)
      )
  ) %>%
  build()

# In a shift layer
tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct))
  ) %>%
  build()

Tplyr documentation built on May 29, 2024, 10:37 a.m.