Tplyr: A grammar of summary data for clinical reports

Tplyr R Documentation

A grammar of summary data for clinical reports


Lifecycle: experimental


'Tplyr' is a package dedicated to simplifying the data manipulation necessary to create clinical reports. Clinical data summaries can often be broken down into two categories: counting discrete variables (or counting shifts in state), and descriptive statistics around a continuous variable. Many of the tables that go into a clinical report are made up of these two scenarios. By abstracting this process away, 'Tplyr' allows you to rapidly build these tables without worrying about the underlying data manipulation.

'Tplyr' takes this process a few steps further by abstracting away most of the programming that goes into proper presentation, which is where a great deal of programming time is spent. For example, 'Tplyr' allows you to easily control:

String formatting

Different reports warrant different presentation of your strings. Programming this can get tedious, as you typically want to make sure that your decimals align properly. 'Tplyr' abstracts this process away and provides you with a simple interface to specify how you want your data presented.

Treatment groups

Need a total column? Need to group summaries of multiple treatments? 'Tplyr' makes it simple to add treatment groups to your report.

Denominators

The denominators behind n (%) counts often vary based on the summary being performed. 'Tplyr' allows you to easily control which denominators are used, based on a few common scenarios.

Sorting

Summarizing data is one thing, but ordering it for presentation is another. 'Tplyr' automatically derives sorting variables to give you the data you need to order your table properly. This process is flexible, so you can easily get what you want by leveraging your data or characteristics of R.
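The four controls above can be sketched together in one small table. This is only an illustration under stated assumptions: the data frame `dat` and its columns (`arm`, `region`, `sex`, `age`) are hypothetical and invented for this example, and the format strings are arbitrary choices rather than recommended defaults.

```r
library(Tplyr)
library(magrittr)

# Hypothetical toy data, invented for illustration only
dat <- data.frame(
  arm    = rep(c("Placebo", "Low", "High"), each = 4),
  region = rep(c("East", "West"), times = 6),
  sex    = c("F", "M", "M", "F", "F", "F", "M", "M", "M", "F", "M", "F"),
  age    = c(61, 58, 70, 66, 59, 73, 64, 68, 71, 62, 57, 69),
  stringsAsFactors = FALSE
)

tab <- tplyr_table(dat, arm) %>%
  # Treatment groups: a total column plus a pooled "Treated" column
  add_total_group() %>%
  add_treat_grps("Treated" = c("Low", "High")) %>%
  add_layer(
    group_desc(age) %>%
      # String formatting: widths and decimals are declared in the f_str
      set_format_strings("Mean (SD)" = f_str("xx.x (xx.xx)", mean, sd))
  ) %>%
  add_layer(
    group_count(sex, by = region) %>%
      set_format_strings(f_str("xx (xx.x%)", n, pct)) %>%
      # Denominators: percentages within each arm and region
      set_denoms_by(arm, region)
  ) %>%
  build()

# Sorting: build() returns derived ord_* columns alongside the results
ord_cols <- grep("^ord_", names(tab), value = TRUE)
tab_sorted <- tab[do.call(order, unname(as.list(tab[ord_cols]))), ]
```

The result columns are named after the treatment values (for example `var1_Total`), and the `ord_*` columns can be dropped once the rows are in presentation order.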

Another powerful aspect of 'Tplyr' is the objects themselves. 'Tplyr' does more than format your data. Metadata about your table is kept under the hood, and functions allow you to access the information that you need. For example, 'Tplyr' allows you to access the raw numeric results behind the formatted values and easily pick out just the pieces of information that you need.
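As a minimal sketch of that access, the example below names each layer and then pulls the unformatted results with `get_numeric_data()`. The layer names (`"mpg"`, `"carb"`) are arbitrary labels chosen for this illustration.

```r
library(Tplyr)
library(magrittr)

# Name the layers so they can be referred to later
t <- tplyr_table(mtcars, gear) %>%
  add_layer(name = "mpg", group_desc(mpg, by = cyl)) %>%
  add_layer(name = "carb", group_count(carb, by = cyl))

# Formatted, presentation-ready output
formatted <- build(t)

# Raw numeric results: one data frame per layer
all_layers <- get_numeric_data(t)

# Just one layer, optionally filtered to one treatment group
carb_counts <- get_numeric_data(t, layer = "carb")
carb_gear3  <- get_numeric_data(t, layer = "carb", where = gear == 3)
```

Keeping the numeric results alongside the formatted strings makes it possible to check or reuse a number without parsing it back out of the display text.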

Lastly, 'Tplyr' was built to be flexible, yet intuitive. A common pitfall of building tools like this is over-automation. By doing too much, you end up not doing enough. 'Tplyr' aims to hit the sweet spot in between. Additionally, we designed our function interfaces to be clean. Modifier functions offer you flexibility when you need it, but defaults can be set to keep the code concise. This allows you to quickly assemble your table and easily make changes where necessary.
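One way to read "defaults can be set" is a table-level default that every count layer inherits, so individual layers stay short. The sketch below assumes `set_count_layer_formats()` for that purpose; the bracketed format string is an arbitrary choice for illustration.

```r
library(Tplyr)
library(magrittr)

tab <- tplyr_table(mtcars, gear) %>%
  # Set the count format once for the whole table
  set_count_layer_formats(n_counts = f_str("xx [xx.x%]", n, pct)) %>%
  # Both layers pick up the table-level default
  add_layer(group_count(carb)) %>%
  add_layer(group_count(cyl)) %>%
  build()
```

A layer that needs something different can still call `set_format_strings()` itself, which is the modifier-function side of the same design.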


Maintainer: Mike Stackhouse




# Load Tplyr and the pipe
library(Tplyr)
library(magrittr)

# Use just the defaults
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg, by=cyl)
  ) %>%
  add_layer(
    group_count(carb, by=cyl)
  ) %>%
  build()

# Customize and modify
tplyr_table(mtcars, gear) %>%
  add_layer(
    group_desc(mpg, by=cyl) %>%
      set_format_strings(
        "n"         = f_str("xx", n),
        "Mean (SD)" = f_str("a.a+1 (a.a+2)", mean, sd, empty='NA'),
        "Median"    = f_str("a.a+1", median),
        "Q1, Q3"    = f_str("a, a", q1, q3, empty=c(.overall='NA')),
        "Min, Max"  = f_str("a, a", min, max),
        "Missing"   = f_str("xx", missing)
      )
  ) %>%
  add_layer(
    group_count(carb, by=cyl) %>%
      # Risk differences for gear 5 vs 3 and gear 4 vs 3
      add_risk_diff(
        c('5', '3'),
        c('4', '3')
      ) %>%
      set_format_strings(
        n_counts = f_str('xx (xx%)', n, pct),
        # Difference followed by its lower and upper bounds
        riskdiff = f_str('xx.xxx (xx.xxx, xx.xxx)', dif, low, high)
      ) %>%
      set_order_count_method("bycount") %>%
      set_ordering_cols('4') %>%
      set_result_order_var(pct)
  ) %>%
  build()

# A Shift Table
tplyr_table(mtcars, am) %>%
  add_layer(
    group_shift(vars(row=gear, column=carb), by=cyl) %>%
      set_format_strings(f_str("xxx (xx.xx%)", n, pct))
  ) %>%
  build()

Tplyr documentation built on May 29, 2024, 10:37 a.m.