Introduction to `ttt`: Formatted Tables the Easy Way

set.seed(123)
library(ttt, quietly=TRUE)

Introduction

ttt stands for "The Table Tool" (or, if you prefer, "Tables! Tables! Tables!"). It allows you to create formatted HTML tables in a flexible and convenient way.

Creating nicely formatted tables has traditionally been a pain point in R. Over the years, however, things have gotten a lot better with the emergence of packages that can produce nice-looking tables in HTML (or LaTeX or even Microsoft® Word), of which there are now a large number.^[ A non-exhaustive list includes: flextable, kableExtra, huxtable, htmlTable, tableHTML, ztable, formattable, pixiedust, basictabler, mmtable2, gt, DT, tables, xtable. ] But most of those packages treat tables just like data.frames, i.e. as a grid of rows and columns with very little structure. While some packages do have constructs that allow you to group columns or rows with additional headers and labels, that structure is basically superficial, tacked onto the data.frame after the fact. And while it may achieve the desired result, it tends to require more code and, as a consequence, more effort. Some of these packages also give you the flexibility to control the visual appearance or styling of the tables (fonts, colors, grid lines, spacing, etc.) directly from the R code. This is nice, but achieving that level of flexibility typically requires a fairly complex interface of functions and arguments dedicated to styling, and getting the desired result can take a considerable amount of code and hence, again, effort.

This package takes a different approach. It focuses on the table structure and content, leaving the formatting duties to CSS, a dedicated language that was designed specifically for this purpose (the downside is that it only works for HTML, but we accept this inconvenience). Also, the package follows the philosophy of not trying to solve all problems, but solving some problems well. Design decisions have been made to make some things easy, at the expense of limiting the package's generality (while still keeping it in some sense quite general, as this vignette will demonstrate). That it is not possible with this package to produce all conceivable tables is a given; that was never the intention.

Basic Examples

Before we start, let's load a couple of packages that we will be using:

library(table1, quietly=TRUE)
library(magrittr, quietly=TRUE)

It is worth taking a minute to comment on these packages. The first, table1, is like a "sister" package of ttt (they are both written by the same author). While not a strict requirement, table1 contains some utility functions that can also be quite useful in conjunction with ttt, so most of the time it is a good idea to load table1 along with ttt. The magrittr package contains the well-known "pipe" operator that we will make use of at some point in this vignette, so we load that, too.

With that out of the way, let's start looking at what ttt can do, and how to use it. The types of tables that can be produced are many and varied. At its simplest, the ttt() function can turn a data.frame into an HTML table:

ttt(mtcars)

But this is far from the typical use case. More typically, there is some structure to the data, in the sense that some columns contain values, while others contain keys that are used to group values according to some common characteristic. We will refer to these latter columns as facets, borrowing a term from the ggplot2 package. We will assume that the data are in a "tidy" format, by which we mean that all the values have been placed in a single column (if this is not the case, there are many functions that will allow you to "reshape" the data accordingly).
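
As a minimal sketch of such a reshaping step, the following base R code turns a small "wide" data.frame into the "tidy" layout described above. The data.frame `wide` and its column names are made up for illustration, and the `ttt()` call is shown only as a comment so that the snippet runs with base R alone.

```r
# Hypothetical "wide" data: one column of values per group (a and b).
wide <- data.frame(id = 1:3, a = c(1.5, 2.5, 3.5), b = c(4.5, 5.5, 6.5))

# Stack the value columns into a single column, keeping a key column
# ("variable") that records which group each value came from.
long <- data.frame(
    id       = rep(wide$id, times = 2),
    variable = rep(c("a", "b"), each = nrow(wide)),
    value    = c(wide$a, wide$b))

long

# With all values in one column, the key columns can serve as facets, e.g.:
# ttt(value ~ id | variable, data = long)
```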

For the second example, continuing with the mtcars data, we would like to tabulate the average mpg (miles per gallon) by number of gears (rows) and cylinders (columns). Here is the code:

ttt(mpg ~ gear | cyl, data=mtcars, lab="Number of Cylinders", render=mean)

Let's break this down. The first argument is a formula with 3 parts: <values> ~ <row facets> | <column facets>. The first part, to the left of the symbol ~, is the name of the column that contains the values (recall that in the "tidy" format there is only one such column). The second part, between the symbols ~ and | contains the row facets, one or more variables that define how the values should be split into rows. The third part, to the right of the symbol | is the column facets, one or more variables that define how the values should be split into columns.
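
To make the three parts concrete, here is how R itself parses such a formula. This is only a sketch of the language-level structure; whether ttt processes the formula in exactly this way internally is not something this vignette states.

```r
# A three-part formula as used in this vignette.
f <- mpg ~ gear | cyl

values  <- f[[2]]     # left of ~   : mpg
rhs     <- f[[3]]     # right of ~  : gear | cyl
op      <- rhs[[1]]   # the operator: `|`
rowvars <- rhs[[2]]   # left of |   : gear
colvars <- rhs[[3]]   # right of |  : cyl

all.vars(f)           # "mpg" "gear" "cyl"
```

Because `|` binds more tightly than `~` but less tightly than `+`, a formula such as `x ~ a + b | c + d` is split at the `|` first, with the `+` terms grouped on either side.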

Following the formula comes the data argument, which is the data.frame that contains the data that the formula refers to. The next argument lab is an optional label placed over all the columns. The last argument, render, is a function. This function is called for each grouping of data defined by unique combinations of the row and column facets, and produces the value that appears in the corresponding cell of the table. Hopefully this is fairly intuitive.
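
As a rough base R analogy (not a description of how ttt is implemented), the same split-and-apply idea can be written with `tapply()`. One difference to keep in mind: `tapply()` fills empty groups with `NA` instead of calling the function on them.

```r
# Split mpg by gear (rows) and cyl (columns) and apply mean() to each group.
cellmeans <- with(mtcars, tapply(mpg, list(gear = gear, cyl = cyl), mean))
cellmeans

# The combination of 4 gears and 8 cylinders has no cars in it.
sum(mtcars$gear == 4 & mtcars$cyl == 8)   # 0
```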

Now, the above table is nice, but still not quite what we want. Here are the issues that need to be addressed:

  1. The label "gear" should be changed to "Number of Gears".

  2. One table cell contains the cryptic value "NaN" because there aren't any cars with 8 cylinders and 4 gears in our dataset; we would like this cell to remain empty instead.

  3. The number of decimal digits is different in each cell; we would like it to be the same throughout the table (1 decimal digit).

Let us now address these issues.

label(mtcars$gear) <- "Number of<br/>Gears"

rndr <- function(x, ...) {
    if (length(x) == 0) return("")
    round_pad(mean(x), 1)
}

ttt(mpg ~ gear | cyl, data=mtcars, lab="Number of Cylinders", render=rndr)

The way we addressed the first issue was to add a label to the variable gear using the label() function (one of the useful utility functions from the table1 package). The two other issues were fixed by defining a function rndr() to do the rendering.
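
To see why the length check is needed, the sketch below reproduces the logic of rndr() in base R. The name `rndr_base` is hypothetical, and `sprintf()` stands in for `round_pad()` from table1, so rounding of exact ties may differ slightly between the two.

```r
# mean() of an empty vector is NaN, which is what showed up in the table.
is.nan(mean(numeric(0)))   # TRUE

# Base R stand-in for rndr(): sprintf() replaces table1::round_pad().
rndr_base <- function(x, ...) {
    if (length(x) == 0) return("")   # empty group: leave the cell blank
    sprintf("%.1f", mean(x))         # always one decimal digit
}

rndr_base(numeric(0))   # ""
rndr_base(c(1, 2, 3))   # "2.0"
rndr_base(c(20, 21))    # "20.5"
```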

The ttt() function allows the order of the formula and data arguments to be switched, so that an alternative syntax using the magrittr "pipe" operator may be used:

mtcars %>% ttt(mpg ~ gear | cyl, lab="Number of Cylinders", render=rndr)

Facets

The facets allow you to "slice-and-dice" the data however you want. The column facet is optional; it can be omitted:

ttt(mpg ~ gear, data=mtcars, render=rndr)

The row facet is required by the formula syntax, but the "magic" value 1 may be used to indicate no splitting by rows:

ttt(mpg ~ 1 | cyl, data=mtcars, lab="Number of Cylinders", render=rndr)

Both row and column facets may consist of more than one variable, joined together by the symbol +. Here is an example with 2 row facets:

label(mtcars$cyl) <- "Number of<br/>Cylinders"
ttt(mpg ~ gear + cyl, data=mtcars, render=rndr)

And, similarly for 2 column facets:

ttt(mpg ~ 1 | gear + cyl, data=mtcars, lab="Number of Cylinders/Gears", render=rndr)

The order of the variables is obviously important, as they define a nesting structure. If we think of the | that separates the row and column facets as the "values" (i.e. the central part of the table), then the order makes sense: variables closer to the center (|) are grouped within variables that are farther away.

Just to demonstrate, here is a synthetic example of a large table where both row and column facets are nested 3 levels deep:

bigtable <- expand.grid(
    R1=LETTERS[1:3],
    R2=LETTERS[4:6],
    R3=LETTERS[7:9],
    C1=LETTERS[10:12],
    C2=LETTERS[13:15],
    C3=LETTERS[16:18])

bigtable$x <- 1:nrow(bigtable)
ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable)
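
The nesting rule can be illustrated without ttt by sorting the synthetic data in base R. This sketch assumes that labels are laid out in factor-level order, which is an assumption for illustration rather than something stated above; the names `ord` and `rowkeys` are made up.

```r
# Same synthetic data as above.
bigtable <- expand.grid(
    R1=LETTERS[1:3], R2=LETTERS[4:6], R3=LETTERS[7:9],
    C1=LETTERS[10:12], C2=LETTERS[13:15], C3=LETTERS[16:18])

nrow(bigtable)   # 3^6 = 729 rows, one per cell of the table

# For the row facets R3 + R2 + R1, R1 is closest to | and so is innermost.
# Sorting by R3, then R2, then R1 lists the row labels in that nesting order.
ord <- order(bigtable$R3, bigtable$R2, bigtable$R1)
rowkeys <- unique(bigtable[ord, c("R3", "R2", "R1")])
head(rowkeys, 4)
```

The first three combinations share the same R3 and R2 values and differ only in R1; the fourth moves on to the next R2 value.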

Render functions

The render function gives a lot of flexibility. For example, instead of the mean mpg, we can list the cars according to their gear/cylinder combination:

ttt(rownames(mtcars) ~ gear | cyl, data=mtcars, lab="Number of Cylinders",
  render=paste, collapse="<br/>")

Note that additional arguments can be passed to the render function through ... (as in this case, collapse).
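
The forwarding of extra arguments follows the usual R pattern for `...`. The helper `render_cell()` below is hypothetical and only mimics how a render function might be called for a single cell; the car names are made up.

```r
# Hypothetical stand-in for how a table function might call its render
# function for one cell: extra arguments are simply forwarded via ...
render_cell <- function(x, render, ...) render(x, ...)

cars3 <- c("Car A", "Car B", "Car C")   # made-up values for one cell
render_cell(cars3, paste, collapse="<br/>")   # "Car A<br/>Car B<br/>Car C"
```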

Furthermore, a render function can return more than one value in a named vector. In this case, the argument expand.along also comes into play. It determines how the multiple values are laid out, either vertically (along rows) or horizontally (along columns).

For example, here we define a function that computes both the mean and standard deviation (SD), each to 3 significant digits, and apply it to the response variable in the OrchardSprays dataset (i.e., decrease) according to treatment:

rndr.meansd <- function(x) signif_pad(c(Mean=mean(x), SD=sd(x)), digits=3)

ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd)

The default is expand.along="rows", which produces the result above. As we can see, the table contains an additional column with the values "Mean" and "SD", and each value is displayed in its own row. By default, the extra column is labeled "Statistic", but to change the label to "Blah" we can specify a named vector as follows:

ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd, expand.along=c(Blah="rows"))

The other option is expand.along="columns", which produces this result:

ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd, expand.along="columns")

Now "Mean" and "SD" each have their own column.
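
The two layouts can be loosely mimicked in base R, which may help to picture what the named vector returned by the render function turns into. This is an analogy only: `meansd` is a hypothetical stand-in that uses `signif()` instead of `signif_pad()`, so it returns numbers rather than padded strings.

```r
# Base R stand-in for rndr.meansd(): a named vector with two statistics.
meansd <- function(x) signif(c(Mean=mean(x), SD=sd(x)), 3)

# One named vector per treatment, collected into a 2 x 8 matrix.
stats <- sapply(split(OrchardSprays$decrease, OrchardSprays$treatment), meansd)

# "Wide" layout: one row per treatment, "Mean" and "SD" as columns
# (comparable to expand.along="columns").
t(stats)

# "Long" layout: each statistic in its own row within each treatment
# (comparable to expand.along="rows").
long <- as.data.frame(as.table(stats), responseName="value")
head(long, 4)
```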

Captions and footnotes

A caption and one or more footnotes can be added to the table by specifying string values to the caption and footnote arguments, respectively:

ttt(decrease ~ 1 | treatment, data=OrchardSprays, render=rndr.meansd, lab="Treatment",
    caption="Mean and SD of Decrease by Treatment",
    footnote=c("Data: OrchardSprays", "Comment: <code>ttt</code> is cool!"))

There's not much more to say about this.

Styling

The appearance of the tables produced by ttt can be changed in 2 ways: using themes, or custom styling. Themes are easier, but don't give much flexibility. For fine-level control, custom styling must be used.

Themes

The ttt package comes with 2 themes at this time: the default theme that has been used throughout this vignette so far, and the booktabs theme. (More themes may be added later.) Selecting the theme can be done using the ttt.theme global option:

options(ttt.theme="booktabs")  # Select the "booktabs" theme

If we select the booktabs theme, our large table looks like this:

ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable)
css <- readLines(system.file(package="ttt", "ttt_booktabs_1.0/ttt_booktabs.css"))
css <- gsub(".Rttt ", ".Rttt-booktabs-demo ", css, fixed=TRUE)
ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable, topclass="Rttt-booktabs-demo", css=css)

Note that a theme can only apply to a whole document; it is not possible to selectively style different tables within the same document differently using different themes, as we appear to have done here (but it can be done with custom styling, which is how it was done).

options(ttt.theme="default")  # Change back to the "default" theme
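
Instead of hard-coding the value to switch back to, the previous setting can be saved and restored. This is a general base R idiom for `options()` and is not specific to ttt; the variable names `before` and `old` are made up.

```r
before <- getOption("ttt.theme")            # whatever was set before (may be NULL)

old <- options(ttt.theme="booktabs")        # options() returns the previous value(s)
getOption("ttt.theme")                      # "booktabs"

# ... produce tables here ...

options(old)                                # restore the previous setting
identical(getOption("ttt.theme"), before)   # TRUE
```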

Custom styling

As mentioned in the introduction, changing the table's appearance is accomplished using CSS. In order to make this possible, ttt places "hooks" in the table in the form of class attributes on various HTML elements.

The first thing to know is that the whole table is enclosed in a <div class="Rttt"> element. This allows specific formatting to be applied to tables output by ttt without interfering with other tables in the same document.

The next thing is that all row labels have the class Rttt-rl, and all column labels have the class Rttt-cl. Furthermore, there are different classes for each level of nesting: Rttt-rl-lvl1 for the first (i.e. innermost) level of row labels, Rttt-rl-lvl2 for the second level, and so on, and similarly for the column labels with cl instead of rl.

Finally, it is possible to add a class attribute or id attribute to the whole table, so that it may be targeted with specific CSS selectors. We can also pass CSS code directly to the ttt() function to be included with the table.

For example, we can specify that our table has the ID bigtable, and then give it a particular (and particularly weird) style:

css <- '
#bigtable {
  font-family: "Lucida Console", Monaco, monospace;
}
#bigtable td {
  background-color: #eee;
}
#bigtable th {
  color: blue;
  background-color: lightblue;
}
#bigtable th, #bigtable td {
  border: 2px dashed orange;
}
#bigtable .Rttt-rl {
  background-color: #fff;
  font-style: italic;
  font-weight: bold;
}
#bigtable .Rttt-rl-lvl1 {
  font-size: 12pt;
  color: pink;
  background-color: yellow;
}
#bigtable .Rttt-rl-lvl2 {
  font-size: 14pt;
  color: green;
}
#bigtable .Rttt-rl-lvl3 {
  font-size: 18pt;
  color: red;
}
'

ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable, id="bigtable", css=css)
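
Since the CSS is just a character string, repetitive rules such as the per-level ones can also be generated from R. The helper `rowlabel_css()` below is hypothetical; only the Rttt-rl-lvl class names come from the description above, and the `ttt()` call is left as a comment.

```r
# Hypothetical helper: one CSS rule per nesting level of the row labels,
# using the Rttt-rl-lvl<N> class names described above.
rowlabel_css <- function(id, sizes) {
    lvl <- seq_along(sizes)
    rules <- sprintf("#%s .Rttt-rl-lvl%d { font-size: %dpt; }",
                     id, lvl, as.integer(sizes))
    paste(rules, collapse="\n")
}

css2 <- rowlabel_css("bigtable", c(12, 14, 18))
cat(css2, "\n")

# The result could then be passed on, e.g.:
# ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable, id="bigtable", css=css2)
```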

Example: conditional formatting

A render function can actually add a html.class attribute to its return value. The value of this attribute will be assigned to the resulting HTML element's class attribute, allowing you to target that element with specific formatting.

For example, suppose we have a data.frame that contains some numeric values, and we want to put them in a table:

dat <- expand.grid(row=LETTERS[1:5], column=LETTERS[1:5])
dat$value <- rnorm(nrow(dat))

ttt(value ~ row | column, data=dat, render=round_pad, digits=2)

Furthermore, suppose we want the cells that contain negative values to be red, and those that contain positive values to be green. (Note: this can actually be easily accomplished using JavaScript, but that's not the point of this example). Let's define a render function for this:

rndr <- function(x, ...) {
    y <- round_pad(x, 2)
    attr(y, "html.class") <- ifelse(x < 0, "neg", "pos")
    y
}

Since there are no zeros in this example, the code cheats and treats zero as positive; if you don't like this, think of the "pos" class as meaning non-negative.
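
The attribute mechanism itself is plain R and can be checked without producing a table. In this sketch the name `rndr_class` is hypothetical and `sprintf()` stands in for `round_pad()`; the attribute handling is the same as in the render function above.

```r
# Base R stand-in for the render function above.
rndr_class <- function(x, ...) {
    y <- sprintf("%.2f", x)
    attr(y, "html.class") <- ifelse(x < 0, "neg", "pos")
    y
}

attr(rndr_class(-1.5), "html.class")   # "neg"
attr(rndr_class(0.25), "html.class")   # "pos"
attr(rndr_class(0), "html.class")      # "pos" (zero counts as non-negative)
```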

The render function sets the desired class on the elements. Now, we can add some CSS code to obtain the desired colors:

```{css, echo=TRUE}
.neg {
  color: #990000;
  background-color: #ff000030;
}
.pos {
  color: #007700;
  background-color: #00ff0030;
}
```

Finally, we generate the table:

```r
ttt(value ~ row | column, data=dat, render=rndr)
```



ttt documentation built on May 7, 2021, 5:06 p.m.