set.seed(123) library(ttt, quietly=TRUE)
ttt
stands for "The Table Tool" (or, if you prefer "Tables! Tables!
Tables!"). It allows you to creates formatted HTML tables of in a flexible and
convenient way.
Creating nice formatted tables has traditionally been a pain points with R.
Over the years, however, things have gotten a lot better, with the emergence of
packages that can produce nice looking tables in HTML (or LaTex or even
Microsoft® Word), or which there are now a large number.^[
A non-exhaustive list includes:
flextable,
kableExtra,
huxtable,
htmlTable,
tableHTML,
ztable,
formattable,
pixiedust,
basictabler,
mmtable2,
gt,
DT,
tables,
xtable.
]
But most of those packages treat tables just like data.frames
, i.e. a grid of
rows and columns, with very little structure. While some packages do have
constructs that allow you to group columns or rows with additional headers and
labels, that structure is basically superficial, tacked onto the data.frame
after the fact. And while it may achieve the desired result it tends to require
more code and by consequence, effort. Another thing that some of these
packages do is they give you the flexibility to control the visual appearance
or styling of the tables (fonts, colors, grid lines, spacing, etc.) directly
from the R code. This is nice, but typically achieving that level of
flexibility requires a pretty complex interface of functions and arguments
dedicated to styling, and achieving the desired result can take a considerable
amount of code and hence again, effort.
This package takes a different approach. It focuses on the table structure and content, leaving the formatting duties to CSS, a dedicated language that was designed specifically for this purpose (the downside is that it only works for HTML, but we accept this inconvenience). Also, the package follows the philosophy of not trying to solve all problems, but solving some problems well. Design decisions have been made to make some things easy, at the expense of limiting the package's generality (while still keeping in it some sense quite general, as this vignette will demonstrate). That it is not possible with this package to produce all conceivable tables is a given; that was never the intention.
Before we start, let's load a couple of packages that we will be using:
library(table1, quietly=TRUE) library(magrittr, quietly=TRUE)
It is worth taking a minute to comment on these packages. The first, table1
,
is like a "sister" package of ttt
(they are both written by the same author).
While not a strict requirement, table1
contains some utility functions that
can also be quite useful in conjunction with ttt
, so most of the time it is
a good idea to load table1
along with ttt
. The magrittr
package
contains the well-known "pipe" operator that we will make use of at some point
in this vignette, so we load that, too.
With that out of the way, let's start looking at what ttt
can do, and how to
use it. The types of tables that can be produced are many and varied. At its
simplest, the ttt()
function can turn a data.frame
into an HTML table:
ttt(mtcars)
But this is far from the typical use case. More typically, there is some
structure to the data, in the sense that some columns contain values, while
others contain keys that are used to group values according to some common
characteristic. We will refer to these latter columns as facets, borrowing a
term from the ggplot2
package. We will assume that the data are in a "tidy"
format, by which we mean that all the values have been placed in a single
column (if this is not the case, there are many functions that will allow you
to "reshape" the data accordingly).
For the second example, continuing with the mtcars
data, we would like to
tabulate the average mpg
(miles per gallon) by number of gears (rows) and
cylinders (columns). Here is the code:
ttt(mpg ~ gear | cyl, data=mtcars, lab="Number of Cylinders", render=mean)
Let's break this down. The first argument is a formula
with 3 parts:
<values> ~ <row facets> | <column facets>
. The first part, to the left of the
symbol ~
, is the name of the column that contains the values (recall that in
the "tidy" format there is only one such column). The second part, between the
symbols ~
and |
contains the row facets, one or more variables that
define how the values should be split into rows. The third part, to the right
of the symbol |
is the column facets, one or more variables that define how
the values should be split into columns.
Following the formula
comes the data
argument, which is the data.frame
that contains the data that the formula
refers to. The next argument lab
is an optional label placed over all the columns. The last argument, render
,
is a function
. This function is called for each grouping of data defined by
unique combinations of the row and column facets, and produces the value that
appears in the corresponding cell of the table. Hopefully this is fairly
intuitive.
Now, the above table is nice, but still not quite what we want. Here are the issues that need to be addressed:
The label "gear" should be changed to "Number of Gears".
One table cell contains the cryptic value "NaN" because there aren't any cars with 8 cylinders and 4 gears in our dataset; we would like this cell to remain empty instead.
The number of decimal digits is different in each cell; we would like it to be the same throughout the table (1 decimal digit).
Let us now address these issues.
label(mtcars$gear) <- "Number of<br/>Gears" rndr <- function(x, ...) { if (length(x) == 0) return("") round_pad(mean(x), 1) } ttt(mpg ~ gear | cyl, data=mtcars, lab="Number of Cylinders", render=rndr)
The way we addressed the first issue was to add a label to the variable gear
using the label()
function (one of the useful utility functions from the
table1
package). The two other issues were fixed by defining a function
rndr()
to do the rendering.
The ttt()
function allows the order of the formula
data data
arguments to
be switched, so that an alternative syntax using the magrittr
"pipe" operator
may be used:
mtcars %>% ttt(mpg ~ gear | cyl, lab="Number of Cylinders", render=rndr)
The facets allow you to "slide-&-dice" the data however you want. The column facet is optional; it can be omitted:
ttt(mpg ~ gear, data=mtcars, render=rndr)
The row facet is required by the formula syntax, but the "magic" value 1
may
be used to indicate no splitting by rows:
ttt(mpg ~ 1 | cyl, data=mtcars, lab="Number of Cylinders", render=rndr)
Both row and column facets may consist of more than one variable, joined
together by the symbol +
. Here is an example with 2 row facets:
label(mtcars$cyl) <- "Number of<br/>Cylinders" ttt(mpg ~ gear + cyl, data=mtcars, render=rndr)
And, similarly for 2 column facets:
ttt(mpg ~ 1 | gear + cyl, data=mtcars, lab="Number of Cylinders/Gears", render=rndr)
The order of the variables is obviously important, as they define a nesting
structure. If we think of the |
that separates the row and column facets as
the "values" (i.e. the central part of the table), then the order makes sense:
variables closer to the center (|
) are grouped within variables that are
farther away.
Just to demonstrate, here is a synthetic example of a large table where both row and column facets are nested 3 levels deep:
bigtable <- expand.grid( R1=LETTERS[1:3], R2=LETTERS[4:6], R3=LETTERS[7:9], C1=LETTERS[10:12], C2=LETTERS[13:15], C3=LETTERS[16:18]) bigtable$x <- 1:nrow(bigtable) ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable)
The render
function gives a lot of flexibility. For example, instead of the
mean mpg
, we can list the cars according to their gear/cylinder combination:
ttt(rownames(mtcars) ~ gear | cyl, data=mtcars, lab="Number of Cylinders", render=paste, collapse="<br/>")
Note that additional arguments can be passed to the render
function through ...
(as in this case, collapse
).
Furthermore, a render
function can return more than one value in a named
vector. In this case, the argument expand.along
also comes into play. It
determines how the multiple values are laid out, either vertically (along
rows), or the horizontally (along columns).
For example, here we define a function that computes both the mean and
standard deviation (SD), each to 3 significant digits, and apply it to the response
variable in the OrchardSprays
dataset (i.e., decrease
) according to treatment:
rndr.meansd <- function(x) signif_pad(c(Mean=mean(x), SD=sd(x)), digits=3) ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd)
The default is expand.along="rows"
, which produces the result above. As we
can see, the table contains an additional column with the values "Mean" and
"SD", and each value is displayed in its own row. By default, the extra column
is labeled "Statistic", but to change the label to "Blah" we can specify a
named vector as follows:
ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd, expand.along=c(Blah="rows"))
The other option is expand.along="columns"
, which produces this result:
ttt(decrease ~ treatment, data=OrchardSprays, render=rndr.meansd, expand.along="columns")
Now "Mean" and "SD" each have their own column.
A caption and one or more footnotes can be added to the table by specifying string values
to the caption
and footnote
arguments, respectively:
ttt(decrease ~ 1 | treatment, data=OrchardSprays, render=rndr.meansd, lab="Treatment", caption="Mean and SD of Decrease by Treatment", footnote=c("Data: OrchardSprays", "Comment: <code>ttt</code> is cool!"))
There's not much more to say about this.
The appearance of the tables produced by ttt
can be changed in 2 ways: using
themes, or custom styling. Themes are easier, but don't give much flexibility.
For fine-level control, custom styling must be used.
The ttt
package comes with 2 themes at this time: the default
theme that
has been used throughout this vignette so far, and the booktabs
theme. (More
themes may be added later.) Selecting the theme can be done using the
ttt.theme
global option:
options(ttt.theme="booktabs") # Select the "booktabs" theme
If we select the booktabs
theme, our large table looks like this:
ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable)
css <- readLines(system.file(package="ttt", "ttt_booktabs_1.0/ttt_booktabs.css")) css <- gsub(".Rttt ", ".Rttt-booktabs-demo ", css, fixed=TRUE) ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable, topclass="Rttt-booktabs-demo", css=css)
Note that a theme can only apply to a whole document; it is not possible to selectively style different tables within the same document differently using different themes, as we appear to have done here (but it can be done with custom styling, which is how it was done).
options(ttt.theme="default") # Change back to the "default" theme
As mentioned in the introduction, changing the table's appearance is
accomplished using CSS. In order to make this possible, ttt
places "hooks" in
the table in the form of class
attributes on various HTML elements.
The first thing to know is that the whole table is enclosed in a
<div class="Rttt">
element. The allows specific formatting to be applied to tables
output by ttt
without interfering with other tables in the same document.
The next thing is that all row labels have the class Rttt-rl
, and all column
labels have the class Rttt-cl
. Furthermore, there are different classes for
each level or nesting: Rttt-rl-lvl1
for the first (i.e. innermost) level or
row labels, Rttt-rl-lvl2
for the second level, and so on, and similarly for
the column labels with cl
instead of rl
.
Finally, it is possible to add a class
attribute or id
attribute to the
whole table, so that it may be targetted with specific CSS selectors. We can
also pass CSS code directly to the ttt()
function to be included with the
table.
For example, can can specify that our table has the ID bigtable
, and then
give it a particular (and particularly weird) style:
css <- ' #bigtable { font-family: "Lucida Console", Monaco, monospace; } #bigtable td { background-color: #eee; } #bigtable th { color: blue; background-color: lightblue; } #bigtable th, #bigtable td { border: 2px dashed orange; } #bigtable .Rttt-rl { background-color: #fff; font-style: italic; font-weight: bold; } #bigtable .Rttt-rl-lvl1 { font-size: 12pt; color: pink; background-color: yellow; } #bigtable .Rttt-rl-lvl2 { font-size: 14pt; color: green; } #bigtable .Rttt-rl-lvl3 { font-size: 18pt; color: red; } ' ttt(x ~ R3 + R2 + R1 | C1 + C2 + C3, data=bigtable, id="bigtable", css=css)
A render
function can actually add a html.class
attribute to its return
value. The value of this attribute will be assigned to the resulting HTML
element's class
attribute, allowing you to target that element with specific
formatting.
For example, suppose we have a data.frame
that contains some numeric values,
and we want to put them in a table:
dat <- expand.grid(row=LETTERS[1:5], column=LETTERS[1:5]) dat$value <- rnorm(nrow(dat)) ttt(value ~ row | column, data=dat, render=round_pad, digits=2)
Furthermore, suppose we want the cells that contain negative values to be red,
and those that contain positive values to be green. (Note: this can actually
be easily accomplished using JavaScript, but that's not the point of this
example). Let's define a render
function for this:
rndr <- function(x, ...) { y <- round_pad(x, 2) attr(y, "html.class") <- ifelse(x < 0, "neg", "pos") y }
(since there are no zeros in this example, the code here cheats and treats zero as positive; if you don't like this, think of it as non-negative).
The render
function sets the desired class on the elements. Now, we can add
some CSS code to obtain the desired colors:
```{css, echo=TRUE} .neg { color: #990000; background-color: #ff000030; } .pos { color: #007700; background-color: #00ff0030; }
Finally, we generate the table: ```r ttt(value ~ row | column, data=dat, render=rndr)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.