summarytools::st_css(main = TRUE, global = TRUE)
library(knitr)
opts_chunk$set(comment = NA, prompt = FALSE, cache = FALSE, echo = TRUE, results = 'asis')
library(summarytools)
st_options(bootstrap.css     = FALSE,       # Already part of the theme so no need for it
           plain.ascii       = FALSE,       # One of the essential settings
           style             = "rmarkdown", # Idem.
           dfSummary.silent  = TRUE,        # Suppresses messages about temporary files
           footnote          = NA,          # Keeping the results minimalist
           descr.silent      = TRUE,        # To avoid messages when building / checking
           subtitle.emphasis = FALSE)       # For the vignette theme, this gives better results.
                                            # For other themes, using TRUE might be preferable.
This document mainly contains examples showing how best to use summarytools in R Markdown documents. For a more in-depth view of the package's features, please see vignette("introduction", "summarytools"); the online version can be found here.
Every time we display summarytools objects with print(), view(), or stview(), we pick -- explicitly or not -- one of several display methods. Possible display methods are: pander, render, viewer, and browser. The method is one of the parameters of print.summarytools() and view() (alias: stview()).
Since methods viewer and browser are mostly meant for interactive work and rely on the same underlying code as render, we will assume for the purpose of this document that there are really only two methods: pander and render.
The pander method is used by default when results are automatically printed to the console, or when we use print() without an explicit method argument.
The style parameter is communicated to pander (see ?pander::pander or visit its GitHub page to learn more about this very useful package).
 | When we use any of the *viewer*, *browser*, or *render* methods, the package uses **htmltools** to generate results; any specified *styles* are thus ignored. |
Available styles are the ones supported by pander. Those relevant here are rmarkdown, grid (suited to dfSummary()), and multiline (usable with dfSummary() if you use ascii graphs only).
Always set results='asis', either explicitly on a chunk-by-chunk basis or by including opts_chunk$set(results = 'asis') in your setup chunk.
Also, don't forget to specify plain.ascii = FALSE in all function calls using the pander method. It is advised to set this option, as well as the style option, in the setup chunk:
st_options(plain.ascii = FALSE, style = "rmarkdown")
 | If you get repeated, unhelpful warnings, use chunk options `message = FALSE` and/or `warning = FALSE`. Another option is to use the argument `silent = TRUE` to the `print()` method or `view()` / `stview()` functions. See `?st_options` to set this globally for individual functions. |
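As a minimal sketch of the two approaches mentioned in the note above, the following sets the knitr chunk options globally and, separately, silences a single call through the silent argument of print(). It assumes knitr and summarytools are installed; the choice of descr(tobacco) is only for illustration, since that call normally reports that non-numerical variables were ignored.

```r
library(knitr)
library(summarytools)

# Suppress messages and warnings for every chunk that follows
opts_chunk$set(message = FALSE, warning = FALSE)

# Or silence one call only, through the silent argument of print()
print(descr(tobacco), method = "render", silent = TRUE)
```

Setting the chunk options hides all messages, including ones unrelated to summarytools, so the per-call silent = TRUE form may be preferable when only one table is noisy.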
The following table indicates which method / style is better suited for each summarytools function in the context of R Markdown documents:
| Function    | render method | pander method | pander style |
|:------------|:-------------:|:-------------:|:-------------|
| freq()      |       ✓       |       ✓       | rmarkdown    |
| ctable()    |       ✓       |  Sub-optimal  | rmarkdown    |
| descr()     |       ✓       |       ✓       | rmarkdown    |
| dfSummary() |       ✓       |       ✓       | grid         |
Recommended Style When Using pander method
For freq(), descr(), and ctable(), rmarkdown style is recommended.
For dfSummary(), grid is recommended. Note that multiline can also be used, but only ascii graphs will be displayed.
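One way to apply both recommendations at once, sketched here under the assumption that the function-specific option dfSummary.style (used in the PDF setup chunk of this document) takes precedence over the general style option for dfSummary() calls:

```r
library(summarytools)

# General pander style, used by freq(), descr(), and ctable()
st_options(plain.ascii = FALSE, style = "rmarkdown")

# Function-specific style, used by dfSummary() only
st_options(dfSummary.style = "grid")
```

With these two calls in the setup chunk, the style argument can usually be left out of individual function calls.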
Starting with freq(), we'll now review the recommended methods and styles to get satisfying results in R Markdown documents.
freq() is best used with method "pander" (default), style = "rmarkdown"; html rendering is also possible.
With method = "pander", style = "rmarkdown" is the easy winner. Since "pander" is the default method, you can usually omit the call to print(). But to make things as clear as possible, we'll include it here.
print(freq(tobacco$gender, plain.ascii = FALSE, style = "rmarkdown"), method = "pander")
There are rarely any problems when using the render method to display freq() results.
print(freq(tobacco$gender), method = "render")
If you find the table is too large, you can use table.classes = "st-small":
print(descr(tobacco), method = "render", table.classes = "st-small")
Tables with multi-row headings are not fully supported in markdown (yet), but the result is close to acceptable. This, however, is not true for all themes. That is why the rendering method is preferred.
ctable(tobacco$gender, tobacco$smoker, plain.ascii = FALSE, style = "rmarkdown")
For best results, use this method.
print(ctable(tobacco$gender, tobacco$smoker), method = "render")
descr() gives good results with both style = "rmarkdown" and html rendering.
descr(tobacco, plain.ascii = FALSE, style = "rmarkdown")
We'll use table.classes = "st-small" to show how it affects the table's size, compared to the freq() table rendered earlier.
We'll also use message = FALSE as a chunk option to avoid the message saying that non-numerical variables have been ignored.
print(descr(tobacco), method = "render", table.classes = "st-small")
To get optimal results, whichever method you choose, it is always best to omit at least one, and if possible two, columns from the output. Also, choose the value of the graph.magnif parameter carefully.
Don't forget to specify plain.ascii = FALSE (or set it as a global option with st_options(plain.ascii = FALSE)), or you won't get good results.
(Note: The following output is an image (screenshot). This is because CRAN doesn't allow writing in "/tmp" or any directory other than R's temp directory, which would pose problems in terms of column widths. The introductory vignette explains this issue in more detail.)
dfSummary(tobacco, plain.ascii = FALSE, style = "grid", graph.magnif = 0.85, varnumbers = FALSE, valid.col = FALSE, tmp.img.dir = "/tmp")
This method works really well, and not having to specify the tmp.img.dir parameter is a plus.
print(dfSummary(tobacco, varnumbers = FALSE, valid.col = FALSE, graph.magnif = 0.85), method = "render")
For data frames containing numerous variables, we can use the max.tbl.height argument to wrap the results in a scrollable window having the specified height, in pixels.
print(dfSummary(tobacco, varnumbers = FALSE, valid.col = FALSE, graph.magnif = 0.85), max.tbl.height = 300, method = "render")
 | Some users reported getting repeated X11 warnings; those can easily be avoided by using the following chunk expression: `{r, results="asis", warning=FALSE}`. |
As explained in the introductory vignette, tb() can be used to convert summarytools objects created with freq() and descr() to simple tibbles, which packages specialized in table formatting will be able to process. This is particularly helpful with stby objects:
library(kableExtra)
library(magrittr)
stby(iris, iris$Species, descr, stats = "fivenum") %>%
  tb() %>%
  kable(format = "html", digits = 2) %>%
  collapse_rows(columns = 1, valign = "top")
Using tb(order = 3) flips the order of the grouping variable(s) and the reported variable(s):
stby(iris, iris$Species, descr, stats = "fivenum") %>%
  tb(order = 3) %>%
  kable(format = "html", digits = 2) %>%
  collapse_rows(columns = 1, valign = "top")
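The text above notes that tb() also accepts objects created with freq(). As a small sketch, assuming the tobacco data frame shipped with summarytools, a single frequency table can be converted in the same way; the variable name gender_freq is only illustrative.

```r
library(summarytools)

# Convert a frequency table for one factor into a plain tibble
gender_freq <- tb(freq(tobacco$gender))
gender_freq
```

The result is an ordinary data frame, so it can be passed to kable() or any other table-formatting function.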
Here is a recipe for including fully formatted data frame summaries in pdf documents. There is some work involved, but carefully following the instructions given here should give the expected results.
There are basically two parts to this: first, you must create a preamble tex file. Second, you must indicate in the YAML section of your document where to find this file.
This is the \LaTeX content that needs to be included as preamble. You can either copy this into your own tex file, or use the file that is now included in summarytools (as of version 1.0), following the instructions provided below.
\usepackage{graphicx}
\usepackage[export]{adjustbox}
\usepackage{letltxmacro}
\LetLtxMacro{\OldIncludegraphics}{\includegraphics}
\renewcommand{\includegraphics}[2][]{\raisebox{0.5\height}%
  {\OldIncludegraphics[valign=t,#1]{#2}}}
If you choose to create a tex file from the above content, the name of the file is arbitrary -- you can use whatever name you want. Its location is also up to you. I suggest you put it in the same location as your Rmd file.
Along with the graph.magnif parameter for dfSummary(), you might need to adjust the 0.5 value used as raisebox parameter in the preamble.
Your document should start with a YAML header like this one:
---
title: "My PDF With Data Frame Summaries"
output:
  pdf_document:
    latex_engine: xelatex
    includes:
      in_header:
        - !expr system.file("includes/fig-valign.tex", package = "summarytools")
---
If you need to customize the content of the preamble, then your header will look something like this (assuming it is in the same directory as your Rmd document):
---
title: "My PDF With Data Frame Summaries"
output:
  pdf_document:
    latex_engine: xelatex
    includes:
      in_header: fig-valign-modified.tex
---
 | The *xelatex* engine option is not mandatory, but there are several advantages to it. I use it systematically and recommend you do the same. |
Here is an example setup chunk:
```r
library(summarytools)
st_options(
  plain.ascii            = FALSE,
  style                  = "rmarkdown",
  dfSummary.style        = "grid",
  dfSummary.valid.col    = FALSE,
  dfSummary.graph.magnif = .52,
  subtitle.emphasis      = FALSE,
  tmp.img.dir            = "/tmp"
)
```
And here is a chunk actually creating the summary:
```r
define_keywords(title.dfSummary = "Data Frame Summary in PDF Format")
dfSummary(tobacco)
```
Since we redefined the $\LaTeX$ command includegraphics, all images included using ![](some-image.png) will be impacted. In some cases, this could pose a problem. Eventually, we hope to find a more robust solution, without such side-effects. (If you are well versed in $\LaTeX$ and think you can solve this problem, please get in touch.)
This vignette uses theme rmarkdown::html_vignette. Its YAML section looks like this:
---
title: "Summarytools in R Markdown Documents"
author: "Dominic Comtois"
date: "`r Sys.Date()`"
output:
  html_document:
    fig_caption: false
    toc: true
    toc_depth: 1
    css: assets/vignette.css
vignette: >
  %\VignetteIndexEntry{Summarytools in R Markdown Documents}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\VignetteDepends{magrittr}
  %\VignetteDepends{kableExtra}
---
The vignette.css file is copied from the installed rmarkdown package's 'templates/html_vignette/resources' directory.
The following global options for knitr and summarytools have been set. Other options might also be useful to optimize content, but this is a good place to start from.
```r
library(knitr)
opts_chunk$set(comment=NA, prompt=FALSE, cache=FALSE, echo=TRUE, results='asis')
library(summarytools)
st_options(bootstrap.css     = FALSE,       # Already part of the theme
           plain.ascii       = FALSE,       # Essential setting for Rmd
           style             = "rmarkdown", # Essential setting for Rmd
           dfSummary.silent  = TRUE,        # Hides redundant messages
           footnote          = NA,          # Keeping the results minimal
           subtitle.emphasis = FALSE)       # For the vignette theme,
                                            # this gives better results.
                                            # For other themes, using
                                            # TRUE might be preferable.
```
Finally, summarytools CSS has been included in the following manner, before the setup chunk:
```r
summarytools::st_css(main = TRUE, global = TRUE)
```
This is by no means a definitive guide; depending on the themes you use, you could find that other settings yield better results. If you are looking to create a Word or a PDF document, you might want to try different combinations of options. If you find problems with the recommended settings or if you find better combinations, you are welcome to open an issue on GitHub to suggest modifications or make a pull request with your own improvements to this vignette.