\definecolor{MidnightBlue}{HTML}{2E74B5}
library(knitr) opts_chunk$set(comment = NA, prompt = FALSE, cache = FALSE, echo = TRUE, results = 'asis') library(summarytools)
This document shows how to customize the content of the Stats / Values column
in \color{MidnightBlue}{data frame
summaries}
generated using summarytools::dfSummary()
. This feature was introduced in
version 1.0.0 of
summarytoolsas a response
to a feature request that came up several times, in a
\color{MidnightBlue}{form}
or
\color{MidnightBlue}{another}.
Two new options were created: dfSummary.custom.1
and dfSummary.custom.2
. The
first one has a predefined value -- it is the one that makes up the fourth row
of the cell (showing IQR and CV). The second one is set to NA
by default. If
both options are defined (non-NA
), the cell will now show 5 lines rather than
4, provided there are no additional line feed occurring within the cell, be it
by design or by an "overflow" of one of the custom lines.
We'll use the first column of iris to show results as they are before making any changes. But first, a little bit of setting-up:
\vspace{8pt}
library(summarytools) st_options(plain.ascii = FALSE, headings = FALSE, footnote = NA, round.digits = 1, style = "rmarkdown", # For freq(), descr(), & ctable() dfSummary.varnumbers = FALSE, dfSummary.valid.col = FALSE, dfSummary.silent = TRUE, dfSummary.style = "grid", tmp.img.dir = "img")
Now let's show the default output: \vspace{8pt}
iris_subset <- iris[1] dfSummary(iris_subset, graph.magnif = .45)
Setting dfSummary.custom.1
to NA
will remove the last line in Stats /
Values: \vspace{8pt}
st_options(dfSummary.custom.1 = NA) dfSummary(iris_subset, graph.magnif = .35) # Adjust graph size accordingly
Here we'll create the expression needed to generate new statistics, Q1 & Q3.
The expression is evaluated while looping on column data, and we need to refer
to that data. The variable name to use is, well, column_data
. Another variable
you can use is round.digits
(we've set to 1
in the setup chunk on page 1).
\vspace{8pt}
st_options( dfSummary.custom.1 = expression( paste( "Q1 - Q3 :", round( quantile(column_data, probs = .25, type = 2, names = FALSE, na.rm = TRUE), digits = round.digits ), " - ", round( quantile(column_data, probs = .75, type = 2, names = FALSE, na.rm = TRUE), digits = round.digits ) ) ) ) dfSummary(iris_subset, graph.magnif = .45)
It is always possible to reset the value of dfSummary.custom.1
to its initial
value by using
st_options(dfSummary.custom.1 = "default")
But let's make things a bit more interesting by actually showing IQR (CV)
under Q1 & Q3. For this, we will use the default expression for
dfSummary.custom.1
to define dfSummary.custom.2
:
\vspace{8pt}
st_options( dfSummary.custom.2 = expression( paste( paste0( trs("iqr"), " (", trs("cv"), ") : " ), format_number( IQR(column_data, na.rm = TRUE), round.digits ), " (", format_number( sd(column_data, na.rm = TRUE) / mean(column_data, na.rm = TRUE), round.digits ), ")", collapse = "", sep = "" ) ) ) dfSummary(iris[3:5], graph.magnif = .65) # Again, graph size adjusted
Don't forget to set na.rm = TRUE
whenever necessary (most base R statistics
use it with FALSE
as default).
You may have noticed that instead of round()
, we used format_number()
, which
is a summarytools internal function. It applies not only rounding, but all
relevant formatting attributes as well (nsmall, decimal.mark, big.mark,
scientific,and so on).
As shown in the
\color{MidnightBlue}{Introduction to summarytools}
vignette, the following bit of code can be used to retrieve and format the
expressions stored in the custom options. To achieve good results, the chunk
option results='markup'
was used for this chunk.
st_options(dfSummary.custom.1 = "default") formatR::tidy_source( text = deparse(st_options("dfSummary.custom.1")), indent = 2, args.newline = TRUE )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.