knitr::opts_chunk$set( collapse = TRUE, comment = "#>", tidy = TRUE ) # to knit "child" Rmd files knitr::opts_knit$set(root.dir = "../") library(formatdown) library(data.table) library(knitr) options( datatable.print.nrows = 15, datatable.print.topn = 3, datatable.print.class = TRUE )
#| echo: false formatdown_options(size = "small") prefix_1 <- c("peta", "tera", "giga", "mega", "kilo") symbol_1 <- c("P", "T", "G", "M", "k") symbol_1 <- format_text(symbol_1, face = "italic") x <- 10^seq(from = 15, to = 3, by = -3) value_1 <- format_numbers(x, digits = 1, format = "engr", omit_power = NULL) value_1 <- sub("1 \\\\times ", "", value_1) prefix_2 <- c("milli", "micro", "nano", "pico", "femto") symbol_2 <- c("m", "\\mu", "n", "p", "f") symbol_2 <- format_text(symbol_2, face = "italic") x <- 10^seq(from = -3, to = -15, by = -3) value_2 <- format_numbers(x, 1, omit_power = NULL) value_2 <- sub("1 \\\\times ", "", value_2) DT <- data.table(prefix_1, symbol_1, value_1, prefix_2, symbol_2, value_2) knitr::kable(DT, align = "lcllcl", col.names = rep(c("Prefix", "Symbol", "Value"), 2) )
The rationale for the formatdown package is formatting numbers in power-of-ten notation in inline R code or tabulated columns of data frames. Other features of the package provide tools for typesetting non-power-of-ten columns to match. In this vignette, we discuss the primary formatting function format_numbers()
and its convenience wrappers for scientific, engineering, and decimal notation.
Notation to represent large and small numbers depends on the mode of communication. In a computer script, for example, we might encode the Avogadro constant as N_A = 6.0221*10^23
. The asterisk (*) and caret (\^) in this expression, however, communicate instructions to a computer, not syntactical mathematics. And while scientific E-notation (6.0221E+23
) has currency in some discourse communities, power-of-ten notation, e.g., $\small N_A = 6.0221 \times 10^{23}$, is the conventional format for professional technical communication.
Power-of-ten notation is expressed,
$$ \small a \times 10^n, $$
where $\small a$ is the coefficient in decimal form and the exponent $\small n$ is an integer. Two formats are in common use [@Chase:2021, 63--67]:
scientific: $\small 1\leq{|a|} < 10$, e.g., $\small N_A =$ r format_sci(6.0221E+23, 5)
.
engineering: $\small 1\leq{|a|} < 1000$ and $\small n$ is a multiple of 3, e.g., $\small N_A =$ r format_engr(6.0221E+23, 5)
.
The utility of the engineering form follows from the SI prefixes for physical units such as "mega-", "kilo-", "milli-", etc., corresponding to powers of 10 that are integer multiples of three.
Notes on syntax. Programming symbols are not necessarily mathematical symbols:
asterisk (*). Programming symbol for multiplication, e.g., x = a * b
or y = a * (b + c)
. In the grammar of mathematics, multiplication is indicated by the symbol ($\times$) when needed.
caret (\^). Programming symbol for exponentiation, e.g., x = y^2
or z = 10^-3
. In the grammar of mathematics, exponents are typeset as superscripts, e.g., $\small x = y^2$ or $\small z = 10^{-3}$.
multiplication ($\small\times$). Mathematical symbol for multiplication, not to be confused with the letter "x". Generally omitted when the meaning is clear, e.g., $\small x=ab$ or $\small y=a(b+c)$, but conventionally included in power-of-ten notation, e.g., $\small 6.0221 \times 10^{23}$. When a comma is used as the decimal marker, multiplication may be indicated by the half-high dot, e.g., $\small 6,0221 \cdot 10^{23}$.
Decimal subsets. In a vector of numbers formatted in power-of-ten form, the decimal form may be preferred for any subset of values with exponents near zero, e.g., $\small n \in {-1, 0, 1, 2}$.
#| echo: false x <- 3.12 * 10^seq(-3, 4) sci <- format_sci(x, 3, omit_power = NA) sci_omit <- format_sci(x, 3) DT <- data.table(sci, sci_omit) knitr::kable(DT, align = "r", caption = "Decimal form may be preferred for a subset", col.names = c( "scientific notation", "subset in decimal form" ) )
Decimal columns. A table of numeric information can include columns formatted in both power-of-ten notation and decimal notation. For example, a table of atmospheric properties shown below has altitude in integer form, temperature in decimal form, and density in power-of-ten engineering notation (except for those values with exponents near zero).
#| echo: false DT <- atmos[alt < 150, .(alt, temp, dens)] DT$alt <- format_dcml(DT$alt, 2) DT$temp <- format_dcml(DT$temp, 5) DT$dens <- format_engr(DT$dens, 3) knitr::kable(DT, align = "r", caption = "Properties of the atmosphere", col.names = c( "Altitude (km)", "Temperature (K)", "Density (kg/m$^3$)" ) ) formatdown_options(reset = TRUE)
The purpose of the decimal format in formatdown is to match the font face and size of decimal columns to those of the power-of-ten columns. If no power-of-ten columns are used, of course, decimal columns can be displayed as-is or formatted using other R tools.
Packages. If you are writing your own script to follow along, we use the following packages in this vignette. Data frame operations are performed with data.table syntax. Some users may wish to translate the examples to use base R or dplyr syntax.
#| echo: true #| eval: false library("formatdown") library("data.table") library("knitr")
We format numbers as inline math expressions delimited by $ ... $
or the optional \( ... \)
. For example, the Avogadro constant is marked up as
$6.0221 \times 10^{23}$,
where the \times
macro creates the multiplication symbol ($\small\times$). This math markup, as an inline equation in an R markdown document, renders as: $\small 6.0221 \times 10^{23}$. To program the markup, however, we enclose it in quote marks as a character string, that is,
"$6.0221 \\times 10^{23}$",
which requires us to "escape" the backslash in \times
by adding an extra backslash. When the optional font size argument is assigned, formatdown adds a LaTeX-style sizing macro such as \small
or \large
, for example,
"$\\small 6.0221 \\times 10^{23}$",
where again the markup includes an extra backslash.
format_sci()
{#format_sci}Converts numbers to character strings in power-of-ten form,
"$a \\times 10^{n}$"
where $\small a$ is the coefficient and $\small n$ is the exponent. format_sci()
is a wrapper for the more general function format_numbers()
. For a subset of values with exponents near zero, e.g., $\small n \in {-1, 0, 1, 2}$, the output is in decimal form,
"$a$"
Usage.
format_sci(x, digits = 4, ..., omit_power = c(-1, 2), set_power = NULL, delim = formatdown_options("delim"), size = formatdown_options("size"), decimal_mark = formatdown_options("decimal_mark"), small_mark = formatdown_options("small_mark"), small_interval = formatdown_options("small_interval"), whitespace = formatdown_options("whitespace"))
...
) must be named. formatdown_options()
can be reset by the user locally in a function call or globally using formatdown_options()
. Examples. These early examples are shown with default arguments. Arguments are explored more fully starting with Numeric input section.
# 1. Avogadro constant L <- 6.0221E+23 format_sci(L) # 2. Elementary charge e <- 1.602176634e-19 format_sci(e)
Examples 1 and 2 (in inline code chunks) render as,
r format_sci(L, size = "small")
$\small \mathit{mol}^{-1}$.r format_sci(e, size = "small")
$\small C$. format_engr()
{#format_engr}Similar to format_sci()
except using engineering notation, i.e., exponents are multiples of 3.
Usage.
format_engr(x, digits = 4, ..., omit_power = c(-1, 2), set_power = NULL, delim = formatdown_options("delim"), size = formatdown_options("size"), decimal_mark = formatdown_options("decimal_mark"), small_mark = formatdown_options("small_mark"), small_interval = formatdown_options("small_interval"), whitespace = formatdown_options("whitespace"))
Examples. (with default arguments)
# 3. Avogadro constant format_engr(L) # 4. Elementary charge format_engr(e)
Examples 3 and 4 render as,
r format_engr(L, size = "small")
$\small \mathit{mol}^{-1}$.r format_engr(e, size = "small")
$\small C$. format_dcml()
{#format_dcml}A wrapper for the more general function format_numbers()
; converts numbers to character strings in decimal form,
"$a$"
where $\small a$ is the decimal value.
Usage.
format_dcml(x, digits = 4, ..., size = formatdown_options("size"), delim = formatdown_options("delim"), decimal_mark = formatdown_options("decimal_mark"), big_mark = formatdown_options("big_mark"), big_interval = formatdown_options("big_interval"), small_mark = formatdown_options("small_mark"), small_interval = formatdown_options("small_interval"), whitespace = formatdown_options("whitespace"))
Examples. (with default arguments)
# 5. Speed of light in a vacuum c <- 299792458 format_dcml(c) # 6. Molar gas constant R <- 8.31446261815324 format_dcml(R)
Examples 5 and 6 render as,
r format_dcml(c, size = "small")
$\small\mathit{m/s}$.r format_dcml(R, size = "small")
$\small\mathit{J}\cdot\mathit{K}^{-1}\mathit{mol}^{-1}$. format_numbers()
{#format_numbers}format_numbers()
is the general-purpose formatting function called by format_sci()
, format_engr()
, and format_dcml()
. The general function can be used instead of the convenience functions simply by setting its format
argument to "sci"
, "engr"
(default), or "dcml"
.
Usage.
format_numbers(x, digits = 4, format = "engr", ..., omit_power = c(-1, 2), set_power = NULL, delim = formatdown_options("delim"), size = formatdown_options("size"), decimal_mark = formatdown_options("decimal_mark"), big_mark = formatdown_options("big_mark"), small_mark = formatdown_options("small_mark"), big_interval = formatdown_options("big_interval"), small_interval = formatdown_options("small_interval"), whitespace = formatdown_options("whitespace"))
Examples. Reproducing some of the earlier examples using format_numbers()
.
# 7. Scientific format_numbers(L, format = "sci") # 8. Engineering format_numbers(e, format = "engr") # 9. Decimal format_numbers(R, format = "dcml")
Examples 7--9 render as,
r format_numbers(L, format = "sci", size = "small")
$\small \mathit{mol}^{-1}$. r format_numbers(e, format = "engr", size = "small")
$\small C$. r format_dcml(R, size = "small")
$\small\mathit{J}\cdot\mathit{K}^{-1}\mathit{mol}^{-1}$. This section begins our detailed discussion of arguments.
Scalar input. Generally used with inline R code. For example, the following R markdown sentence, which includes some math markup and some inline R code,
The Avogadro constant is $L = $ `r format_sci(L)` $\mathit{mol}^{-1}$.
renders as: The Avogadro constant is $\small L =$ r format_sci(L, size = "small")
$\small\mathit{mol}^{-1}$.
Vector. A vector of numbers (or a data frame column) is marked up as follows,
# 10. Sample vector x <- c(2.3333e-5, 3.4444e-4, 5.2222e-2, 6.3333e-1, 8.1111e+1, 9.2222e+2, 2.4444e+4, 3.1111e+5, 4.2222e+6) format_engr(x)
In a table, the output renders as,
#| echo: false formatdown_options(size = "small")
DT <- data.table(x, format_engr(x)) knitr::kable(DT, align = "r", col.names = c("Unformatted", "Engr notation"), caption = "Example 10." )
#| echo: false formatdown_options(reset = TRUE)
For values with exponents $\small n\in{-1, 0, 1, 2}$, the default format is decimal; see Excluding exponents.
The units R package (website: Measurement Units for R) provides measurement units for R vectors, converting vectors of class "numeric" to class "units" [@Pebesma+Mailund+Hiebert:2016:units]. For example
# Number x <- 10320 class(x) # Convert to units class units(x) <- "m" x class(x) # Operations are reflected in the values and its units y <- x^2 y # Unit conversion is supported z <- y z units(z) <- "ft^2" z
If an input argument to format_numbers()
(or its convenience functions) is of class "units", formatdown attempts to extract the units character string, format the number in the expected way, and append a units character string to the result. For example,
# 11. Units-class inputs format_sci(x) format_sci(y) format_sci(z)
Example 11 renders as,
r format_sci(x, size = "small")
r format_sci(y, size = "small")
r format_sci(z, size = "small")
More complicated units can be managed. For example the Newtonian gravitational constant could be formatted as follows, where the exponents in the units definition are given in "implicit" form, that is, where $\small m^3 kg^{-1} s^{-2}$ is represented by "m3 kg-1 s-2"
.
G <- 6.6743e-11 units(G) <- "m3 kg-1 s-2" format_sci(G)
Applying a similar procedure to several physical constants and collecting the results in a data frame yields,
#| echo: false c <- 299792458 units(c) <- "m/s" format_c <- format_sci(c, size = "small") h <- 6.62607015e-34 units(h) <- "J/Hz" format_h <- format_sci(h, size = "small") mu <- 1.25663706212e-6 units(mu) <- "N A-2" format_mu <- format_sci(mu, size = "small") G <- 6.67430e-11 units(G) <- "m3 kg-1 s-2" format_G <- format_sci(G, size = "small") ke <- 8.9875517923e+9 units(ke) <- "N m2 C-2" format_ke <- format_sci(ke, size = "small") sigma <- 5.67037442e-8 units(sigma) <- "W m-2 K-4" format_sigma <- format_sci(sigma, size = "small") symbol <- c( "$\\small c$", "$\\small h$", "$\\small \\mu_0$", "$\\small G$", "$\\small k_e$", "$\\small \\sigma$" ) quantity <- format_text( c( "speed of light in a vacuum", "Planck constant", "vacuum magnetic permeability", "Newtonian gravitational constant", "Coulomb constant", "Stefan-Boltzmann constant" ), size = "small" ) formatted_value <- c( format_c, format_h, format_mu, format_G, format_ke, format_sigma ) DT <- data.table(symbol, quantity, formatted_value) knitr::kable(DT)
This table is constructed simply to illustrate how formatdown returns a variety of units-class values with units appended to the formatted number.
In a typical application, however, the numbers in a column have the same physical units and are formatted as a vector. For example,
#| echo: false formatdown_options(size = "small")
# Example 12 DT <- air_meas[, .(temp, pres, sp_gas, dens)] # Examine data DT[] # Assign units units(DT$temp) <- "K" units(DT$pres) <- "Pa" units(DT$sp_gas) <- "J kg-1 K-1" units(DT$dens) <- "kg m-3" # Format one column at a time DT$temp <- format_dcml(DT$temp) DT$pres <- format_engr(DT$pres) # Or format multiple columns in one pass cols <- c("sp_gas", "dens") DT[, (cols) := lapply(.SD, format_dcml), .SDcols = cols] knitr::kable(DT, align = "r", caption = "Example 12.")
#| echo: false formatdown_options(reset = TRUE)
Significant digits are applied to the input argument using the base R function signif()
before additional formatting is applied. For example,
# 13. Significant digits format_sci(e, digits = 5) format_sci(e, digits = 4) format_sci(e, digits = 3)
Example 13 renders as,
r format_sci(e, digits = 5, size = "small")
r format_sci(e, digits = 4, size = "small")
r format_sci(e, digits = 3, size = "small")
The format
argument appears in format_numbers()
only. The default is "engr". The format is preset in the format_dcml()
, format_engr()
, and format_sci()
convenience functions.
To compare the effects across many orders of magnitude, we display the same vector in different formats.
#| echo: false formatdown_options(size = "small")
# 14. Comparing formats x <- c(2.3333e-5, 3.4444e-4, 5.2222e-2, 6.3333e-1, 8.1111e+1, 9.2222e+2, 2.4444e+4, 3.1111e+5, 4.2222e+6) dcml <- format_numbers(x, 3, format = "dcml") sci <- format_numbers(x, 3, format = "sci") engr <- format_numbers(x, 3, format = "engr") DT <- data.table(dcml, sci, engr) knitr::kable(DT, align = "r", col.names = c("decimal", "scientific", "engineering"), caption = "Example 14." )
#| echo: false formatdown_options(reset = TRUE)
The values displayed without powers-of-ten notation in the scientific and engineering columns are determined by the omit_power
argument discussed next.
When specifying power-of-ten notation, numbers with exponents lying within the range of the omit_power
argument are typeset in decimal form. In engineering notation, the exponent is checked for lying within the range before and after the conversion to multiple-of-3 exponents.
To illustrate, we compare two omit_power
settings in both scientific and engineering formats. In some columns, we set omit_power = NULL
, which imposes power-of-ten notation on the entire vector.
#| echo: false formatdown_options(size = "small")
# 15. Effects of omit_power DT <- atmos[3:12, .(pres)] DT[, sci_all := format_sci(pres, 3, omit_power = NULL)] DT[, sci_omit := format_sci(pres, 3, omit_power = c(-1, 0))] DT[, engr_all := format_engr(pres, 3, omit_power = NULL)] DT[, engr_omit := format_engr(pres, 3, omit_power = c(-1, 0))] knitr::kable(DT, align = "r", col.names = c( "Unformatted", "all scientific", "scientific w/ omit", "all engineering", "engineering w/ omit" ), caption = "Example 15." )
Comments:
omit_power
range $\small {-1, 0}$. If a single value is assigned, e.g., omit_power = 0
, the argument is interpreted as c(0, 0)
.
# 16. Omit power used for a single value of exponent DT <- atmos[3:12, .(pres)] DT[, sci_all := format_sci(pres, 3, omit_power = NULL)] DT[, sci_omit := format_sci(pres, 3, omit_power = 0)] DT[, engr_all := format_engr(pres, 3, omit_power = NULL)] DT[, engr_omit := format_engr(pres, 3, omit_power = 0)] knitr::kable(DT, align = "r", col.names = c( "Unformatted", "all scientific", "scientific w/ omit", "all engineering", "engineering w/ omit" ), caption = "Example 16." )
#| echo: false formatdown_options(reset = TRUE)
Setting omit_power = c(-Inf, Inf)
yields the same decimal result as format = "dcml"
and overrides any other format
setting. For example,
# 17. Different ways of creating a decimal format (y <- 6.78e-3) (p <- format_numbers(y, 3, "sci", omit_power = c(-Inf, Inf))) (q <- format_numbers(y, 3, "dcml")) (r <- format_dcml(y, 3)) all.equal(p, q) all.equal(p, r)
Example 17 (all cases) renders as,
r format_numbers(y, 3, "dcml", size = "small")
When values in a table column span only a few orders of magnitude, an audience is sometimes better served by setting the notation to a constant power of ten. For example, here we show numbers in scientific format and compare to columns in which the exponents are set to fixed values. Assigning a value to set_power
overrides omit_power
and format
.
#| echo: false formatdown_options(size = "small")
# 18. set_power argument DT <- atmos[alt <= 40, .(alt, pres, dens)] DT[, sci_pres := format_sci(pres, 3, omit_power = c(-1, 2))] DT[, set_pres := format_sci(pres, 3, omit_power = c(-1, 2), set_power = 3)] DT[, sci_dens := format_engr(dens, 3, omit_power = c(-1, 2))] DT[, set_dens := format_engr(dens, 3, omit_power = c(-1, 2), set_power = -2)] DT[, pres := NULL] DT[, dens := NULL] knitr::kable(DT, align = "r", col.names = c("Altitude (km)", "Pressure (Pa)", "with set_power", "Density (kg/m$^{3}$)", "with set_power"), caption = "Example 18." )
#| echo: false formatdown_options(reset = TRUE)
Arguments assigned using formatdown_options()
are described in the Global settings article.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.