fmt_chem: Format chemical formulas

View source: R/format_data.R

fmt_chemR Documentation

Format chemical formulas

Description

fmt_chem() lets you format chemical formulas or even chemical reactions in the table body. Often the input text will be in a common form representing single compounds (like "C2H4O", for acetaldehyde) but chemical reactions can be used (e.g., ⁠2CH3OH -> CH3OCH3 + H2O"⁠). So long as the text within the targeted cells conforms to gt's specialized chemistry notation, the appropriate conversions will occur. Details pertaining to chemistry notation can be found in the section entitled How to use gt's chemistry notation.

Usage

fmt_chem(data, columns = everything(), rows = everything())

Arguments

data

The gt table data object

⁠obj:<gt_tbl>⁠ // required

This is the gt table object that is commonly created through use of the gt() function.

columns

Columns to target

<column-targeting expression> // default: everything()

Can either be a series of column names provided in c(), a vector of column indices, or a select helper function (e.g. starts_with(), ends_with(), contains(), matches(), num_range() and everything()).

rows

Rows to target

<row-targeting expression> // default: everything()

In conjunction with columns, we can specify which of their rows should undergo formatting. The default everything() results in all rows in columns being formatted. Alternatively, we can supply a vector of row captions within c(), a vector of row indices, or a select helper function (e.g. starts_with(), ends_with(), contains(), matches(), num_range(), and everything()). We can also use expressions to filter down to the rows we need (e.g., ⁠[colname_1] > 100 & [colname_2] < 50⁠).

Value

An object of class gt_tbl.

How to use gt's chemistry notation

The chemistry notation involves a shorthand of writing chemical formulas and chemical reactions, if needed. It should feel familiar in its basic usage and the more advanced typesetting tries to limit the amount of syntax needed. It's always best to show examples on usage:

  • "CH3O2" and "(NH4)2S" will render with subscripted numerals

  • Charges can be expressed with terminating "+" or "-", as in "H+" and "[AgCl2]-"; if any charges involve the use of a number, the following incantations could be used: "CrO4^2-", "Fe^n+", "Y^99+", "Y^{99+}" (the final two forms produce equivalent output)

  • Stoichiometric values can be included with whole values prepending formulas (e.g., "2H2O2") or by setting them off with a space, like this: "2 H2O2", "0.5 H2O", "1/2 H2O", "(1/2) H2O"

  • Certain standalone, lowercase letters or combinations thereof will be automatically stylized to fit conventions; "NO_x" and "x Na(NH4)HPO4" will have italicized 'x' characters and you can always italicize letters by surrounding with "*" (as in "*n* H2O" or "*n*-C5H12")

  • Chemical isotopes can be rendered using either of these two constructions preceding an element: "^{227}_{90}Th" or "^227_90Th"; nuclides can be represented in a similar manner, here are two examples: "^{0}_{-1}n^{-}", "^0_-1n-"

  • Chemical reactions can use "+" signs and a variety of reaction arrows: (1) "A -> B", (2) "A <- B", (3) "A <-> B", (4) "A <--> B", (5) "A <=> B", (6) "A <=>> B", or (7) "A <<=> B"

  • Center dots (useful in addition compounds) can be added by using a single "." or "*" character, surrounded by spaces; here are two equivalent examples "KCr(SO4)2 . 12 H2O" and "KCr(SO4)2 * 12 H2O"

  • Single and double bonds can be shown by inserting a "-" or "=" between adjacent characters (i.e., these shouldn't be at the beginning or end of the markup); two examples: "C6H5-CHO", "CH3CH=CH2"

  • as with units notation, Greek letters can be inserted by surrounding the letter name with ":"; here's an example that describes the delta value of carbon-13: ":delta: ^13C"

Examples

Let's use the reactions dataset and create a new gt table. The table will be filtered down to only a few rows and columns. The column cmpd_formula contains chemical formulas and the formatting of those will be performed by fmt_chem(). Certain column labels with chemical names (o3_k298 and no3_k298) can be handled within cols_label() by using surrounding the text with "{{%"/"%}}".

reactions |>
  dplyr::filter(cmpd_type == "terminal monoalkene") |>
  dplyr::filter(grepl("^1-", cmpd_name)) |>
  dplyr::select(cmpd_name, cmpd_formula, ends_with("k298")) |>
  gt() |>
  tab_header(title = "Gas-phase reactions of selected terminal alkenes") |>
  tab_spanner(
    label = "Reaction Rate Constant at 298 K",
    columns = ends_with("k298")
  ) |>
  fmt_chem(columns = cmpd_formula) |>
  fmt_scientific() |>
  sub_missing() |>
  cols_label(
    cmpd_name = "Alkene",
    cmpd_formula = "Formula",
    OH_k298 = "OH",
    O3_k298 = "{{%O3%}}",
    NO3_k298 = "{{%NO3%}}",
    Cl_k298 = "Cl"
  ) |>
  opt_align_table_header(align = "left")
This image of a table was generated from the first code example in the `fmt_chem()` help file.

Taking just a few rows from the photolysis dataset, let's create a new gt table. The cmpd_formula and products columns both contain text in chemistry notation (the first has compounds, and the second column has the products of photolysis reactions). These columns will be formatted by fmt_chem(). The compound formulas will be merged with the compound names with cols_merge().

photolysis |>
  dplyr::filter(cmpd_name %in% c(
    "hydrogen peroxide", "nitrous acid",
    "nitric acid", "acetaldehyde",
    "methyl peroxide", "methyl nitrate",
    "ethyl nitrate", "isopropyl nitrate"
  )) |>
  dplyr::select(-c(l, m, n, quantum_yield, type)) |>
  gt() |>
  tab_header(title = "Photolysis pathways of selected VOCs") |>
  fmt_chem(columns = c(cmpd_formula, products)) |>
  cols_nanoplot(
    columns = sigma_298_cm2,
    columns_x_vals = wavelength_nm,
    expand_x = c(200, 400),
    new_col_name = "cross_section",
    new_col_label = "Absorption Cross Section",
    options = nanoplot_options(
      show_data_points = FALSE,
      data_line_stroke_width = 4,
      data_line_stroke_color = "black",
      show_data_area = FALSE
    )
  ) |>
  cols_merge(
    columns = c(cmpd_name, cmpd_formula),
    pattern = "{1}, {2}"
  ) |>
  cols_label(
    cmpd_name = "Compound",
    products = "Products"
  ) |>
  opt_align_table_header(align = "left")
This image of a table was generated from the second code example in the `fmt_chem()` help file.

fmt_chem() can handle the typesetting of nuclide notation. Let's take a subset of columns and rows from the nuclides dataset and make a new gt table. The contents of the nuclide column contains isotopes of hydrogen and carbon and this is placed in the table stub. Using fmt_chem() makes it so that the subscripted and superscripted values are properly formatted to the convention of formatting nuclides.

nuclides |>
  dplyr::filter(element %in% c("H", "C")) |>
  dplyr::mutate(nuclide = gsub("[0-9]+$", "", nuclide)) |>
  dplyr::select(nuclide, atomic_mass, half_life, decay_1, is_stable) |>
  gt(rowname_col = "nuclide") |>
  tab_header(title = "Isotopes of Hydrogen and Carbon") |>
  tab_stubhead(label = "Isotope") |>
  fmt_chem(columns = nuclide) |>
  fmt_scientific(columns = half_life) |>
  fmt_number(
    columns = atomic_mass,
    decimals = 4,
    scale_by = 1 / 1e6
  ) |>
  sub_missing(
    columns = half_life,
    rows = is_stable,
    missing_text = md("**STABLE**")
  ) |>
  sub_missing(columns = half_life, rows = !is_stable) |>
  sub_missing(columns = decay_1) |>
  data_color(
    columns = decay_1,
    target_columns = c(atomic_mass, half_life, decay_1),
    palette = "LaCroixColoR::PassionFruit",
    na_color = "white"
  ) |>
  cols_label_with(fn = function(x) tools::toTitleCase(gsub("_", " ", x))) |>
  cols_label(decay_1 = "Decay Mode") |>
  cols_width(
    stub() ~ px(70),
    c(atomic_mass, half_life, decay_1) ~ px(120)
  ) |>
  cols_hide(columns = c(is_stable)) |>
  cols_align(align = "center", columns = decay_1) |>
  opt_align_table_header(align = "left") |>
  opt_vertical_padding(scale = 0.5)
This image of a table was generated from the third code example in the `fmt_chem()` help file.

Function ID

3-20

Function Introduced

v0.11.0 (July 9, 2024)

See Also

Other data formatting functions: data_color(), fmt(), fmt_auto(), fmt_bins(), fmt_bytes(), fmt_country(), fmt_currency(), fmt_date(), fmt_datetime(), fmt_duration(), fmt_email(), fmt_engineering(), fmt_flag(), fmt_fraction(), fmt_icon(), fmt_image(), fmt_index(), fmt_integer(), fmt_markdown(), fmt_number(), fmt_partsper(), fmt_passthrough(), fmt_percent(), fmt_roman(), fmt_scientific(), fmt_spelled_num(), fmt_tf(), fmt_time(), fmt_units(), fmt_url(), sub_large_vals(), sub_missing(), sub_small_vals(), sub_values(), sub_zero()


rstudio/gt documentation built on Jan. 9, 2025, 10:01 a.m.