This tutorial introduces the isorunN2O package and provides examples of how it works and can be used. Please install isorunN2O following the instructions on GitHub. The newest version of this tutorial can always be loaded as a vignette directly in R by calling vignette("N2O_data_reduction_tutorial") at the command line.
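
If you haven't installed the package yet, a minimal install sketch using the same devtools-based pattern as the rest of this tutorial might look like the following (this assumes the sebkopf/isorunN2O GitHub repository; the README there has the authoritative instructions):

# install isorunN2O from GitHub (see the README for full instructions)
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
devtools::install_github("sebkopf/isorunN2O")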

Loading test data

The package includes an example data set (test_run) to work with for testing and demonstration purposes. Because the original data files would be too big to include, it is stored as the cached, compacted data set that the iso_save() command from the isoreader package creates from the raw data files. When you run this on your own data sets, simply change root_folder to point to where you keep all your data (absolute path or relative to your current working directory), e.g. root_folder <- file.path("MAT", "results"), and run_folders to all the run folders you want to read in (can be one or multiple). The first time you load your own data using isoreader it may take a few minutes (speed it up using the parallel = TRUE parameter), but afterwards it will always be fast because the data is already cached. The following data was read using isoreader version `r packageVersion("isoreader")`.

library(isoreader)
root_folder <- system.file("extdata", package = "isorunN2O") 
run_folders <- "test_run" # could be multiple, e.g. run_folders <- c("run1", "run2")
iso_files <- iso_read_continuous_flow(run_folders, root = root_folder) %>%
  # filter out files that have reading errors
  iso_filter_files_with_problems()
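
For your own data, the same pattern applies. A minimal sketch, assuming the hypothetical folder layout mentioned above and using parallel reading to speed up the first import:

# hypothetical paths - adjust to wherever you keep your data
root_folder <- file.path("MAT", "results")
run_folders <- c("run1", "run2")
iso_files <- iso_read_continuous_flow(run_folders, root = root_folder, parallel = TRUE) %>%
  iso_filter_files_with_problems()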

Raw data

The iso_files variable now holds all your raw data. You can look at the names of all the loaded files by running the following (here only the first 5 for brevity; also note that we're using the %>% pipe operator to pass output from one function to the next, which might look a little strange at first but makes the code more readable further on):

names(iso_files) %>% head(n=5) 

# install isoprocessor from GitHub if not already installed
if (!requireNamespace("devtools", quietly = TRUE)) 
  install.packages("devtools")
if (!requireNamespace("isoprocessor", quietly = TRUE))
  devtools::install_github("isoverse/isoprocessor")

You can use the file names to take a look at specific chromatograms using the functionality provided in the isoprocessor package (version `r packageVersion("isoprocessor")`).

library(isoprocessor)
iso_files$`MAT25392080_P02E_run02_Conditioner-0000.dxf` %>% 
  iso_plot_continuous_flow_data(color = data, panel = NULL)

Explore

If you'd like to explore the chromatograms (or really any of the data extracted from the raw data files) in more depth, you can visually explore some of the core functionality of isoreader and isoprocessor using the isoviewer graphical user interface (version `r packageVersion("isoviewer")`) with the command below (use the Close button in the GUI to return to the interactive R session):

# install isoviewer from GitHub if not already installed
if (!requireNamespace("devtools", quietly = TRUE)) 
  install.packages("devtools")
if (!requireNamespace("isoviewer", quietly = TRUE))
  devtools::install_github("isoverse/isoviewer")
isoviewer::iso_start_viewer()

Processing

Data processing makes use of generic data table and isotope functionality provided by the dplyr, isoreader and isoprocessor packages as well as N2O specific tools implemented in the isorunN2O package.

library(dplyr)
library(isoreader)
library(isoprocessor)
library(isorunN2O)

Data processing step 1 (first look)

In the first step, we parse the file info from the sequence (take a look at what's available with iso_get_file_info(iso_files)), pull out the peak table from the iso files (together with the file info), focus on only the N2O peak (no need for the reference peaks), and keep only the main columns we are interested in. Everything is chained together with the pipe %>% operator for better readability.
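
For a quick exploratory look at which file info columns are available (just the first few rows, using the iso_get_file_info() call mentioned above):

iso_files %>% iso_get_file_info() %>% head(n=5)

With that overview in hand, the full first processing step looks like this: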

df.raw <- iso_files %>%
  # extract all sample information (former parse_file_names)
  iso_mutate_file_info(
    folder = basename(dirname(file_path)),
    date = file_datetime,
    analysis = Analysis,
    run_number = parse_integer(Row),
    # category is the first part of the `Identifier 1`
    category = extract_word(`Identifier 1`, include_underscore = TRUE, include_dash = TRUE),
    # name is the full `Identifier 1` value
    name = `Identifier 1`,
    # volume information is stored in `Identifier 2`
    volume = parse_number(`Identifier 2`)
  ) %>% 
  # aggregate peak table
  iso_get_vendor_data_table(include_file_info = everything()) %>%
  # select N2O peak
  select_N2O_peak(c(360, 370)) %>% 
  # select all relevant columns 
  select_columns(folder:volume, area = `Intensity All`, d45 = `d 45N2O/44N2O`, d46 = `d 46N2O/44N2O`)

Now to get a sense for what the data looks like, let's look at the first couple of rows. To look at the complete data frame, you can always call View(df.raw) or double click on the name in the Environment tab on the upper right.

df.raw %>% head(n=5) 

To check the category makeup of your run, make use of some handy dplyr functionality:

df.raw %>% group_by(category) %>% tally()

Additionally, isoprocessor provides convenience functions for inspecting the data, including iso_summarize_data_table(). Formatting options for data tables are provided by the kable function from the knitr package, which we'll use here to get nicely formatted tabular output.

df.raw %>% group_by(category) %>% 
  # summarize area, d45 and d46 for each category
  iso_summarize_data_table(area, d45, d46) %>%
  # format for easier display
  knitr::kable()

To home in further on different data groups, simply modify the group_by:

df.raw %>% group_by(category, name) %>% 
  iso_summarize_data_table(area, d45, d46, cutoff = 3) %>% 
  knitr::kable()

For a visual first look at the data, you can use the versatile iso_plot_data() function (see ?iso_plot_data), which generates a ggplot:

library(ggplot2)
df.raw %>% iso_plot_data(
  x = run_number, y = d45, points = TRUE,
  # shape = 21 with fill makes data points with black borders
  shape = 21, fill = category, 
  panel = category ~ .
)

or, a little more elaborately, specifying in more detail how to color/fill and panel the overview plot:

df.raw %>% iso_plot_data(
  x = run_number, y = d45, points = TRUE,
  size = area,
  shape = 21, fill = c(type = ifelse(category %in% c("IAEA-NO3", "USGS-34"), name, category)), 
  panel = factor(category, levels = c("N2O", "IAEA-NO3", "USGS-34")) ~ .
)

or as an interactive plot (mouse-over information and zooming), which is a little easier for data exploration (ggplotly() converts the last generated plot into an interactive one by default):

library(plotly)
ggplotly(dynamicTicks = TRUE)

Data processing step 2 (continued)

From the first look it is clear that there are a couple of things we need to consider: there is one sample that was marked as questionable during injection (#68), which we'd like to exclude for now, and there were also a couple of samples that were controls rather than standards and should go into their own category. Lastly, it appears there is some drift, so we will want to evaluate that.

Categories

df.cat <- df.raw %>% 
  change_category(run_number == 68, "excluded") %>%
  change_category(name %in% c("IAEA-NO3 37 uM ctrl", "USGS-34 37 uM ctrl"), "control")
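
To confirm the category changes took effect, the same tally approach from the first look works here as well:

df.cat %>% group_by(category) %>% tally()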

Drift correction

The evaluate_drift function provides a number of different strategies for evaluating drift using different correction methods; here we're trying a local polynomial regression fit (method = "loess") and are correcting with the standards as well as N2O. We also want to see a summary plot of the drift using plot = TRUE (the default), which will plot the drift fits on top of the original data (normalized to the average isotope values in each group) and the residuals after applying the correction. For details, see ?evaluate_drift. The drift correction stores the drift-corrected values in d45.drift and d46.drift.

df.drift <- df.cat %>% 
  evaluate_drift(
    d45, d46, correct = TRUE, plot = TRUE,
    correct_with = category %in% c("USGS-34", "IAEA-NO3", "N2O"),
    method = "loess"
  )

Let's take a quick look at how we're doing after drift correction:

df.drift %>% 
  iso_plot_data(
    x = run_number, y = d45.drift, points = TRUE,
    shape = 21, fill = category,
    panel = factor(category, levels = c("N2O", "IAEA-NO3", "USGS-34")) ~ .
  )

Data processing step 3 (continued)

Now that we're drift corrected, time to switch to $\delta^{15}N$ and $\delta^{18}O$ space and calibrate against our standards.

O17 correction

We're doing the O17 correction here (instead of before the drift correction), but it is a matter of discussion whether the drift correction or the O17 correction should be applied first. The O17 correction introduces the new columns d15.raw and d18.raw.

df.O17 <- df.drift %>%
  correct_N2O_for_17O(d45.drift, d46.drift) %>% 
  # no longer need these columns now that we're in d15 and d18 space
  select_columns(-d45, -d45.drift, -d46, -d46.drift) 
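
As with the earlier steps, it can be useful to take a quick peek at the newly introduced columns (a minimal check using dplyr's select):

df.O17 %>% select(name, d15.raw, d18.raw) %>% head(n=5)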

Calibration

The last steps are calculating the background (see ?calculate_background), calculating concentrations (see ?calculate_concentrations), and then calibrating $\delta^{15}N$ and $\delta^{18}O$ (see ?calibrate_d15 and ?calibrate_d18). Note that the background calculation is not currently used for the calibration, since only multi-point calibration is implemented, but it is a good check to see its value.

df.cal <- df.O17 %>% 
  calculate_background(area) %>%
  calculate_concentrations(area, volume, conc_pattern = "(\\d+)uM", 
                           standards = category %in% c("USGS-34", "IAEA-NO3")) %>% 
  calibrate_d15(d15.raw, standards = c(`USGS-34` = -1.8, `IAEA-NO3` = 4.7)) %>%
  calibrate_d18(d18.raw, cell_volume = 1.5, standards = c(`USGS-34` = -27.93, `IAEA-NO3` = 25.61))

Summary

At the end of the data processing, there are a couple of ways to summarize the data, including the iso_summarize_data_table() function introduced earlier (here used to compare the raw vs. calibrated values with different groupings), but also generate_parameter_table(), which summarizes all the parameters recorded during the data processing calls:

df.cal %>% group_by(category) %>% 
  iso_summarize_data_table(cutoff = 3, d15.raw, d15.cal, d18.raw, d18.cal) %>% 
  arrange(desc(n)) %>%
  knitr::kable() 

df.cal %>% 
  generate_parameter_table() %>% 
  knitr::kable() 

And of course visually, e.g. in an interactive plot with additional mouse-over info using the label parameter and the iso_format function:

df.cal %>% 
  iso_plot_data(
    x = run_number, y = d15.cal, points = TRUE,
    size = amount,
    shape = 21, fill = c(type = ifelse(category %in% c("IAEA-NO3", "USGS-34"), name, category)),
    panel = factor(category, levels = c("N2O", "IAEA-NO3", "USGS-34")) ~ .,
    label = c(info = iso_format(
      NULL = name,
      d15 = round(d15.cal, 2),
      d18 = round(d18.cal, 2),
      amount = round(amount, 3)
    ))
  ) %>%
  ggplotly(dynamicTicks = TRUE, tooltip = c("fill", "label"))

And some simpler single-data plots (using filter from the dplyr package) to look specifically at the samples and the DPR control.

df.cal %>% 
  filter(category %in% c("P02E", "DPR")) %>% 
  iso_plot_data(
    x = run_number, y = c(d15.cal, d18.cal), points = TRUE,
    shape = 21, fill = c(info = paste(category, panel))
  ) %>% 
  ggplotly(dynamicTicks = TRUE)

Additional customization is also possible using ggplot functionality. For example, you can use the shape parameter for symbol differentiation, and to visualize all the standards' key values in separate panels you can make use of facet_wrap:

plot <-
  df.cal %>% 
  filter(category %in% c("IAEA-NO3", "USGS-34")) %>%
  iso_plot_data(
    x = run_number, y = c(amount, d15.cal, d18.cal), points = TRUE,
    shape = name, fill = name, # use multiple shapes, define with scale_shape_manual
    label = c(info = iso_format(
      `#` = run_number,
      d15 = round(d15.cal, 2),
      d18 = round(d18.cal, 2)
    ))
  ) +
  facet_wrap(panel ~ category, scales = "free", ncol = 2) +
  scale_shape_manual(values = 21:25)

ggplotly(p = plot, dynamicTicks = TRUE, tooltip = c("fill", "label"))

Export data

At any point during the process, if you would like to export the data to Excel, this is easy with iso_export_data_to_excel (here using a couple of filter options in select to skip parameters and raw values, and using arrange to sort the data):

df.cal %>% 
  select(-starts_with("p."), -ends_with(".raw"), -ends_with(".drift"), d15 = d15.cal, d18 = d18.cal) %>%
  arrange(category, name) %>%
  iso_export_data_to_excel(filepath = "export.xlsx")

