Operations

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

Isoreader provides a number of general purpose operations that work on all supported IRMS data formats such as caching of files, parallel processing and catching read errors. This vignette demonstrates some of these general operations.

# load isoreader package
library(isoreader)

Supported file types

# list all suported file types
iso_get_supported_file_types() %>%
  dplyr::select(extension, software, description, type) %>%
  knitr::kable()

Messages

By default, isoreader is quite verbose to let the user know what is happening. However, most functions can be silenced by adding the parameter quiet = TRUE to the function call. This can also be done globally using iso_turn_info_messages_off()

# read a file in the default verbose mode
iso_get_reader_example("dual_inlet_example.did") %>%
  iso_read_dual_inlet() %>%
  iso_select_file_info(file_datetime, `Identifier 1`) %>%
  iso_get_file_info() %>%
  knitr::kable()

# read the same file but make the read process quiet
iso_get_reader_example("dual_inlet_example.did") %>%
  iso_read_dual_inlet(quiet = TRUE) %>%
  iso_select_file_info(file_datetime, `Identifier 1`) %>%
  iso_get_file_info() %>%
  knitr::kable()

# read the same file but turn all isoreader messages off
iso_turn_info_messages_off()
iso_get_reader_example("dual_inlet_example.did") %>%
  iso_read_dual_inlet(quiet = TRUE) %>%
  iso_select_file_info(file_datetime, `Identifier 1`) %>%
  iso_get_file_info() %>%
  knitr::kable()

# turn message back on
iso_turn_info_messages_on()

Caching

By default, isoreader caches files as R objects to make access faster in the future. This feature can be turned off if you want to force a fresh read from the source file. Alternatively, you can clear the entire isoreader cache in your working directory to clean up previous file reads.

# cleanup reader cache
iso_cleanup_reader_cache()

# read a new file (notice the time elapsed)
cf_file <- iso_get_reader_example("continuous_flow_example.dxf") %>%
  iso_read_continuous_flow()

# re-read the same file much faster (it will be read from cache)
cf_file <- iso_get_reader_example("continuous_flow_example.dxf") %>%
    iso_read_continuous_flow()

# turn reader caching off
iso_turn_reader_caching_off()

# re-read the same file (it will NOT be read from cache)
cf_file <- iso_get_reader_example("continuous_flow_example.dxf") %>%
  iso_read_continuous_flow()

# turn reader caching back on
iso_turn_reader_caching_on()

Parallel processing

Isoreader supports parallel processing of data files based on the number of processors available in a computer simply by setting the parallel = TRUE flag in any file read operation. This makes it possible to read large quantities of data files much more quickly on a multi-core system (i.e. most modern laptops).

However, whether parallel processing yields significant improvements in read speeds depends on the number of available processors, file types and operating system. In theory, parallel processing always reduces computation time but in practice this is offset by various factors including the size of the data that needs to be sent back and forth between the processors, file system read/write speed, and the spin-up time for new processes. Generally speaking, parallel processing can provide significant improvements in speed with larger number of files (~10+) and more complex read operations (e.g. continuous flow > dual inlet > scan file). Reading from cache is so efficient that there are rarely gains from parallel processing and it is usually faster NOT to read in parallel once a set of files is already cached.

# read 3 files in parallel (note that this is usually not a large enough file number to be worth it)
di_files <-
  iso_read_dual_inlet(
    iso_get_reader_example("dual_inlet_example.did"),
    iso_get_reader_example("dual_inlet_example.caf"),
    iso_get_reader_example("dual_inlet_nu_example.txt"),
    nu_masses = 49:44,
    parallel = TRUE
  )

Combining / subsetting isofiles

All isoreader objects are lists that can be combined or subset to work with only specific files or create a larger collection.

# all 3 di_files read above
di_files

# only one of the files (by index)
di_files[[2]]

# only one of the files (by file_id)
di_files$dual_inlet_example.did

# a subset of the files (by index)
di_files[c(1,3)]

# a subset of the files (by file_id)
di_files[c("dual_inlet_example.did", "dual_inlet_example.caf")]

# same result using iso_filter_files (more flexible + verbose output)
di_files %>% iso_filter_files(
  file_id %in% c("dual_inlet_example.did", "dual_inlet_example.caf")
)

# recombining subset files
c(
  di_files[3],
  di_files[1]
)

Dealing with file read problems

Isoreader is designed to catch problems during file reading without crashing the read pipeline. It keeps track of all problems encountered along the way to make it easy to see what went wrong and remove erroneous files. Most times, files that were only partly saved because of an interrupted instrument analysis will have errors. If you encounter a file that should have intact data in it but has an error in isoreader, please file a bug report and submit your file at https://github.com/isoverse/isoreader/issues

# read two files, one of which is erroneous
iso_files <-
  iso_read_continuous_flow(
    iso_get_reader_example("continuous_flow_example.dxf"),
    system.file("errdata", "cf_without_data.dxf", package = "isoreader")
  )

# retrieve problem summary
iso_files %>% iso_get_problems_summary() %>% knitr::kable()

# retrieve problem details
iso_files %>% iso_get_problems() %>% knitr::kable()

# filter out erroneous files
iso_files <- iso_files %>% iso_filter_files_with_problems()

Re-reading files

If a file has changed (e.g. is edited through the vendor software) and the changes should be loaded in isoreader, it is easy to re-read and update just those files within a file collection by using the iso_reread_changed_files() function. If some of the files are no longer accessible at their original location, it will throw a warning. If the location for all files has changed, it can be easily adjusted by modifying the file_root file info parameter using iso_set_file_root().

Similar functions can be used to re-read outdated files from an older isoreader version (iso_reread_outdated_files()), attempt to re-read problematic files that had read errors/warnings (iso_reread_problem_files()), or simply re-read all files in a collection (iso_reread_all_files()).

# re-read the 3 dual inlet files from their original location if any have changed
di_files %>%
  iso_reread_changed_files()

# update the file_root for the files before re-read (in this case to a location
# that does not hold these files and hence will lead to a warning)
di_files %>%
  iso_set_file_root(root = ".") %>%
  iso_reread_all_files()

Units

Isoreader provides a built in data type with units (iso_with_units) that can be used to easily keep track of units inside data frame. These units can be made explicit (=included in the column header), stripped altogether, or turned back to be implicit.

# strip all units
cf_file %>%
  iso_get_vendor_data_table(select = c(`Ampl 28`, `rIntensity 28`, `d 15N/14N`)) %>%
  iso_strip_units() %>% head(3)

# make units explicit
cf_file %>%
  iso_get_vendor_data_table(select = c(`Ampl 28`, `rIntensity 28`, `d 15N/14N`)) %>%
  iso_make_units_explicit() %>% head(3)

# introduce new unit columns e.g. in the file info
cf_file %>%
  iso_mutate_file_info(weight = iso_with_units(0.42, "mg")) %>%
  iso_get_vendor_data_table(select = c(`Ampl 28`, `rIntensity 28`, `d 15N/14N`),
                            include_file_info = weight) %>%
  iso_make_units_explicit() %>% head(3)

# or turn a column e.g. with custom format units in the header into implicit units
cf_file %>%
  iso_mutate_file_info(weight.mg = 0.42) %>%
  iso_get_vendor_data_table(select = c(`Ampl 28`, `rIntensity 28`, `d 15N/14N`),
                            include_file_info = weight.mg) %>%
  iso_make_units_implicit(prefix = ".", suffix = "") %>% head(3)

Formatting

Formatting data into text is easily achieved with the built in R function sprintf but this package also provides a convenience function that knows how to incorporate units information from iso_with_units values. Use iso_format to format and concatenate any single values or entire columns inside a data frame.

# concatenation example with single values
iso_format(
   pi = 3.14159,
   x = iso_with_units(42, "mg"),
   ID = "ABC",
   signif = 4,
   sep = " | "
)

# example inside a data frame
cf_file %>%
  iso_get_vendor_data_table(select = c(`Nr.`, `Ampl 28`, `d 15N/14N`)) %>%
  dplyr::select(-file_id) %>%
  head(3) %>%
  # introduce new label columns using iso_format
  dplyr::mutate(
    # default concatenation of values
    label_default = iso_format(
      `Nr.`, `Ampl 28`, `d 15N/14N`,
      sep = ", "
    ),
    # concatenate with custom names for each value
    label_named = iso_format(
      `#` = `Nr.`, A = `Ampl 28`, d15 = `d 15N/14N`,
      sep = ", "
    ),
    # concatenate just the values and increase significant digits
    label_value = iso_format(
      `Nr.`, `Ampl 28`, `d 15N/14N`,
      sep = ", ", format_names = NULL, signif = 6
    )
  )


Try the isoreader package in your browser

Any scripts or data that you put into this service are public.

isoreader documentation built on Nov. 19, 2021, 1:07 a.m.