In philipp-baumann/simplerspec.opus: Infrared spectroscopy data checking and workflow routines

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

simplerspec.opus

The goal of simplerspec.opus is to facilitate spectral measurement workflows for Bruker spectrometers. It is tailored to file outputs and directory structures produced by the Bruker OPUS and OPUS Lab software.

Installation and prerequisites

You can install the most recent version of simplerspec.opus using

if (!require("remotes")) install.packages("remotes")
remotes::install_github("philipp-baumann/simplerspec.opus")

This package assumes that OPUS measurements are organized in folders (one per measurement day).

Perform folder, file, and data and metadata integrity tests

This test verifies whether both metadata and data are complete, for each of the "date" folders separately. The binary Bruker OPUS files, that contain all spectral measurements and aquisition parameters, are located in folders named by date. The names are unluckily in the format "YYYY_M_D". That does not follow
ISO 8601.

# Load packages
pkgs <- c("simplerspec.opus", "here")
lapply(pkgs, library, character.only = TRUE)

You can do the test using the function below:

test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))

This test ensures that there is a minimum chance that samples were wrongly labelled or measurement positions were swapped.

Test category 1:

Expectation: Metadata are Open Office spreadsheets with extension .ods. The file name has a date prefix of format YYYY-MM-DD.
Test: Test whether every measurement folder has a corresponding metadata file. This is done by extracting "YYYY_M_D" from the metadata file names, and converting to "YYYY-MM-DD", and check if metadata file name information and and folder names are all identical.

Test category 2:

Expectation: Every metadata spreadsheet has a date column with header named identically, where entries represent dates in format YYYY-MM-DD.
Test: Test whether the date column entries are matching the expanded measurement folders for each date.

Tests category 3: All OPUS files have the correct complete metadata and vice versa

Expectations: The metadata spreadsheets contain a column pos that specifies the well plate positions, and also a column sample_id. The sample_id of a sample needs to be unique. The OPUS files that are generated after measurements have a _<plate-position> suffix after a sample_id string.
Test: In order to check that there is a metadata entry for each measured file, file names are reconstructed from sample_id and pos information. Further, repetition numbers starting from 0 are sequentially numbered by an integer increasing by 1, are reconstructed as file extension
.<repetition_number>. This is what is expected to be generated from the measurements managed by the Bruker OPUS lab interface.

This could be an example test message when data and metadata are not following expectations:

# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))
# Each measurement folder has a corresponding metadata file.
# `date` column entries in the metadata spreadsheet(s)
#       correspond to the expected measurement folder name (date) in the metadata
#       file name(s)
# Error in struct_data_metadata(data_root = here("data", "spectra", "2018-BDM")) : 
#   Measurment data files and metadata are not complete. Metadata entries for the following measurement data <files are missing: list(date = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), sample_id = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), sample_id_pos = c("NA_A4", "NA_B4", "NA_C4", "NA_D4", "NA_A5", "NA_B5", "NA_C5", "NA_D5", "NA_A6", "NA_B6", "NA_C6", "NA_D6"), plate_no = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6), pos = c("A4", "B4", "C4", "D4", "A5", "B5", "C5", "D5", "A6", "B6", "C6", "D6"
# ), meas_folder_exp = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), meas_rep_meta = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), file_id = c("NA_A4.0", "NA_B4.0", "NA_C4.0", "NA_D4.0", "NA_A5.0", "NA_B5.0", "NA_C5.0", "NA_D5.0", "NA_A6.0", "NA_B

In that case, have a look at the message(s), check the metadata spreadsheets, fix accordingly, and finally re-test. If tests succeed, you will receive e.g.

# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))
# Each measurement folder has a corresponding metadata file.
# `date` column entries in the metadata spreadsheet(s)
#       correspond to the expected measurement folder name (date) in the metadata
#       file name(s)
# Measurement data files and metadata records are complete.
# # A tibble: 3,438 x 10
#    date  sample_id sample_id_pos plate_no pos   meas_folder_exp meas_rep_meta file_id
#    <chr> <chr>     <chr>            <int> <chr> <chr>                   <int> <chr>  
#  1 2018… 30017 KB… 30017 KB017 …        1 B1    2018-09-03                  0 30017 …
#  2 2018… 30017 KB… 30017 KB017 …        3 B1    2018-09-03                  1 30017 …
#  3 2018… 30017 KB… 30017 KB017 …        4 B1    2018-09-03                  2 30017 …
#  4 2018… 30017 KB… 30017 KB017 …        5 B1    2018-09-03                  3 30017 …
#  5 2018… 30017 KB… 30017 KB017 …        6 B1    2018-09-03                  4 30017 …
#  6 2018… 30017 KB… 30017 KB017 …        7 B1    2018-09-03                  5 30017 …
#  7 2018… 30017 KB… 30017 KB017 …        8 B1    2018-09-03                  6 30017 …
#  8 2018… 30017 KB… 30017 KB017 …        9 B1    2018-09-03                  7 30017 …
#  9 2018… 30017 KB… 30017 KB017 …        2 D1    2018-09-03                  0 30017 …
# 10 2018… 30015 KB… 30015 KB015 …        1 C1    2018-09-03                  0 30015 …
# # ... with 3,428 more rows, and 2 more variables: sample_id_reps <int>,
# #   date_file <chr>

Read spectra

Henceforth, time is ready to read the spectra ;-)

future::plan(future::multiprocess)

# Read OPUS test files
files_wizardproj_2019 <-  list_opus_file_paths(
  path = here("data", "spectra", "2019-wizardproj"))

spc_lst_wizardproj_2019 <- map(files_wizardproj_2019, ~ future_map(.x,
  ~ read_opus_univ(fnames = .x,
    extract = c("spc")),
  .progress = TRUE)
)

philipp-baumann/simplerspec.opus documentation built on Aug. 15, 2019, 8:25 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

philipp-baumann/simplerspec.opus
Infrared spectroscopy data checking and workflow routines

In philipp-baumann/simplerspec.opus: Infrared spectroscopy data checking and workflow routines

simplerspec.opus

Installation and prerequisites

Perform folder, file, and data and metadata integrity tests

Test category 1:

Test category 2:

Tests category 3: All OPUS files have the correct complete metadata and vice versa

Read spectra

R Package Documentation

Browse R Packages

We want your feedback!

philipp-baumann/simplerspec.opus Infrared spectroscopy data checking and workflow routines

In philipp-baumann/simplerspec.opus: Infrared spectroscopy data checking and workflow routines

simplerspec.opus

Installation and prerequisites

Perform folder, file, and data and metadata integrity tests

Test category 1:

Test category 2:

Tests category 3: All OPUS files have the correct complete metadata and vice versa

Read spectra

R Package Documentation

Browse R Packages

We want your feedback!

philipp-baumann/simplerspec.opus
Infrared spectroscopy data checking and workflow routines