simplerspec.opus

The goal of simplerspec.opus is to facilitate spectral measurement workflows for Bruker spectrometers. It is tailored to file outputs and directory structures produced by the Bruker OPUS and OPUS Lab software.

Installation and prerequisites

You can install the most recent version of simplerspec.opus from GitHub using:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("philipp-baumann/simplerspec.opus")

This package assumes that OPUS measurements are organized in folders (one per measurement day).

Perform folder, file, and data and metadata integrity tests

This test verifies whether both metadata and data are complete, separately for each of the “date” folders. The binary Bruker OPUS files, which contain all spectral measurements and acquisition parameters, are located in folders named by date. Unfortunately, the folder names use the format "YYYY_M_D", which does not follow ISO 8601.
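To make the naming issue concrete, here is a minimal base R sketch that converts folder names in the "YYYY_M_D" format to ISO 8601 dates. The folder names are made up for illustration, and this is not necessarily how the package handles the conversion internally.

```r
# Hypothetical folder names in the non-ISO "YYYY_M_D" format
folder_names <- c("2018_9_3", "2018_10_2")

# Parse with an explicit format; single-digit months and days are accepted
dates <- as.Date(folder_names, format = "%Y_%m_%d")

# Format as zero-padded ISO 8601 strings (YYYY-MM-DD)
iso_names <- format(dates, "%Y-%m-%d")
print(iso_names)
```

A folder name that does not match the format yields NA, which can serve as a simple check for misnamed folders.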

# Load packages
pkgs <- c("simplerspec.opus", "here")
lapply(pkgs, library, character.only = TRUE)

Run the test with the function below:

test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))

This test keeps to a minimum the chance that samples were wrongly labelled or measurement positions were swapped.

Test category 1:

Test category 2:

Test category 3: All OPUS files have correct and complete metadata, and vice versa
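The idea behind a two-way completeness check can be sketched in base R with set differences. The file IDs below are invented for illustration, and the sketch does not reproduce the internals of test_data_metadata().

```r
# Hypothetical file IDs derived from OPUS file names in one date folder
file_ids_data <- c("S1_A1.0", "S1_B1.0", "S2_C1.0")
# Hypothetical file IDs expected from the metadata spreadsheet
file_ids_meta <- c("S1_A1.0", "S2_C1.0", "S3_D1.0")

# Files on disk that have no metadata entry
missing_in_meta <- setdiff(file_ids_data, file_ids_meta)
# Metadata entries that have no file on disk ("vice versa")
missing_in_data <- setdiff(file_ids_meta, file_ids_data)

# Report a mismatch in either direction
if (length(missing_in_meta) > 0 || length(missing_in_data) > 0) {
  message("Measurement data files and metadata are not complete.")
}
```

Checking both directions matters: a spectrum without metadata cannot be assigned to a sample, and a metadata row without a spectrum points to a missing or misnamed file.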

The following is an example of the test output when data and metadata do not match expectations:

# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))
# Each measurement folder has a corresponding metadata file.
# `date` column entries in the metadata spreadsheet(s)
#       correspond to the expected measurement folder name (date) in the metadata
#       file name(s)
# Error in struct_data_metadata(data_root = here("data", "spectra", "2018-BDM")) : 
#   Measurment data files and metadata are not complete. Metadata entries for the following measurement data <files are missing: list(date = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), sample_id = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), sample_id_pos = c("NA_A4", "NA_B4", "NA_C4", "NA_D4", "NA_A5", "NA_B5", "NA_C5", "NA_D5", "NA_A6", "NA_B6", "NA_C6", "NA_D6"), plate_no = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6), pos = c("A4", "B4", "C4", "D4", "A5", "B5", "C5", "D5", "A6", "B6", "C6", "D6"
# ), meas_folder_exp = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), meas_rep_meta = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), file_id = c("NA_A4.0", "NA_B4.0", "NA_C4.0", "NA_D4.0", "NA_A5.0", "NA_B5.0", "NA_C5.0", "NA_D5.0", "NA_A6.0", "NA_B

In that case, read the message(s), check the metadata spreadsheets, fix them accordingly, and re-run the test. If the tests succeed, the output looks for example like this:

# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))
# Each measurement folder has a corresponding metadata file.
# `date` column entries in the metadata spreadsheet(s)
#       correspond to the expected measurement folder name (date) in the metadata
#       file name(s)
# Measurement data files and metadata records are complete.
# # A tibble: 3,438 x 10
#    date  sample_id sample_id_pos plate_no pos   meas_folder_exp meas_rep_meta file_id
#    <chr> <chr>     <chr>            <int> <chr> <chr>                   <int> <chr>  
#  1 2018… 30017 KB… 30017 KB017 …        1 B1    2018-09-03                  0 30017 …
#  2 2018… 30017 KB… 30017 KB017 …        3 B1    2018-09-03                  1 30017 …
#  3 2018… 30017 KB… 30017 KB017 …        4 B1    2018-09-03                  2 30017 …
#  4 2018… 30017 KB… 30017 KB017 …        5 B1    2018-09-03                  3 30017 …
#  5 2018… 30017 KB… 30017 KB017 …        6 B1    2018-09-03                  4 30017 …
#  6 2018… 30017 KB… 30017 KB017 …        7 B1    2018-09-03                  5 30017 …
#  7 2018… 30017 KB… 30017 KB017 …        8 B1    2018-09-03                  6 30017 …
#  8 2018… 30017 KB… 30017 KB017 …        9 B1    2018-09-03                  7 30017 …
#  9 2018… 30017 KB… 30017 KB017 …        2 D1    2018-09-03                  0 30017 …
# 10 2018… 30015 KB… 30015 KB015 …        1 C1    2018-09-03                  0 30015 …
# # ... with 3,428 more rows, and 2 more variables: sample_id_reps <int>,
# #   date_file <chr>

Read spectra

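Now it is time to read the spectra.

# Load the packages that provide map() and future_map()
# (assumes purrr and furrr are installed)
library(purrr)
library(furrr)

# Read files in parallel; multisession replaces the deprecated multiprocess plan
future::plan(future::multisession)

# Read OPUS test files
files_wizardproj_2019 <- list_opus_file_paths(
  path = here("data", "spectra", "2019-wizardproj"))

# Outer map() walks the list of file path sets, inner future_map() reads each file
spc_lst_wizardproj_2019 <- map(files_wizardproj_2019,
  ~ future_map(.x,
    ~ read_opus_univ(fnames = .x, extract = c("spc")),
    .progress = TRUE)
)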
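The nested call above returns a list of lists. If a single flat list is more convenient for later steps, one option is to flatten one level in base R. The sketch below uses a made-up stand-in for the result and assumes the outer elements are named; the names and values are hypothetical.

```r
# Hypothetical stand-in for the nested result: outer elements per set of
# files, inner elements per OPUS file (integers replace real spectra here)
spc_nested <- list(
  "2019-01-10" = list("a.0" = 1:3, "b.0" = 4:6),
  "2019-01-11" = list("c.0" = 7:9)
)

# Flatten exactly one level so that each element is one file's result
spc_flat <- unlist(spc_nested, recursive = FALSE)

print(length(spc_flat))
print(names(spc_flat))
```

Using recursive = FALSE keeps each inner element intact instead of collapsing everything into one atomic vector.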


philipp-baumann/simplerspec.opus documentation built on Aug. 15, 2019, 8:25 a.m.