knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
The goal of simplerspec.opus is to facilitate spectral measurement workflows for Bruker spectrometers. It is tailored to file outputs and directory structures produced by the Bruker OPUS and OPUS Lab software.
You can install the most recent version of simplerspec.opus using
if (!require("remotes")) install.packages("remotes") remotes::install_github("philipp-baumann/simplerspec.opus")
This package assumes that OPUS measurements are organized in folders (one per measurement day).
This test verifies whether both metadata and data are
complete, for each of the "date" folders separately. The binary Bruker OPUS
files, that contain all spectral measurements and aquisition parameters, are
located in folders named by date. The names are unluckily in the format
"YYYY_M_D"
. That does not follow
ISO 8601.
# Load packages pkgs <- c("simplerspec.opus", "here") lapply(pkgs, library, character.only = TRUE)
You can do the test using the function below:
test_data_metadata(data_root = here("data", "spectra", "2018-BDM"))
This test ensures that there is a minimum chance that samples were wrongly labelled or measurement positions were swapped.
Expectation: Metadata are Open Office spreadsheets with extension
.ods
. The file name has a date prefix of format YYYY-MM-DD
.
Test: Test whether every measurement folder has a corresponding
metadata file. This is done by extracting "YYYY_M_D"
from
the metadata file names, and converting to "YYYY-MM-DD"
, and check if
metadata file name information and and folder names are all identical.
Expectation: Every metadata spreadsheet has a date
column with
header named identically, where entries represent dates in format
YYYY-MM-DD
.
Test: Test whether the date
column entries are matching the expanded
measurement folders for each date.
Expectations: The metadata spreadsheets contain a column pos
that
specifies the well plate positions, and also a column sample_id
. The
sample_id
of a sample needs to be unique. The OPUS files that are generated
after measurements have a _<plate-position>
suffix after a sample_id
string.
Test: In order to check that there is a metadata entry for each measured
file, file names are reconstructed from sample_id
and pos
information.
Further, repetition numbers starting from 0 are sequentially numbered by an
integer increasing by 1, are reconstructed as file extension
.<repetition_number>
. This is what is expected to be generated from the
measurements managed by the Bruker OPUS lab interface.
This could be an example test message when data and metadata are not following expectations:
# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM")) # Each measurement folder has a corresponding metadata file. # `date` column entries in the metadata spreadsheet(s) # correspond to the expected measurement folder name (date) in the metadata # file name(s) # Error in struct_data_metadata(data_root = here("data", "spectra", "2018-BDM")) : # Measurment data files and metadata are not complete. Metadata entries for the following measurement data <files are missing: list(date = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), sample_id = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), sample_id_pos = c("NA_A4", "NA_B4", "NA_C4", "NA_D4", "NA_A5", "NA_B5", "NA_C5", "NA_D5", "NA_A6", "NA_B6", "NA_C6", "NA_D6"), plate_no = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6), pos = c("A4", "B4", "C4", "D4", "A5", "B5", "C5", "D5", "A6", "B6", "C6", "D6" # ), meas_folder_exp = c("2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02", "2018-10-02"), meas_rep_meta = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), file_id = c("NA_A4.0", "NA_B4.0", "NA_C4.0", "NA_D4.0", "NA_A5.0", "NA_B5.0", "NA_C5.0", "NA_D5.0", "NA_A6.0", "NA_B
In that case, have a look at the message(s), check the metadata spreadsheets, fix accordingly, and finally re-test. If tests succeed, you will receive e.g.
# > test_data_metadata(data_root = here("data", "spectra", "2018-BDM")) # Each measurement folder has a corresponding metadata file. # `date` column entries in the metadata spreadsheet(s) # correspond to the expected measurement folder name (date) in the metadata # file name(s) # Measurement data files and metadata records are complete. # # A tibble: 3,438 x 10 # date sample_id sample_id_pos plate_no pos meas_folder_exp meas_rep_meta file_id # <chr> <chr> <chr> <int> <chr> <chr> <int> <chr> # 1 2018… 30017 KB… 30017 KB017 … 1 B1 2018-09-03 0 30017 … # 2 2018… 30017 KB… 30017 KB017 … 3 B1 2018-09-03 1 30017 … # 3 2018… 30017 KB… 30017 KB017 … 4 B1 2018-09-03 2 30017 … # 4 2018… 30017 KB… 30017 KB017 … 5 B1 2018-09-03 3 30017 … # 5 2018… 30017 KB… 30017 KB017 … 6 B1 2018-09-03 4 30017 … # 6 2018… 30017 KB… 30017 KB017 … 7 B1 2018-09-03 5 30017 … # 7 2018… 30017 KB… 30017 KB017 … 8 B1 2018-09-03 6 30017 … # 8 2018… 30017 KB… 30017 KB017 … 9 B1 2018-09-03 7 30017 … # 9 2018… 30017 KB… 30017 KB017 … 2 D1 2018-09-03 0 30017 … # 10 2018… 30015 KB… 30015 KB015 … 1 C1 2018-09-03 0 30015 … # # ... with 3,428 more rows, and 2 more variables: sample_id_reps <int>, # # date_file <chr>
Henceforth, time is ready to read the spectra ;-)
future::plan(future::multiprocess) # Read OPUS test files files_wizardproj_2019 <- list_opus_file_paths( path = here("data", "spectra", "2019-wizardproj")) spc_lst_wizardproj_2019 <- map(files_wizardproj_2019, ~ future_map(.x, ~ read_opus_univ(fnames = .x, extract = c("spc")), .progress = TRUE) )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.