
provoc


Perform a Rapid Overview for Volatile Organic Compounds.

The provoc package was developed to support PTR-ToF-MS users in their analyses. It is designed to import data into R quickly and to show the first results within a few minutes. It automatically detects peaks and provides a matrix for further analysis. Some chemometrics functions are also provided.

It is still a young and wild package, so feedback and new ideas for its development are welcome. Do not hesitate to contact the author to get or provide help.

To cite this package:

Installation

The development version can be installed from GitHub using:

# install.packages("remotes")
remotes::install_github("jhuguenin/provoc")

The package requires the update of many dependencies:

Description of functions

Main functions

Secondary functions

Manage your data

Find index or position

Usage

library(provoc)

Import

Before importing, place all h5 files in a directory named h5 inside the working directory. The h5 file names may contain the recording date and time in the form _yyyymmdd_hhmmss. This information is removed during import: for example, 00_file_PTR_ToF_MS_20210901_093055.h5 is renamed 00_file_PTR_ToF_MS by the import.
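The renaming rule can be illustrated with a small base R sketch. The helper strip_timestamp() is hypothetical and is not part of provoc; it assumes the timestamp sits at the very end of the file name, just before the .h5 extension, and the way import.h5() actually handles names may differ.

```r
# Hypothetical helper (not a provoc function): drop the .h5 extension and
# a trailing _yyyymmdd_hhmmss timestamp from a file name.
strip_timestamp <- function(file_name) {
  base <- sub("\\.h5$", "", basename(file_name))
  # 8 digits for the date, 6 digits for the time, anchored at the end
  sub("_[0-9]{8}_[0-9]{6}$", "", base)
}

strip_timestamp("00_file_PTR_ToF_MS_20210901_093055.h5")
#> [1] "00_file_PTR_ToF_MS"
strip_timestamp("01_file_PTR_ToF_MS.h5") # no timestamp: only the extension is dropped
#> [1] "01_file_PTR_ToF_MS"
```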

Each file is an acquisition with several spectra.

# working directory
wd <- "~/R/data_test/miscalenous" # without final "/"
setwd(wd) # only needed if you don't work in a project

# + wd/
# |  \- h5/
# |     \- 00_file_PTR_ToF_MS.h5
# |     \- 01_file_PTR_ToF_MS.h5
# |     \- 02_file_PTR_ToF_MS.h5

# import
sp <- import.h5(wd) # or just sp <- import.h5() if you work in a project

The import.h5() function automatically creates a directory named "Figures", a csv file for the metadata named "meta_empty.csv", and a list sp. This list contains:

You can control part of the analysis with the "meta_empty.csv" file. It is a table with one row per imported acquisition. The columns are:

You can prepare several meta files. By activating or deactivating acquisitions in each file, you can run different analyses while doing the (often long) import only once. You can rename the files meta_1, meta_2, or give them more explicit names.

If the import is stopped because of a corrupted file, use info.h5() and delete.spectra.h5() to correct this file.

If your h5 files have a different structure, let me know and I can add an option for you.

Optimize the import

If you want to speed up the data import, there are three options to consider:

Preparation

After the import, a csv file was created in the working directory. This file should look like this:

It is designed to be opened and saved easily with Excel on Windows 10. If your default settings do not allow this, let me know and I can add an option for you.

Once the file is open, you can edit it to fill in information such as the color or modality of your samples. The example here shows the first three cycles of a sequence with four samples: two with modality A, one with modality B, and one blank. There are 12 spectra per sample.
With the acq_T0 column, I indicate which acquisitions belong to each sample. This is useful for the "time" option when producing the graph.

sp <- import.meta("meta_1") # without '.csv'

Then, to refine my analysis, I decided to superimpose the T0 of each sample to make the comparison easier. To do this, I subtracted 300 (or 600, or 900) seconds, because each sample is analyzed for 5 minutes. I also removed sample 2 using the used column. Finally, I kept only the last 6 spectra (out of 12) by modifying the start column.

sp <- import.meta("meta_2")

To be able to switch from one graphical representation to another quickly, I created two files meta_1.csv and meta_2.csv that I import according to my needs.

All operations performed during the analysis are recorded, and this trace is easy to save. Replaying a saved workflow automatically is not available yet.

saveRDS(sp$workflow, "workflow.rds")
wf <- readRDS("workflow.rds")

After preparing the meta file, recalculate the time with re.calc.T.para() if you need to. The other function, re.init.T.para(), reinitializes the time.

sp <- re.calc.T.para(sp)
sp <- re.init.T.para(sp)

Be careful. By default, the "time" option uses a relative T0 from the first spectrum of each acquisition and the "date" option uses the actual date and time of each spectrum. Using the acq_T0 column with the "time" option allows different acquisitions to be sequenced using the T0 of the specified acquisition. Using the acq_T0 column with the "date" option allows you to overlap acquisitions on the T0 of the specified acquisition.

The delta_T column is used to add the specified time (in seconds) to the acquisition.
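The arithmetic behind delta_T can be sketched in base R. The numbers below are illustrative assumptions (four samples measured back to back for 5 minutes each), and the vectors acq_start and delta_T only mimic the idea; this is not how provoc stores or applies these values internally.

```r
# Illustrative values only: four samples measured one after another,
# 300 s each, so their first spectra start at 0, 300, 600 and 900 s.
acq_start <- c(0, 300, 600, 900)

# Assumed delta_T values (in seconds): a negative shift moves an
# acquisition back so that every sample starts at T0 = 0.
delta_T <- c(0, -300, -600, -900)

# delta_T is added to the time of each acquisition.
shifted_start <- acq_start + delta_T
shifted_start
#> [1] 0 0 0 0
```

With the shifted times, the kinetics of the four samples overlap on a common time axis, which makes them easier to compare.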

Make a plot

With the following three functions, it is really easy to make graphs to explore your data.

dy.spectra and fx.spectra plot the spectra as dynamic and fixed figures, respectively. Set sel_sp to a numeric vector giving the numbers of the spectra to use (sp$Sacq). For fx.spectra, pkm and pkM are the minimum and maximum limits.

# a dynamic plot:
dy.spectra(sel_sp = sp$mt$meta[sp$acq, "end"], new_color = FALSE)
# a standard (fixed) plot:
fx.spectra(sel_sp = sp$mt$meta[sp$acq, "end"], pkm = 137, pkM = 137, leg = "l")
fx.spectra(sel_sp = 1, pkm = 59, pkM = 150)

kinetic.plot plots the evolution of the peaks.

kinetic.plot(M_num = M.Z.max(c(59, 137)), each_mass = TRUE,
             group = "grp1", graph_type = "dy",
             Y_exp = FALSE, time_format = "date")

Run an MCR

After a univariate analysis, provoc allows a multivariate analysis using the MCR algorithm. This technique is detailed in the article: Anna de Juan, Joaquim Jaumot and Roma Tauler (2014). Multivariate Curve Resolution (MCR). Solving the mixture analysis problem. https://doi.org/10.1039/C4AY00571F

By default, the MCR constraints are currently the same as those of the "alsace" package. They are consistent with an analysis of PTR-ToF-MS data. The function lets you set the number of components used for the MCR (ncMCR) and specify a variable selection (pk_sel). You can also specify a column of the meta_xxx.csv file to group the spectra by modality.

Arguments
- ncMCR : (integer) number of components of the MCR.
- grp : a character string giving the name of the group column.
- pk_sel : a vector of selected peaks, or "all".
- time_format : a character string, "date" or "time".
- Li : the list with spectra (sp).

mcr.result <- mcr.voc(ncMCR = 3, grp = "modality", pk_sel = "all", 
                      time_format = "date", Li = sp)

Other functions

The package includes several small utility functions.
Three of them deserve a clear explanation because they let you find the index of peaks or spectra. ind.acq finds the spectra related to an acquisition. For example, if you have taken a series of 60 acquisitions, each with 8 spectra, ind.acq locates the 8 spectra of the 42nd acquisition [ ind.acq(42) #329 330 331 332 333 334 335 336]. ind.pk works in the same way but on the position of the peaks. This function should not be confused with M.Z (or M.Z.max), which finds the peaks detected during the import.
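The index arithmetic behind the example can be checked with a short base R sketch. The function acq_indices() is hypothetical and assumes every acquisition has the same number of spectra; ind.acq() works on the imported data and does not need that assumption.

```r
# Hypothetical helper (not a provoc function): positions of the spectra
# of one acquisition when all acquisitions have n_spectra spectra.
acq_indices <- function(acq, n_spectra = 8) {
  first <- (acq - 1) * n_spectra + 1 # spectra of earlier acquisitions come first
  seq(first, length.out = n_spectra)
}

acq_indices(42)
#> [1] 329 330 331 332 333 334 335 336
```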

And now ...

You now have everything you need to work with provoc. If you have specific needs, questions or remarks, you can contact me at my email address (joris.name [at] cefe.cnrs.fr).

See you



JHuguenin/provoc documentation built on Jan. 29, 2024, 12:39 a.m.