createSTDdbGC: Create an in-house database for GC-MS annotation

View source: R/createSTDdbGC.R

createSTDdbGC    R Documentation

Create an in-house database for GC-MS annotation


Description

To create an in-house, instrument-specific annotation database, injections of pure standards need to be processed. All patterns in the vicinity of the retention time of the standard (to be provided by the user) will be compared to an external database; in case of a sufficient match, they will be retained in the database. The function generateStdDBGC is not meant to be called directly by the user.
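The "vicinity of the retention time" selection can be pictured with a minimal base-R sketch. The data frame `patterns`, the variable `std_rt` and the tolerance `rt_window` are hypothetical names chosen for illustration; they are not metaMS objects or settings, and the package's own selection logic may differ.

```r
## Hypothetical patterns found in one standards file: an id and the
## retention time (in minutes) at which each pattern elutes.
patterns <- data.frame(id = 1:5,
                       rt = c(10.02, 12.48, 12.51, 12.90, 15.30))

std_rt    <- 12.50   # retention time supplied by the user for the standard
rt_window <- 0.10    # assumed tolerance; not an actual metaMS setting name

## Keep only the patterns eluting close to the expected retention time
candidates <- patterns[abs(patterns$rt - std_rt) <= rt_window, ]
print(candidates)
```

Only the retained candidates would then be compared with the external database.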


Usage

createSTDdbGC(stdInfo, settings, extDB = NULL, manualDB = NULL,
              RIstandards = NULL, nSlaves = 0)
generateStdDBGC(totalXset, settings, extDB = NULL, manualDB = NULL,
                RIstandards = NULL)



Arguments

stdInfo: Information on the standards, given in the form of a data.frame. Minimal information: stdFile, Name, CAS, monoisotopic mass (monoMW), and retention time (rt). The filenames in column stdFile should include path information. If this argument is NULL, this function can be used to process a manually curated DB. Arguments stdInfo and manualDB cannot both be NULL.

settings: A list of settings, to be used in peak picking and pattern comparison.

extDB: The external database containing spectra, with which to compare the patterns found in the standards files.

manualDB: A database of manually curated spectra that will be incorporated in the final DB without any further checks.

totalXset: A list of xset objects, as generated by peakDetection.

RIstandards: A two-column matrix containing both the retention times and the retention indices of the standards that define the RI scale. If not given, no RI values will be calculated and retention times will be used instead.

nSlaves: Number of cores to be used in peak picking.
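As an illustration of the expected input shapes, the sketch below builds a stdInfo data.frame with the minimal columns and a two-column RIstandards matrix in base R. All file paths, compound names, CAS numbers, masses and retention times are placeholders, the column order of the matrix is an assumption, and the linear interpolation with `approx` only illustrates one common way of deriving retention indices; it is not claimed to be what metaMS does internally.

```r
## Hypothetical standards table with the minimal columns.
## Every value below is a placeholder, not real data.
stdInfo <- data.frame(
  stdFile = file.path("path", "to", c("std_mix1.CDF", "std_mix1.CDF")),
  Name    = c("Compound A", "Compound B"),
  CAS     = c("000-00-0", "000-00-1"),
  monoMW  = c(150.07, 212.12),
  rt      = c(12.50, 18.75),
  stringsAsFactors = FALSE
)

## Check that the minimal columns are all present
required <- c("stdFile", "Name", "CAS", "monoMW", "rt")
stopifnot(all(required %in% colnames(stdInfo)))

## Hypothetical RI scale: retention times (min) and retention indices
## of the reference compounds (column order is an assumption).
RIstandards <- cbind(rt = c(5, 10, 15, 20),
                     RI = c(1000, 1200, 1400, 1600))

## Assumed conversion: linear interpolation between neighbouring standards
std_RI <- approx(x = RIstandards[, "rt"], y = RIstandards[, "RI"],
                 xout = stdInfo$rt)$y
print(std_RI)
```

A table like this would be passed as the first argument of createSTDdbGC, with the matrix supplied through RIstandards.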


Details

Function createSTDdbGC creates a database object containing validated pseudospectra for a number of compounds. The injections of the standards, described in the input object stdInfo, are processed using function processStandards; comparison with the external database, inclusion of manual compounds and final formatting are done in function generateStdDBGC. Several situations can be envisaged:

A: a series of injections of standards needs to be compared with a standard library, such as the NIST library. In this case, both stdInfo and extDB need to be non-NULL, and the result will be a database containing only those entries that match the external DB sufficiently well. If manualDB is also non-NULL, its entries will be added too (without checking).

B: for a series of injections no standard library information is available (extDB is NULL, and stdInfo is not), and the function simply returns all patterns eluting around the indicated retention time. This allows for subsequent manual validation and pruning. If manualDB is non-NULL, its entries will be added, but since this is a somewhat unusual thing to do, a warning will be given.

C: a manual database needs to be processed to be usable as a real database. This basically entails renaming the retention time fields (rt becomes std.rt), with a similar action for any RI field.
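The three situations depend only on which arguments are NULL, which can be summarised in a small sketch. The helper `whichSituation` is hypothetical and not part of metaMS; it merely mirrors the rules described above, including the error when both stdInfo and manualDB are NULL and the warning in situation B.

```r
## Hypothetical helper, not part of metaMS: report which situation applies
whichSituation <- function(stdInfo = NULL, extDB = NULL, manualDB = NULL) {
  if (is.null(stdInfo) && is.null(manualDB))
    stop("Arguments stdInfo and manualDB cannot both be NULL")
  if (is.null(stdInfo))
    return("C")                 # only a manual DB to reformat
  if (is.null(extDB)) {
    if (!is.null(manualDB))
      warning("manualDB supplied without extDB: entries are added unchecked")
    return("B")                 # no external library to validate against
  }
  "A"                           # validate against the external library
}

dummy <- list()                 # stand-in for any non-NULL object
sitA <- whichSituation(stdInfo = dummy, extDB = dummy)
sitB <- whichSituation(stdInfo = dummy)
sitC <- whichSituation(manualDB = dummy)
print(c(sitA, sitB, sitC))
```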


Value

The output of createSTDdbGC (and generateStdDBGC, which is the last function called in createSTDdbGC) is a list, where every entry describes one compound/spectrum combination. For use in annotation, the following fields are mandatory: Name, std.rt, pspectrum and monoMW.
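A rough picture of one entry, and of a check for the mandatory fields, is sketched below. The entry contents are placeholders, the layout of pspectrum (a two-column matrix of m/z and intensity) is an assumption, and the helpers `toDBentry` and `hasMandatoryFields` are hypothetical names introduced for illustration only.

```r
## Hypothetical manually curated entry; all values are placeholders and
## the two-column layout of pspectrum is an assumption.
manualEntry <- list(Name      = "Compound A",
                    monoMW    = 150.07,
                    rt        = 12.50,
                    pspectrum = cbind(mz        = c(73, 147, 221),
                                      intensity = c(1000, 450, 120)))

## Rename the rt field to std.rt, as described for a manual database
toDBentry <- function(entry) {
  names(entry)[names(entry) == "rt"] <- "std.rt"
  entry
}

## Check that an entry carries every field needed for annotation
hasMandatoryFields <- function(entry) {
  all(c("Name", "std.rt", "pspectrum", "monoMW") %in% names(entry))
}

DBsketch  <- lapply(list(manualEntry), toDBentry)
ok_before <- hasMandatoryFields(manualEntry)
ok_after  <- all(vapply(DBsketch, hasMandatoryFields, logical(1)))
print(c(before = ok_before, after = ok_after))
```

Running such a check on a finished database is a cheap way to catch entries that would be unusable in annotation.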


Author(s)

Ron Wehrens

See Also

processStandards, generateStdDBGC


Examples

data(threeStdsNIST)  ## provides object smallDB, excerpt from NIST DB
  ## Not run: 
if (require(metaMSdata)) {
  ## Situation A: create a DB of standards.
  ## first tell the system where to look
  all.files <- list.files(system.file("extdata", package = "metaMSdata"),
                          pattern = "_GC_", full.names = TRUE)
  stdInfo[,"stdFile"] <- rep(all.files[3], 3)

  data(FEMsettings)    ## provides a.o. TSQXLS.GC, the GC settings file
  data(threeStdsNIST)  ## provides object smallDB, excerpt from NIST DB

  DB <- createSTDdbGC(stdInfo, TSQXLS.GC, extDB = smallDB)
  ## saved in "threeStdsDB.RData" in the data directory of the metaMS
  ## package

  ## Situation B: do not check the data with an external database. Now
  ## the fields bestDBmatch and validation will be absent.
  DB <- createSTDdbGC(stdInfo, TSQXLS.GC, extDB = NULL)

  ## Situation C: create a DB directly from an msp file (manual DB)
  manual.fname <- list.files(system.file("extdata", package = "metaMSdata"),
                             pattern = "msp", full.names = TRUE)
  manual <- read.msp(manual.fname)
  DB <- createSTDdbGC(stdInfo = NULL, settings = TSQXLS.GC,
                      manualDB = manual)
}
## End(Not run)

rwehrens/metaMS documentation built on Feb. 27, 2023, 5:13 a.m.