mbWorkflow: MassBank record creation workflow

View source: R/createMassBank.R

Description

Uses data generated by msmsWorkflow to create MassBank records.

Usage

mbWorkflow(
  mb,
  steps = c(1, 2, 3, 4, 5, 6, 7, 8),
  infolist_path = "./infolist.csv",
  gatherData = "online",
  filter = TRUE
)

Arguments

mb

The mbWorkspace to work in.

steps

Which steps in the workflow to perform.

infolist_path

The path where newly downloaded compound information is stored. This file should then be inspected manually.

gatherData

Whether to retrieve information from several online databases (gatherData = "online") or to use the local babel installation (gatherData = "babel"). Note that babel is used either way if a directory is given in the settings. This setting is ignored if retrieval is set to "standard".

filter

If TRUE, the peaks are filtered according to the standard processing workflow in RMassBank: only the best formula for a peak is retained, and only peaks passing multiplicity filtering are retained. If FALSE, it is assumed that the user has already done the filtering, and all peaks in the spectrum are printed in the record (with or without a formula).
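
The effect of the filter argument can be illustrated with a small base R sketch. This is not RMassBank's implementation: the peak table, its column names, the function filter_peaks, the threshold min_multiplicity, and the choice of "smallest absolute mass error" as the criterion for the best formula are all assumptions made for illustration.

```r
# Hypothetical peak table (not an RMassBank data structure): one row per
# candidate formula, with a mass error and the number of spectra in which
# the peak was observed.
peaks <- data.frame(
  mz        = c(100.0757, 100.0757, 150.0550, 200.1000),
  formula   = c("C5H10NO", "C4H8N2O", "C8H8NO2", NA),
  ppm_error = c(1.2, 4.8, 0.5, NA),
  n_spectra = c(3, 3, 1, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented behaviour of `filter`.
filter_peaks <- function(peaks, filter = TRUE, min_multiplicity = 2) {
  # filter = FALSE: the caller has filtered already, so keep every peak,
  # with or without a formula.
  if (!filter) return(peaks)

  # Assumption: only annotated peaks can have a "best formula".
  annotated <- peaks[!is.na(peaks$formula), , drop = FALSE]

  # Assumption: "best" means smallest absolute mass error per m/z.
  ord <- order(annotated$mz, abs(annotated$ppm_error))
  annotated <- annotated[ord, , drop = FALSE]
  best <- annotated[!duplicated(annotated$mz), , drop = FALSE]

  # Keep only peaks seen in enough spectra (multiplicity filtering).
  best[best$n_spectra >= min_multiplicity, , drop = FALSE]
}

kept <- filter_peaks(peaks, filter = TRUE)
all_peaks <- filter_peaks(peaks, filter = FALSE)
```

In this toy table, filtering keeps a single row for m/z 100.0757, while filter = FALSE returns all four rows unchanged.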

Details

See the vignette vignette("RMassBank") for detailed information about the usage.

Steps:

Step 1: Find which compounds don't have annotation information yet. For these compounds, pull information from several databases (using gatherData).

Step 2: If new compounds were found, then export the infolist.csv and stop the workflow. Otherwise, continue.

Step 3: Take the archive data (in table format) and reformat it to MassBank tree format.

Step 4: Compile the spectra. Using the skeletons from the archive data, create MassBank records per compound and fill them with peak data for each spectrum. Also, assign accession numbers based on scan mode and relative scan no.

Step 5: Convert the internal tree-like representation of the MassBank data into flat-text string arrays (essentially text-file style, but still in memory).

Step 6: For all OK records, generate a corresponding molfile with the structure of the compound, based on the SMILES entry from the MassBank record. (This molfile exists in memory only and is not yet a physical file.)

Step 7: If necessary, generate the appropriate subdirectories, and actually write the files to disk.

Step 8: Create the list.tsv in the molfiles folder, which is required by MassBank to attribute substances to their corresponding structure molfiles.
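
The way the steps argument selects parts of the workflow, and the way Step 2 can end a run early, can be sketched in base R as follows. This is a minimal illustration of the control flow described above, not the actual body of mbWorkflow; the function run_steps, its new_compounds argument, and the log messages are hypothetical.

```r
# Hypothetical stand-in for the step selection logic: it only records
# which steps would run, and performs no MassBank processing.
run_steps <- function(steps = 1:8, new_compounds = character(0)) {
  log <- character(0)

  # Step 1: look up compounds that lack annotation information.
  if (1 %in% steps) {
    log <- c(log, "1: gather compound information")
  }

  # Step 2: if new compounds were found, export the infolist and stop so
  # that the file can be inspected manually before continuing.
  if (2 %in% steps && length(new_compounds) > 0) {
    log <- c(log, "2: export infolist and stop")
    return(log)
  }

  # Steps 3 to 8 run in order, but only if they were requested.
  for (s in intersect(3:8, steps)) {
    log <- c(log, sprintf("%d: run", s))
  }
  log
}

first_pass  <- run_steps(1:8, new_compounds = "hypothetical compound")
second_pass <- run_steps(3:8)
```

Under these assumptions, a first pass that finds new compounds ends after Step 2, and a second pass with steps = 3:8 (after the infolist has been checked and loaded) runs the remaining steps.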

Value

The processed mbWorkspace.

Author(s)

Michael A. Stravs, Eawag <michael.stravs@eawag.ch>

See Also

mbWorkspace-class

Examples

## Not run: 
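# w is an msmsWorkspace produced by msmsWorkflow
mb <- newMbWorkspace(w)
# Load the manually checked infolists from a directory
mb <- loadInfolists(mb, "D:/myInfolistPath")
# Run steps 1 to 3; newly gathered compound information goes to newinfos.csv
mb <- mbWorkflow(mb, steps = 1:3, infolist_path = "newinfos.csv")
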
## End(Not run)

sneumann/RMassBank documentation built on Oct. 20, 2020, 3:19 p.m.