generateCompoundsSIRIUS: Compound annotation with SIRIUS

generateCompoundsSIRIUS R Documentation

Compound annotation with SIRIUS

Description

Uses SIRIUS in combination with CSI:FingerID for compound annotation.

Usage

generateCompoundsSIRIUS(fGroups, ...)

## S4 method for signature 'featureGroups'
generateCompoundsSIRIUS(
  fGroups,
  MSPeakLists,
  relMzDev = 5,
  adduct = NULL,
  projectPath = NULL,
  elements = "CHNOP",
  profile = "qtof",
  formulaDatabase = NULL,
  fingerIDDatabase = "pubchem",
  noise = NULL,
  cores = NULL,
  topMost = 100,
  topMostFormulas = 5,
  login = "check",
  alwaysLogin = FALSE,
  extraOptsGeneral = NULL,
  extraOptsFormula = NULL,
  verbose = TRUE,
  splitBatches = FALSE,
  dryRun = FALSE
)

## S4 method for signature 'featureGroupsSet'
generateCompoundsSIRIUS(
  fGroups,
  MSPeakLists,
  relMzDev = 5,
  adduct = NULL,
  projectPath = NULL,
  ...,
  setThreshold = 0,
  setThresholdAnn = 0,
  setAvgSpecificScores = FALSE
)

Arguments

fGroups

featureGroups object which should be annotated. This should be the same or a subset of the object that was used to create the specified MSPeakLists. In the case of a subset only the remaining feature groups in the subset are considered.

...

(sets workflow) Further arguments passed to the non-sets workflow method.

MSPeakLists

An MSPeakLists object that was generated for the supplied fGroups.

relMzDev

Maximum relative deviation between the measured and candidate formula m/z values (in ppm). Sets the --ppm-max command line option.

adduct

An adduct object (or something that can be converted to it with as.adduct). Examples: "[M-H]-", "[M+Na]+". If the featureGroups object has adduct annotations, these are used when adduct=NULL.

(sets workflow) The adduct argument is not supported for sets workflows, since the adduct annotations are then always used.

projectPath, dryRun

These are mainly for internal purposes. projectPath sets the output directory for the SIRIUS output (a temporary directory if NULL). If dryRun is TRUE then no computations are done and only the results from projectPath are processed.

(sets workflow) projectPath should be a character vector specifying the path for each set.

elements

Elements to be considered for formula calculation. This heavily affects the number of candidates! Always try to work with a minimal set by excluding elements you don't expect. The minimum/maximum number of atoms per element can also be specified. For example, a value of "C[5]H[10-15]O" will only consider formulae with up to five carbon atoms, between ten and fifteen hydrogen atoms and any number of oxygen atoms. Sets the --elements command line option.
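As an illustration of the bracket syntax described above, the following sketch assembles an elements string from per-element limits. The makeElements helper is hypothetical and not part of patRoon; it only pastes text together and does not validate what SIRIUS accepts.

```r
# Hypothetical helper (NOT part of patRoon): builds a string for the
# 'elements' argument. Use NA for an element without a count restriction.
makeElements <- function(...) {
  lims <- c(...)
  # Append "[limit]" only where a limit was given
  suffix <- ifelse(is.na(lims), "", paste0("[", lims, "]"))
  paste0(names(lims), suffix, collapse = "")
}

# Reproduces the example from the argument description
elements <- makeElements(C = "5", H = "10-15", O = NA)
print(elements) # "C[5]H[10-15]O"
```

The resulting string could then be passed as elements = elements.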

profile

Name of the configuration profile, for example: "qtof", "orbitrap", "fticr". Sets the --profile command line option.

formulaDatabase

If not NULL, use a database for retrieval of formula candidates. Possible values are: "pubchem", "bio", "kegg", "hmdb". Sets the --database command line option.

fingerIDDatabase

Database specifically used for CSI:FingerID. If NULL, the value of the formulaDatabase parameter will be used or "pubchem" when that is also NULL. Sets the --fingerid-db option.

noise

Median intensity of the noise (NULL ignores this parameter). Sets the --noise command line option.

cores

The number of cores SIRIUS will use. If NULL then the default of all cores will be used.

topMost

Only keep this number of candidates (per feature group) with highest score. Set to NULL to always keep all candidates, however, please note that this may result in significant usage of CPU/RAM resources for large numbers of candidates.

topMostFormulas

Do not return more than this number of candidate formulae. Note that only compounds for these formulae will be searched. Sets the --candidates command line option.

login, alwaysLogin

Specifies if and how the SIRIUS account login should be handled:

login=FALSE: no automatic login is performed and the active login status is not checked.

login="check": aborts if no active login is present.

login="interactive": interactively ask for login (using getPass).

login=c(username="...", password="..."): perform the login with the given details. For security reasons, please do not enter the details directly, but use e.g. environment variables or store/retrieve them with the keyring package.

If alwaysLogin=TRUE then a login is always performed, otherwise only if SIRIUS reports no active login.

See the SIRIUS website and patRoon handbook for more information.
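The advice to keep credentials out of scripts could be followed as in the sketch below. The siriusLoginFromEnv helper and the environment variable names are assumptions made for illustration; they are not defined by patRoon or SIRIUS.

```r
# Hypothetical helper (NOT part of patRoon): reads the SIRIUS account details
# from environment variables. The variable names are assumptions.
siriusLoginFromEnv <- function(userVar = "SIRIUS_USERNAME",
                               passVar = "SIRIUS_PASSWORD") {
  creds <- c(username = Sys.getenv(userVar, unset = NA),
             password = Sys.getenv(passVar, unset = NA))
  # Fail early if either variable is missing or empty
  if (anyNA(creds) || any(!nzchar(creds)))
    stop("Please set the ", userVar, " and ", passVar, " environment variables")
  creds
}

# Possible usage (fGroups and mslists are assumed to exist already):
# compounds <- generateCompoundsSIRIUS(fGroups, mslists,
#                                      login = siriusLoginFromEnv())
```

The variables could be set in the user's .Renviron file so that they never appear in the analysis script.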

extraOptsGeneral, extraOptsFormula

A character vector with any extra command line parameters for SIRIUS. For SIRIUS versions <4.4 there is no distinction between general and formula options. Otherwise, command line options specified in extraOptsGeneral are added before the formula command, while options specified in extraOptsFormula are added after it. See the SIRIUS manual for more details. Set to NULL to ignore.

verbose

If TRUE then more output is shown in the terminal.

splitBatches

If TRUE then the calculations done by SIRIUS will be evenly split over multiple SIRIUS calls (which may be run in parallel depending on the set package options). If splitBatches=FALSE then all feature calculations are performed from a single SIRIUS execution, which is often the fastest if calculations are performed on a single computer.

setThreshold

(sets workflow) Minimum abundance for a candidate among all sets (‘0-1’). For instance, a value of ‘1’ means that the candidate needs to be present in all the set data.

setThresholdAnn

(sets workflow) As setThreshold, but only taking into account the set data that contain annotations for the feature group of the candidate.

setAvgSpecificScores

(sets workflow) If TRUE then set specific scorings (e.g. MS/MS match) are also averaged.
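The difference between setThreshold and setThresholdAnn can be illustrated with a small calculation. This is only one reading of the argument descriptions above, not the patRoon implementation; the setAbundance function and the set names are hypothetical.

```r
# Illustration only (NOT patRoon code).
# present:   was the candidate found in each set?
# annotated: does each set contain any annotations for the candidate's
#            feature group?
setAbundance <- function(present, annotated) {
  c(all = mean(present),                   # compared against setThreshold
    annotated = mean(present[annotated]))  # compared against setThresholdAnn
}

# Candidate found in the "positive" set only; the "negative" set has no
# annotations at all for this feature group.
ab <- setAbundance(present = c(positive = TRUE, negative = FALSE),
                   annotated = c(positive = TRUE, negative = FALSE))

keptBySetThreshold <- ab[["all"]] >= 1          # FALSE: absent from one set
keptBySetThresholdAnn <- ab[["annotated"]] >= 1 # TRUE: present in every annotated set
```

Under this reading, setThreshold=1 would discard the candidate, whereas setThresholdAnn=1 would keep it.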

Details

This function uses SIRIUS to generate compound candidates. It is called when generateCompounds is invoked with algorithm="sirius".

Similar to generateFormulasSIRIUS, candidate formulae are generated with SIRIUS. These results are then fed to CSI:FingerID to acquire candidate structures. Candidate formulae without any assigned structure will be removed (unlike generateFormulasSIRIUS). This method requires the availability of MS/MS data, and feature groups without it will be ignored.
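A minimal sketch of a typical call is given below. It assumes that a featureGroups object and a matching MSPeakLists object (with MS/MS data) were created in earlier workflow steps. The annotateWithSIRIUS wrapper is hypothetical and the argument values are illustrative, not recommendations.

```r
# Sketch only: 'fGroups' and 'mslists' are assumed to come from earlier
# workflow steps. The wrapper merely groups the arguments in one place.
annotateWithSIRIUS <- function(fGroups, mslists) {
  patRoon::generateCompoundsSIRIUS(
    fGroups, mslists,
    relMzDev = 5,                 # ppm
    adduct = "[M+H]+",            # omit if fGroups has adduct annotations
    elements = "CHNOPS",          # keep this set as small as possible
    profile = "orbitrap",
    fingerIDDatabase = "pubchem",
    topMost = 25,
    topMostFormulas = 5,
    login = "check"               # aborts if no active SIRIUS login is present
  )
}

# The same method is reached through the generic interface:
# compounds <- generateCompounds(fGroups, mslists, algorithm = "sirius", ...)
```

Calling annotateWithSIRIUS(fGroups, mslists) requires a working SIRIUS installation and an active login.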

Value

A compoundsSIRIUS object.

Parallelization

generateCompoundsSIRIUS uses multiprocessing to parallelize computations. Please see the parallelization section in the handbook for more details and the patRoon options documentation for the available configuration options.

Note

For annotations performed with SIRIUS it is often fastest to keep the default splitBatches=FALSE. In this case, all SIRIUS output will be printed to the terminal (unless verbose=FALSE or patRoon.MP.method="future"). Furthermore, please note that only annotations to be performed for the same adduct are grouped in a single batch execution.
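When batches are split, the number of simultaneous SIRIUS calls is governed by the patRoon package options. The sketch below assumes that the patRoon.MP.maxProcs option limits the number of parallel processes; please verify this against the patRoon options documentation. The annotateInBatches wrapper is hypothetical.

```r
# Assumption: patRoon.MP.maxProcs caps the number of parallel processes used
# by patRoon's multiprocessing. Check the patRoon options documentation.
options(patRoon.MP.maxProcs = 4)

# Sketch only: 'fGroups' and 'mslists' are assumed to exist already.
annotateInBatches <- function(fGroups, mslists) {
  patRoon::generateCompoundsSIRIUS(
    fGroups, mslists,
    adduct = "[M+H]+",
    splitBatches = TRUE, # split the work over multiple SIRIUS calls
    verbose = FALSE
  )
}
```

Whether this is faster than the default single execution depends on the system, so timing both variants on a small subset may be worthwhile.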

See Also

generateCompounds for more details and other algorithms.


rickhelmus/patRoon documentation built on Nov. 22, 2024, 3:11 p.m.