isopattern: Isotope pattern calculation
In enviPat: Isotope Pattern, Profile and Centroid Calculation for Mass Spectrometry

Description

The function calculates the isotopologues ("isotope fine structure") of a given chemical formula or a set of chemical formulas (batch calculation) with fast and memory efficient transition tree algorithms, which can handle relative pruning thresholds. Returns accurate masses, probabilities and isotopic compositions of individual isotopologues. The isotopes of elements can be defined by the user.

Usage

isopattern(isotopes, chemforms, threshold = 0.001, charge = FALSE, 
emass = 0.00054857990924, plotit = FALSE, algo=1, rel_to = 0, verbose = TRUE,
return_iso_calc_amount = FALSE)

Arguments

`isotopes`	Dataframe listing all relevant isotopes, such as `isotopes`.
`chemforms`	Vector with character strings of chemical formulas, such as data set `chemforms` or the second column in the value of `check_chemform`.
`threshold`	Probability below which isotope peaks can be omitted, as specified by argument `rel_to`. Set to `0` if all peaks shall be calculated.
`charge`	z in m/z. Either a single integer or a vector of integers with length equal to that of argument `chemforms`. Set to `FALSE` for omitting any charge calculations.
`emass`	Electron mass; only relevant if `charge` is not set to `FALSE`.
`plotit`	Should results be plotted, `TRUE/FALSE`?
`algo`	Which algorithm to use? Type `1` or `2`. See details.
`rel_to`	Probability definition, numeric `0`, `1`, `2`, `3` or `4`. See details.
`verbose`	Verbose, `TRUE/FALSE`?
`return_iso_calc_amount`	Can usually be ignored; if `TRUE`, intermediate counts of sub-isotopologues are returned instead of fine structures. See details.

Details

Isotope patterns can be calculated with one of two algorithms, selected by argument algo. Both algorithms use transition tree updates to derive the exact mass and probability of a new isotopologue from existing ones, by steps of single isotope replacements. These transition tree approaches are memory-efficient and fast for a wide range of molecular formulas and are able to reproduce the isotope fine structure of molecules. The fine structure must often be pruned during calculation, cf. argument rel_to.

algo=1 grows transition trees within element-wise sub-molecules, whereas algo=2 grows them in larger sub-molecules of two elements, if available. The latter approach can be slightly more efficient for very large or very complex molecules. The sub-isotopologues within sub-molecules are finally combined to the isotopologues of the full molecule. With return_iso_calc_amount=TRUE, intermediate counts of sub-isotopologues are returned instead of fine structures.
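The idea of building element-wise sub-isotopologues and then combining them can be illustrated in base R. The sketch below is not the pruned transition tree algorithm of the package: it enumerates every combination by brute force, handles only elements with two isotopes, and uses approximate isotope masses and abundances typed in by hand rather than taken from the `isotopes` data set. The helper `sub_pattern` is a hypothetical name introduced for this illustration.

```r
# Sub-isotopologues of n atoms of one element with a light and a heavy isotope.
# The number of heavy atoms follows a binomial distribution.
sub_pattern <- function(n, mass_light, mass_heavy, p_heavy) {
  k <- 0:n  # number of heavy isotopes in the sub-molecule
  data.frame(
    mass  = (n - k) * mass_light + k * mass_heavy,
    prob  = dbinom(k, size = n, prob = p_heavy),
    heavy = k
  )
}

# Approximate values, for illustration only (assumption: not read from enviPat)
carbon   <- sub_pattern(10, 12.000000, 13.003355, 0.0107)  # C10
chlorine <- sub_pattern(2,  34.968853, 36.965903, 0.2424)  # Cl2

# Combine the two sub-molecules: masses add, probabilities multiply
grid <- expand.grid(i = seq_len(nrow(carbon)), j = seq_len(nrow(chlorine)))
full <- data.frame(
  mass  = carbon$mass[grid$i] + chlorine$mass[grid$j],
  prob  = carbon$prob[grid$i] * chlorine$prob[grid$j],
  n13C  = carbon$heavy[grid$i],
  n37Cl = chlorine$heavy[grid$j]
)
full <- full[order(full$mass), ]
head(full)
```

Without pruning, the number of rows grows multiplicatively with each element added, which is the cost that the pruning thresholds described for rel_to are meant to limit.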

rel_to offers five options for how probabilities are defined and pruned, each affecting the threshold argument differently. Default option rel_to=0 prunes and returns probabilities relative to the most intense isotope peak; threshold states a percentage of the intensity of that peak. Similarly, option rel_to=1 normalizes relative to the peak consisting of the most abundant isotopes for each element, which is often the monoisotopic one. Option rel_to=2 prunes and returns absolute probabilities; threshold is not a percentage but an absolute cutoff. Options rel_to=3 and rel_to=4 prune relative to the most intense and "monoisotopic" peak, respectively. Although threshold is a percentage for these two options, both return absolute probabilities.
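The five options can be compared on a toy vector of absolute peak probabilities. The following base R sketch only mirrors the description above and is not the package's implementation; `prune_sketch` and its `mono` argument (index of the "monoisotopic" peak) are hypothetical names. It assumes that relative values are scaled so that the reference peak equals 100 and that a peak exactly at the threshold is kept.

```r
prune_sketch <- function(prob, mono, threshold, rel_to = 0) {
  stopifnot(rel_to %in% 0:4)
  if (rel_to == 2) {
    # Absolute cutoff, absolute probabilities returned
    return(prob[prob >= threshold])
  }
  # Reference peak: most intense (0, 3) or "monoisotopic" (1, 4)
  ref <- if (rel_to %in% c(0, 3)) max(prob) else prob[mono]
  relative <- 100 * prob / ref       # percent of the reference peak
  keep <- relative >= threshold      # threshold is a percentage here
  if (rel_to %in% c(0, 1)) relative[keep] else prob[keep]
}

# Toy pattern in which the "monoisotopic" peak (index 1) is not the most intense
prob <- c(0.30, 0.45, 0.20, 0.04, 0.01)

prune_sketch(prob, mono = 1, threshold = 10,  rel_to = 0)  # relative to peak 2
prune_sketch(prob, mono = 1, threshold = 10,  rel_to = 1)  # relative to peak 1
prune_sketch(prob, mono = 1, threshold = 0.1, rel_to = 2)  # absolute cutoff
prune_sketch(prob, mono = 1, threshold = 10,  rel_to = 3)  # pruned like 0, absolute values
prune_sketch(prob, mono = 1, threshold = 10,  rel_to = 4)  # pruned like 1, absolute values
```

In this toy pattern the fourth peak survives pruning relative to the weaker "monoisotopic" peak (rel_to 1 and 4) but not relative to the most intense peak (rel_to 0 and 3).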

Value

List with length equal to the length of vector chemforms; the names of the list entries are the chemical formulas in chemforms. Each entry in that list contains information on individual isotopologues (rows) with columns:

`m/z`	First column; m/z of an isotope peak.
`abundance`	Second column; abundance of an isotope peak. With the default `rel_to = 0`, probabilities are set relative to the most abundant peak of the isotope pattern; see details for other settings of `rel_to`.
`12C, 13C, 1H, 2H, ...`	Third to all other columns; atom counts of individual isotopes for an isotope peak.
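A short sketch of how the returned list might be inspected, assuming the enviPat package is installed. Columns are addressed by position, following the column order given above; the two formulas are arbitrary examples.

```r
if (requireNamespace("enviPat", quietly = TRUE)) {
  library(enviPat)
  data(isotopes)

  pattern <- isopattern(
    isotopes,
    c("C1H4", "C6H12O6"),
    threshold = 0.1,
    plotit = FALSE,
    verbose = FALSE
  )

  names(pattern)                     # one list entry per chemical formula
  glucose <- pattern[["C6H12O6"]]    # isotopologues of one formula, one per row

  mz          <- glucose[, 1]                     # first column: m/z
  abundance   <- glucose[, 2]                     # second column: abundance
  composition <- glucose[, -(1:2), drop = FALSE]  # remaining columns: isotope counts

  glucose[which.max(abundance), ]    # most abundant isotopologue
}
```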

Warning

Setting threshold too low may lead to the unnecessary calculation of low-probability isotope peaks, to the extent that not enough memory is available for either of the two algorithms.

Note

It is highly recommended to check argument chemforms with check_chemform prior to running isopattern; argument chemforms must conform to chemical formulas as defined in check_chemform. Element names must be followed by numbers (atom counts of that element), e.g. C1H4 is a valid argument whereas CH4 is not. Otherwise, numbers may only be used in square brackets to denote individual isotopes defined in the element name column of iso_list, such as [14]C or [18]O. For example, [13]C2C35H67N1O13 is the molecular formula of erythromycin labeled at two C-positions with [13]C; C37H67N1O13 is the molecular formula of the unlabeled compound.
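The recommended order of calls could look as follows, assuming the enviPat package is installed. The formulas are the ones mentioned in this note; the second column of the check_chemform result is passed on, as described for argument chemforms.

```r
if (requireNamespace("enviPat", quietly = TRUE)) {
  library(enviPat)
  data(isotopes)

  formulas <- c("C1H4", "C37H67N1O13", "[13]C2C35H67N1O13")

  # Check the formulas first and inspect the result for flagged entries
  checked <- check_chemform(isotopes, formulas)
  print(checked)

  # Pass the checked formulas (second column) to isopattern
  pattern <- isopattern(
    isotopes,
    as.character(checked[, 2]),
    threshold = 0.1,
    plotit = FALSE,
    verbose = FALSE
  )
}
```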

For correct adduct isotope pattern calculations, please check adducts.

Author(s)

Martin Loos, Christian Gerber

References

Loos, M., Gerber, C., Corona, F., Hollender, J., Singer, H. (2015). Accelerated isotope fine structure calculation using pruned transition trees, Analytical Chemistry 87(11), 5738-5744.

https://pubs.acs.org/doi/abs/10.1021/acs.analchem.5b00941

https://www.envipat.eawag.ch/index.php

Examples


############################
# batch of chemforms #######
data(isotopes)
data(chemforms)
pattern<-isopattern(
  isotopes,
  chemforms,
  threshold=0.1,
  plotit=TRUE,
  charge=FALSE,
  emass=0.00054858,
  algo=1
)
############################
# Single chemical formula ##
data(isotopes) 
pattern<-isopattern(
  isotopes,
  "C100H200S2Cl5",
  threshold=0.1,
  plotit=TRUE,
  charge=FALSE,
  emass=0.00054858,
  algo=1
)
############################

enviPat documentation built on Oct. 21, 2022, 5:06 p.m.