isopattern: Isotope pattern calculation

isopattern R Documentation

Isotope pattern calculation

Description

Calculates the isotopologues ("isotope fine structure") of a given chemical formula or of a set of chemical formulas (batch calculation) using fast and memory-efficient transition tree algorithms that support relative pruning thresholds. Returns the accurate masses, probabilities and isotopic compositions of the individual isotopologues. The isotopes of the elements can be defined by the user.

Usage

isopattern(isotopes, chemforms, threshold = 0.001, charge = FALSE,
  emass = 0.00054857990924, plotit = FALSE, algo = 1, rel_to = 0,
  verbose = TRUE, return_iso_calc_amount = FALSE)

Arguments

isotopes

Data frame listing all relevant isotopes, such as the data set isotopes.

chemforms

Vector with character strings of chemical formulas, such as data set chemforms or the second column in the value of check_chemform.

threshold

Probability below which isotope peaks can be omitted, as specified by argument rel_to. Set to 0 if all peaks shall be calculated.

charge

z in m/z. Either a single integer or a vector of integers with length equal to that of argument chemforms. Set to FALSE for omitting any charge calculations.

emass

Electron mass; only relevant if charge is not set to FALSE.

plotit

Should results be plotted, TRUE/FALSE?

algo

Algorithm to use, either 1 or 2. See Details.

rel_to

Probability definition; one of the numeric values 0, 1, 2, 3 or 4. See Details.

verbose

Verbose, TRUE/FALSE?

return_iso_calc_amount

Can usually be ignored. If TRUE, intermediate counts of sub-isotopologues are returned instead of the fine structures. See Details.

Details

Isotope pattern calculation can be done by choosing one of two algorithms, set by argument algo. Both algorithms use transition tree updates to derive the exact mass and probability of a new isotopologue from existing ones, in steps of single isotope replacements. These transition tree approaches are memory-efficient and fast for a wide range of molecular formulas and are able to reproduce the isotope fine structure of molecules. The fine structure must often be pruned during calculation; cf. arguments threshold and rel_to.
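The single-replacement update described above can be illustrated with a minimal base R sketch. It is a simplified illustration for one two-isotope element (carbon), not the package's implementation: the isotope masses and abundances are rounded values assumed for this example, and the variable names are hypothetical. Each step replaces one 12C by one 13C and derives the mass and probability of the new isotopologue from the previous one.

```r
# Assumed, rounded isotope data for carbon (illustrative values only)
m12 <- 12.000000; p12 <- 0.9893
m13 <- 13.003355; p13 <- 0.0107
n <- 10  # atom count of the sub-molecule C10

mass <- numeric(n + 1)
prob <- numeric(n + 1)

# Start from the isotopologue that contains only 12C
mass[1] <- n * m12
prob[1] <- p12^n

for (k in 0:(n - 1)) {
  # Replace one 12C by 13C: from k to k + 1 heavy atoms
  mass[k + 2] <- mass[k + 1] + (m13 - m12)
  prob[k + 2] <- prob[k + 1] * (n - k) / (k + 1) * (p13 / p12)
}

result <- data.frame(n13C = 0:n, mass = mass, prob = prob)
```

Because each update multiplies by the ratio of two neighbouring binomial terms, the probabilities in this sketch agree with dbinom(0:n, n, p13).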

algo=1 grows transition trees within element-wise sub-molecules, whereas algo=2 grows them in larger sub-molecules of two elements, if available. The latter approach can be slightly more efficient for very large or very complex molecules. The sub-isotopologues within sub-molecules are finally combined into the isotopologues of the full molecule. With return_iso_calc_amount=TRUE, intermediate counts of sub-isotopologues are returned instead of fine structures.
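How sub-isotopologues of element-wise sub-molecules combine into isotopologues of the full molecule can be sketched in base R as follows. This is a conceptual illustration without pruning, not the package's code: sub_pattern is a hypothetical helper that only handles two-isotope elements, and the isotope data are rounded values assumed for the example (C10Cl2). Masses of the sub-molecules add, and their probabilities multiply.

```r
# Hypothetical helper: pattern of a sub-molecule of n atoms of an element
# with one light and one heavy isotope (binomial probabilities)
sub_pattern <- function(n, m_light, m_heavy, p_heavy) {
  k <- 0:n  # number of heavy isotopes
  data.frame(
    mass = (n - k) * m_light + k * m_heavy,
    prob = dbinom(k, n, p_heavy)
  )
}

# Assumed, rounded isotope data (illustrative values only)
carbon   <- sub_pattern(10, 12.000000, 13.003355, 0.0107)  # C10
chlorine <- sub_pattern(2, 34.968853, 36.965903, 0.2424)   # Cl2

# Every pairing of a C10 and a Cl2 sub-isotopologue is one isotopologue
# of C10Cl2: masses add, probabilities multiply
combined <- data.frame(
  mass = as.vector(outer(carbon$mass, chlorine$mass, "+")),
  prob = as.vector(outer(carbon$prob, chlorine$prob, "*"))
)
combined <- combined[order(combined$mass), ]
```

The full combination has 11 x 3 = 33 rows here; for larger molecules the number of combinations grows quickly, which is why pruning during the calculation matters.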

rel_to offers five options for how probabilities are defined and pruned, each affecting the threshold argument differently. The default rel_to=0 prunes and returns probabilities relative to the most intense isotope peak; threshold states a percentage of the intensity of that peak. Similarly, rel_to=1 normalizes relative to the peak consisting of the most abundant isotopes for each element, which is often the monoisotopic one. rel_to=2 prunes and returns absolute probabilities; threshold is then not a percentage but an absolute cutoff. rel_to=3 and rel_to=4 prune relative to the most intense and the "monoisotopic" peak, respectively; threshold is still a percentage, but both options return absolute probabilities.
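The five rel_to options can be paraphrased with a small base R sketch. prune_peaks is a hypothetical helper written for this illustration, not a function of the package; it applies the pruning to a finished vector of absolute probabilities, whereas the package prunes during the calculation. Scaling the reference peak to 100 and keeping peaks at or above the threshold are assumptions made here.

```r
# Hypothetical helper: prob holds absolute peak probabilities, mono is the
# index of the peak built from the most abundant isotope of each element
prune_peaks <- function(prob, mono, threshold, rel_to = 0) {
  stopifnot(rel_to %in% 0:4)
  if (rel_to == 2) {
    # Absolute cutoff, absolute probabilities returned
    return(prob[prob >= threshold])
  }
  # Reference peak: most intense (0, 3) or "monoisotopic" (1, 4)
  ref <- if (rel_to %in% c(0, 3)) max(prob) else prob[mono]
  relative <- 100 * prob / ref  # percent of the reference peak
  keep <- relative >= threshold
  # Options 0 and 1 return relative values, 3 and 4 absolute ones
  if (rel_to %in% c(0, 1)) relative[keep] else prob[keep]
}

prob <- c(0.30, 0.45, 0.20, 0.04, 0.01)  # made-up example pattern
rel0 <- prune_peaks(prob, mono = 1, threshold = 5, rel_to = 0)
rel1 <- prune_peaks(prob, mono = 1, threshold = 5, rel_to = 1)
rel2 <- prune_peaks(prob, mono = 1, threshold = 0.05, rel_to = 2)
rel3 <- prune_peaks(prob, mono = 1, threshold = 5, rel_to = 3)
```

In this made-up pattern the most intense peak is not the "monoisotopic" one, so rel_to=0 and rel_to=1 scale the same peaks differently, while rel_to=3 keeps the same peaks as rel_to=0 but reports them as absolute probabilities.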

Value

List with one entry per element of the vector chemforms; each entry is named by the corresponding chemical formula in chemforms. Each entry contains one row per isotopologue, with columns:

m/z

First column; m/z of an isotope peak.

abundance

Second column; abundance of an isotope peak. With the default rel_to=0, probabilities are set relative to the most abundant peak of the isotope pattern; see Details for the other options.

12C, 13C, 1H, 2H, ...

Third to all other columns; atom counts of individual isotopes for an isotope peak.

Warning

Very low values for threshold may lead to the unnecessary calculation of low-probability isotope peaks, to the extent that not enough memory is available for either of the two algorithms.

Note

It is highly recommended to check argument chemforms with check_chemform prior to running isopattern; argument chemforms must conform to chemical formulas as defined in check_chemform. Element names must be followed by numbers (atom counts of that element); e.g., C1H4 is a valid argument whereas CH4 is not. Apart from atom counts, numbers may only be used in square brackets to denote individual isotopes defined in the element name column of iso_list, such as [14]C or [18]O. For example, [13]C2C35H67N1O13 is the molecular formula of erythromycin labeled at two C-positions with [13]C; C37H67N1O13 is the molecular formula of the unlabeled compound.

For correct adduct isotope pattern calculations, please check adducts.
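The recommended check before running isopattern might look like the following sketch, which uses the two erythromycin formulas from this note. It only runs if enviPat is installed. Passing isotopes and the formula vector to check_chemform, converting the second column of its value with as.character, and the names checked and pattern_checked are assumptions made for this example.

```r
if (requireNamespace("enviPat", quietly = TRUE)) {
  library(enviPat)
  data(isotopes)

  # Unlabeled and [13]C-labeled erythromycin, as given in this note
  formulas <- c("C37H67N1O13", "[13]C2C35H67N1O13")

  # Check the formulas first; the second column of the value holds the
  # formulas to pass on to isopattern
  checked <- check_chemform(isotopes, formulas)

  pattern_checked <- isopattern(
    isotopes,
    as.character(checked[, 2]),
    threshold = 0.1,
    charge = 1,       # z in m/z; emass is then relevant
    plotit = FALSE,
    verbose = FALSE
  )

  # One list entry per formula; rows are isotopologues
  names(pattern_checked)
  head(pattern_checked[[1]])
}
```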

Author(s)

Martin Loos, Christian Gerber

References

Loos, M., Gerber, C., Corona, F., Hollender, J., Singer, H. (2015). Accelerated isotope fine structure calculation using pruned transition trees, Analytical Chemistry 87(11), 5738-5744.

https://pubs.acs.org/doi/abs/10.1021/acs.analchem.5b00941

https://www.envipat.eawag.ch/index.php

See Also

isopattern, chemforms, check_chemform, getR, envelope, vdetect, check_several

Examples


library(enviPat)

############################
# Batch of chemical formulas
data(isotopes)
data(chemforms)
pattern <- isopattern(
  isotopes,
  chemforms,
  threshold = 0.1,
  plotit = TRUE,
  charge = FALSE,      # no charge calculation, so emass is not used
  emass = 0.00054858,
  algo = 1
)
############################
# Single chemical formula
data(isotopes)
pattern <- isopattern(
  isotopes,
  "C100H200S2Cl5",
  threshold = 0.1,
  plotit = TRUE,
  charge = FALSE,      # no charge calculation, so emass is not used
  emass = 0.00054858,
  algo = 1
)
############################



enviPat documentation built on Oct. 21, 2022, 5:06 p.m.