combine: Combining isotope, adduct and homologue series relations in...

View source: R/combine.R

combine    R Documentation

Combining isotope, adduct and homologue series relations in HRMS data sets.

Description

Combines the isotope pattern groups found by pattern.search with the adduct groups found by adduct.search into components, and attaches information on homologue series relations from homol.search. Includes some checks for component plausibility. Needs at least two of the following inputs: (1) isotope pattern relations, (2) adduct relations and (3) homologue series relations. Extracts the most intense peak per component, allowing for a comparison of components among HRMS data sets.

Individual components and the peak relations therein can then be plotted with plotcomp. Counts of the isotope m/z differences detected among components can be summarized with plotisotopes. Subsets of components and HRMS data can be selected interactively with ms.filter.

Usage

combine(pattern, adduct, homol = FALSE, rules = c(TRUE, TRUE, FALSE, FALSE), dont = FALSE)

Arguments

pattern

List of type pattern produced by pattern.search. If not used, set to FALSE.

adduct

List of type adduct produced by adduct.search. If not used, set to FALSE.

homol

List of type homol produced by homol.search. If not used, set to FALSE (default).

rules

Vector with four entries of TRUE or FALSE. See rules section.

dont

Numeric vector with one or several values between 1 and 4, to exclude components with particular warnings; if not used, set to FALSE. See details.
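The constraints listed for the arguments can be checked before calling combine. The following is a minimal sketch in base R; check_combine_args is a hypothetical helper and not part of the nontarget package, and it only tests what the argument descriptions state (at least two inputs, four logical rules, dont either FALSE or values from 1 to 4).

```r
# Hypothetical helper (not part of nontarget): checks the documented
# constraints on the arguments of combine() before calling it.
check_combine_args <- function(pattern, adduct, homol = FALSE,
                               rules = c(TRUE, TRUE, FALSE, FALSE),
                               dont = FALSE) {
  # An input counts as "used" unless it is the single value FALSE.
  used <- function(x) !identical(x, FALSE)
  n_inputs <- sum(used(pattern), used(adduct), used(homol))
  if (n_inputs < 2) {
    stop("supply at least two of pattern, adduct and homol")
  }
  if (!is.logical(rules) || length(rules) != 4 || anyNA(rules)) {
    stop("rules must be a logical vector with four entries")
  }
  if (used(dont) && (!is.numeric(dont) || !all(dont %in% 1:4))) {
    stop("dont must be FALSE or a numeric vector with values from 1 to 4")
  }
  invisible(TRUE)
}
```

A call such as check_combine_args(pattern, adduct, dont = c(1, 3)) returns TRUE invisibly, whereas check_combine_args(pattern, FALSE) stops because only one input is supplied.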

Details

The algorithm sorts relations among peaks in HRMS data sets generated by pattern.search, adduct.search and homol.search to components in a repetition of four consecutive steps.

  1. In a first step, and along decreasing peak intensities, individual peaks are checked for being part of an isotope pattern group and thus relatable to other peaks.

  2. In a second step, all peaks within this group from the first step are checked for being part of adduct groups, thus relating to more peaks. Step one and two should thereby lead to the full set of peaks defining a component.

  3. In a third step (check rules[1] = TRUE), all peaks in a component are checked for having adduct or isotope pattern relations to other interfering peaks not yet subsumed into the component, e.g., as a result of overlapping isotope pattern groups.

  4. In a fourth step, all peaks found for a component are, if available, related to any homologue series they may be part of.

Once assigned to a component, peaks take no further part in subsequent repetitions of step one to initiate a new component (except for interfering peaks, check rules[2] = TRUE). However, they may repeatedly be involved in steps two and three to reflect ambiguities of assigning components.
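Steps one and two can be illustrated with a toy sketch in base R. This is not the implementation used by combine: the peak table, its iso and add columns and the function build_components are assumptions made for illustration, each peak belongs to at most one isotope pattern group and one adduct group, and steps three and four as well as the plausibility checks are left out.

```r
# Toy peak table; the group columns are illustrative assumptions and not
# the format returned by pattern.search or adduct.search.
peaks <- data.frame(
  id        = 1:6,
  intensity = c(900, 300, 800, 100, 500, 50),
  iso       = c("a", "a", NA, NA, "b", "b"),  # isotope pattern group
  add       = c("x", NA, "x", NA, NA, NA),    # adduct group
  stringsAsFactors = FALSE
)

build_components <- function(peaks) {
  assigned   <- rep(FALSE, nrow(peaks))
  components <- list()
  # Visit candidate seed peaks along decreasing intensity.
  for (i in order(peaks$intensity, decreasing = TRUE)) {
    if (assigned[i]) next  # already in a component: cannot seed a new one
    # Step one: collect the isotope pattern group of the seed peak.
    members <- i
    if (!is.na(peaks$iso[i])) {
      members <- which(!is.na(peaks$iso) & peaks$iso == peaks$iso[i])
    }
    # Step two: extend by the adduct groups of all peaks collected so far.
    add_ids <- unique(peaks$add[members])
    add_ids <- add_ids[!is.na(add_ids)]
    members <- union(members, which(peaks$add %in% add_ids))
    assigned[members] <- TRUE
    components[[length(components) + 1]] <- sort(peaks$id[members])
  }
  components
}

comps <- build_components(peaks)
```

In this toy table, peak 1 seeds the first component, pulls in peak 2 through the shared isotope pattern group and peak 3 through the shared adduct group. Peak 3 is then skipped as a seed, so the next component starts at peak 5.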

Four plausibility checks are implemented, represented by warning indices 1 to 4.

  1. The first check tests whether the adduct relations found for the peaks grouped in steps one and two above are consistent. If ambiguous adduct relations (e.g. M+H<->M+K AND M+Na<->M+NH4) are found for at least one peak, warning 1 is attached to the component concerned.

  2. The second check tests whether variations in peak intensities within isotope pattern groups are consistent among the different adducts of the same component. This comparison has to account for uncertainty in peak intensities, which is set via argument inttol of pattern.search.

  3. The third check examines whether interfering peaks occur.

  4. The fourth check takes effect if a component consists of ambiguously merged isotope pattern groups (only relevant if several charges are used, see the use_charges argument in make.isos and the last of the rules in pattern.search).

These warning indices can then be used to exclude the affected components, using argument dont. For example, dont=c(1,3) excludes components with ambiguous adduct relations and components with interfering peaks from the final component list.
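The effect of dont can be pictured as a filter over per-component warning flags. The sketch below uses a toy table in base R; the warning_1 to warning_4 columns and the function drop_flagged are hypothetical and do not mirror how combine stores warnings in comp[[1]].

```r
# Toy component table; the warning_* columns are hypothetical and do not
# mirror the column layout of comp[[1]].
components <- data.frame(
  id        = 1:4,
  warning_1 = c(FALSE, TRUE,  FALSE, FALSE),  # ambiguous adduct relations
  warning_2 = c(FALSE, FALSE, FALSE, TRUE),   # inconsistent intensities
  warning_3 = c(FALSE, FALSE, TRUE,  FALSE),  # interfering peaks
  warning_4 = c(FALSE, FALSE, FALSE, FALSE)   # ambiguously merged groups
)

drop_flagged <- function(components, dont = FALSE) {
  if (identical(dont, FALSE)) return(components)  # nothing to exclude
  # Keep only components that carry none of the warnings listed in dont.
  flags <- as.matrix(components[, paste0("warning_", dont), drop = FALSE])
  components[rowSums(flags) == 0, , drop = FALSE]
}

kept <- drop_flagged(components, dont = c(1, 3))
```

With dont = c(1, 3), components 2 and 3 are dropped, while component 4 stays because its only warning has index 2.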

Value

List of type comp with 7 entries

comp[[1]]

Components. Data frame listing the individual components, with component IDs, the peak IDs concerned and warnings per row. The last columns list m/z, intensity and RT of the most intense peak in that component.

comp[[2]]

pattern peak list. Entry 1 of list of type pattern produced by pattern.search, i.e. pattern[[1]].

comp[[3]]

adduct peak list. Entry 1 of list of type adduct produced by adduct.search, i.e. adduct[[1]].

comp[[4]]

homologue list. Entry 1 of list of type homol produced by homol.search, i.e. homol[[1]].

comp[[5]]

Peaks in components. Vector of TRUE or FALSE, indicating if a peak in pattern[[1]] or adduct[[1]] is part of one or several component(s).

comp[[6]]

Summary.

comp[[7]]

Parameters.
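The entries of the returned list are accessed by position. As a minimal sketch, the hypothetical helper inspect_comp below (not part of nontarget) collects a few of them. It relies only on the entry order listed above and assumes one row per component in comp[[1]] and a logical vector in comp[[5]].

```r
# Hypothetical helper: relies only on the documented entry order of the
# comp list, not on any column names inside the entries.
inspect_comp <- function(comp) {
  stopifnot(is.list(comp), length(comp) == 7)
  list(
    n_components   = nrow(comp[[1]]),  # assumes one row per component
    n_peaks_in_any = sum(comp[[5]]),   # peaks that are part of a component
    summary        = comp[[6]],
    parameters     = comp[[7]]
  )
}
```

After comp <- combine(pattern, adduct, homol), inspect_comp(comp)$summary gives the same object as comp[[6]].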

Rules setting

rules[1]: TRUE = interfering peaks are annotated to a component (see details). Set to FALSE if peak matrix is too crowded, e.g., in the chromatographic dead time.

rules[2]: TRUE = interfering peaks already annotated to a component can also enter step one of the algorithm to check for their component as well (see details).

rules[3]: TRUE = remove single-peaked components.

rules[4]: TRUE = only list components that are part of one or several homologue series.
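Because the four entries of rules are positional, a small wrapper with named arguments can make calls easier to read. The sketch below is an assumption for illustration: make_rules and its argument names are not part of nontarget, and the entries follow the order in which the four rules are listed above.

```r
# Hypothetical helper (not part of nontarget): builds the rules vector
# from named flags, in the order in which the four rules are listed.
make_rules <- function(annotate_interfering = TRUE,
                       seed_from_interfering = TRUE,
                       drop_single_peaked = FALSE,
                       homologues_only = FALSE) {
  c(annotate_interfering, seed_from_interfering,
    drop_single_peaked, homologues_only)
}
```

The defaults reproduce the default of combine, c(TRUE, TRUE, FALSE, FALSE). A call such as combine(pattern, adduct, homol, rules = make_rules(homologues_only = TRUE)) would then restrict the list to components related to a homologue series.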

Warning

Do not combine adduct pattern groups and/or isotope pattern groups and/or homologue series information from (a) different peak lists or (b) the same peak list differently ordered. Beware of combining rules[1] = FALSE with rules[2] = FALSE: most peaks interfering in components will get lost.

Note

Component IDs are allocated in decreasing intensity order of the most intense peak per component, see section Value, comp[[1]]. In contrast, IDs of individual peaks refer to the order in which peaks are provided.

Setting the argument pattern to FALSE skips the first step in the algorithm; adduct groups are then only searched for a single peak along decreasing peak intensities. Setting the argument adduct to FALSE skips the second step in the algorithm; no adduct groups are then searched for.

Author(s)

Martin Loos

See Also

pattern.search pattern.search2 adduct.search homol.search plotisotopes plotcomp ms.filter

Examples



######################################################
# (0) Group for isotopologues, adducts & homologues  #
library(nontarget);
data(peaklist);
data(adducts);
data(isotopes);
iso<-make.isos(isotopes,
	use_isotopes=c("13C","15N","34S","37Cl","81Br","41K","13C","15N","34S","37Cl","81Br","41K"),
	use_charges=c(1,1,1,1,1,1,2,2,2,2,2,2))
pattern<-pattern.search(
  peaklist,
  iso,
  cutint=10000,
  rttol=c(-0.05,0.05),
  mztol=2,
  mzfrac=0.1,
  ppm=TRUE,
  inttol=0.2,
  rules=c(TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE),
  deter=FALSE,
  entry=50
);
adduct<-adduct.search(
  peaklist,
  adducts,
  rttol=0.05,
  mztol=3,
  ppm=TRUE,
  use_adducts=c("M+K","M+H","M+Na","M+NH4"),
  ion_mode="positive"
);
homol<-homol.search(
	peaklist,
	isotopes,	
	elements=c("C","H","O"),
	use_C=TRUE,
	minmz=5,
	maxmz=120,
	minrt=1,
	maxrt=2,
	ppm=TRUE,
	mztol=3.5,
	rttol=0.5,
	minlength=5,
	mzfilter=FALSE,
	vec_size=3E6,
	spar=.45,
	R2=.98,
	plotit=FALSE
)
##############################################################
# Combine these individual groups to components              #
##############################################################
# (1) Standard setting:                                      #
# Produce a component list, allowing for single-peaked       #
# components and with interfering peaks also listed as indi- #
# vidual components (with inputs pattern, adduct, homol):    #
comp <- combine(
	pattern,
	adduct,
	homol,
	dont = FALSE,
	rules = c(TRUE, TRUE, FALSE, FALSE)
);
comp[[6]];
##############################################################
# (2) Produce a list with those components related to a homo-#
# logue series only (requires inputs pattern,adduct,homol):  #
comp <- combine(
	pattern,
	adduct,
	homol,
	dont = FALSE,
	rules = c(TRUE, TRUE, FALSE, TRUE)
);
comp[[6]];
################################################################
# (3) Extract only components seeming plausible and containing #
# more than one peak per component, without homologue series   #
# information attached (with inputs pattern and adduct):       #
comp <- combine(
	pattern,
	adduct,
	homol = FALSE,
	dont = c(1, 2, 3),                  # drop components carrying warnings 1 to 3
	rules = c(TRUE, TRUE, TRUE, FALSE)  # remove single-peaked; no homologue filter
);
comp[[6]];
##############################################################



blosloos/nontarget documentation built on June 2, 2022, 3:53 p.m.