Dynamic programming algorithm for unsupervised detection of homologue series in LC-(HR)MS data.
peaklist |
Dataframe of picked LC-MS peaks with three numeric columns for (a) m/z, (b) intensity and (c) retention time, such as |
isotopes |
Dataframe |
elements |
FALSE or chemical elements in the changing units of the homologue series, e.g. c("C","H") for alkane chains. Used to restrict search. |
use_C |
For |
minmz |
Defines the lower limit of the m/z window to search homologue series peaks, relative to the m/z of the one peak to search from. Absolute m/z value [u]. |
maxmz |
Defines the upper limit of the m/z window to search homologue series peaks, relative to the m/z of the one peak to search from. Absolute m/z value [u]. |
minrt |
Defines the lower limit of the retention time (RT) window to look for other homologue peaks, relative to the RT of the one peak to search from, i.e., RT+minrt. For decreasing RT with increasing HS mass, use negative values of minrt. |
maxrt |
Defines the upper limit of the RT window to look for other homologue peaks, relative to the RT of the one peak to search from, i.e., RT+maxrt. See |
ppm |
Should |
mztol |
m/z tolerance setting: +/- value by which the m/z of a peak may vary from its expected value. If parameter |
rttol |
Retention time (RT) tolerance by which the RT differences of two adjacent peak pairs in a homologue series are allowed to differ. Units as given in column 3 of the peaklist argument, e.g. [min]. |
minlength |
Minimum number of peaks in a homologue series. |
mzfilter |
Vector of numerics to filter for homologue series with specific m/z differences of their repeating units, given the tolerances in |
vec_size |
Vector size. Ignore unless a relevant error message is printed (then try to increase size). |
mat_size |
Matrix size for recombining, multiple of input tuples. Ignore unless a relevant error message is printed (then try to increase size). |
R2 |
FALSE or 0<numeric<=1. Coefficient of determination for cubic smoothing spline fits of m/z versus retention time; homologue series with lower R2 are rejected. See |
spar |
Smoothing parameter, typically (but not necessarily) in (0,1]. See |
plotit |
Logical FALSE or 0<integer<5. Intermediate plots of nearest neighbour paths, spline fits of individual homologue series >= |
deb |
Debug returns, ignore. |
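The window and tolerance arguments above can be illustrated with a minimal base R sketch. It is not the package's internal code: the toy peak table, the helper names candidates and ppm_to_abs, and the reading of a ppm tolerance as relative to the peak's own m/z are all assumptions made for illustration.

```r
# Hypothetical toy peak table: m/z, intensity, retention time [min]
peaks <- data.frame(
  mz  = c(200.0000, 214.0157, 228.0313, 242.0470, 300.0000),
  int = c(1e5, 9e4, 8e4, 7e4, 5e4),
  rt  = c(5.0, 5.6, 6.2, 6.8, 1.0)
)

# Row indices of peaks that fall inside the m/z and RT windows,
# both taken relative to the seed peak (hypothetical helper).
candidates <- function(peaks, seed, minmz, maxmz, minrt, maxrt) {
  dmz <- peaks$mz - peaks$mz[seed]
  drt <- peaks$rt - peaks$rt[seed]
  which(dmz >= minmz & dmz <= maxmz & drt >= minrt & drt <= maxrt)
}

# Assumed conversion of a ppm tolerance to an absolute m/z tolerance [u].
ppm_to_abs <- function(mz, ppm) mz * ppm / 1e6

cand <- candidates(peaks, seed = 1, minmz = 5, maxmz = 120,
                   minrt = -2, maxrt = 2)
cand                      # peaks 2, 3 and 4; peak 5 elutes too early
tol <- ppm_to_abs(200, 3.5)
tol
```

Negative values of minrt widen the window towards earlier elution, which is what the minrt description suggests for series whose RT decreases with increasing mass.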
A dynamic programming approach is used to extract series of peaks that differ by constant m/z units and show smooth changes in retention time, within bounds of mass defect changes. First, a nearest neighbour path through a kd-tree representation of the data is used to extract all feasible peak triplets. These triplets are then combined into all plausible n-tuples in n-3 steps. At each such step, each newly formed n-tuple is checked for smooth changes of RT with increasing m/z of the homologues, using cubic splines and an R2-based threshold of the model fit.
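The triplet-then-extend idea can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package implementation: it replaces the kd-tree nearest neighbour search with a brute-force loop, omits the mass defect and spline checks, and the function names find_triplets and extend_tuples are hypothetical.

```r
# Hypothetical toy data: four homologues spaced by ~14.0157 u plus one outlier
mz <- c(200.0000, 214.0157, 228.0313, 242.0470, 300.0000)
rt <- c(5.0, 5.6, 6.2, 6.8, 1.0)

# Brute-force search for triplets (i, j, k) with equal m/z steps (within
# mztol) and similar RT steps (within rttol). Stands in for the kd-tree path.
find_triplets <- function(mz, rt, minmz, maxmz, mztol, rttol) {
  out <- list()
  n <- length(mz)
  for (i in seq_len(n)) for (j in seq_len(n)) {
    d1 <- mz[j] - mz[i]
    if (d1 < minmz || d1 > maxmz) next
    for (k in seq_len(n)) {
      d2 <- mz[k] - mz[j]
      if (abs(d2 - d1) > mztol) next
      if (abs((rt[k] - rt[j]) - (rt[j] - rt[i])) > rttol) next
      out[[length(out) + 1]] <- c(i, j, k)
    }
  }
  if (length(out) == 0) return(matrix(integer(0), ncol = 3))
  do.call(rbind, out)
}

# One extension step: append the last peak of a triplet to every tuple
# whose final two peaks equal the triplet's first two peaks.
extend_tuples <- function(tuples, triplets) {
  m <- ncol(tuples)
  out <- list()
  for (a in seq_len(nrow(tuples))) for (b in seq_len(nrow(triplets))) {
    if (all(tuples[a, (m - 1):m] == triplets[b, 1:2])) {
      out[[length(out) + 1]] <- c(tuples[a, ], triplets[b, 3])
    }
  }
  if (length(out) == 0) return(matrix(integer(0), ncol = m + 1))
  do.call(rbind, out)
}

tri  <- find_triplets(mz, rt, minmz = 5, maxmz = 120,
                      mztol = 0.001, rttol = 0.5)
quad <- extend_tuples(tri, tri)
tri
quad
```

On this toy input the two overlapping triplets merge into a single 4-tuple after one extension step, which mirrors the n-3 steps needed to reach an n-tuple.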
List of type homol with 6 entries
homol[[1]] |
|
homol[[2]] |
|
homol[[3]] |
|
homol[[4]] |
|
homol[[5]] |
|
homol[[6]] |
Ignore. List with superjacent HS IDs per group - for set |
The rttol argument of homol.search must not be mixed with that of pattern.search or pattern.search2.
Arguments isotopes and elements are needed to limit intermediate numbers of m/z differences to screen over, based on feasible changes in mass defect.
Similarly, intermediate numbers are also limited by the m/z and retention time windows defined by minmz/maxmz and minrt/maxrt/rttol, respectively.
These windows are always set relative to the individual m/z and RT values of the peaks to be searched from.
Overall, these parameters must be chosen carefully to avoid a combinatorial explosion of triplet m/z differences, leading to slow computation, memory problems or meaningless results.
Values for spar and R2 have to be adjusted for different chromatographic settings; the smoothing spline fits are used to eliminate homologue series candidates with erratic RT behaviour.
Spline fits at >=minlength can be viewed by plotit=2.
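How a spline-based R2 can separate smooth from erratic RT behaviour is sketched below with stats::smooth.spline. The helper spline_r2, the choice of RT as the response and m/z as the predictor, the toy series, and the spar value of 0.45 are assumptions for illustration; the package's internal fit may differ.

```r
# R2 of a cubic smoothing spline fit of RT against m/z (hypothetical helper).
spline_r2 <- function(mz, rt, spar) {
  fit  <- stats::smooth.spline(x = mz, y = rt, spar = spar)
  pred <- stats::predict(fit, x = mz)$y
  1 - sum((rt - pred)^2) / sum((rt - mean(rt))^2)
}

# Six hypothetical homologues spaced by 14.0157 u
mz <- 200 + 14.0157 * 0:5
smooth_rt  <- c(5.0, 5.6, 6.1, 6.5, 6.8, 7.0)  # RT rises steadily
erratic_rt <- c(5.0, 7.0, 5.2, 6.9, 5.1, 7.0)  # RT jumps back and forth

r2_smooth  <- spline_r2(mz, smooth_rt,  spar = 0.45)
r2_erratic <- spline_r2(mz, erratic_rt, spar = 0.45)

# A candidate series would be kept only if its R2 reaches the chosen threshold.
c(smooth = r2_smooth, erratic = r2_erratic)
```

A larger spar forces a stiffer curve, so erratic series lose R2 faster than smooth ones; this is why both values need tuning per chromatographic method.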
Peak IDs refer to the order in which peaks are provided. Different IDs exist for adduct groups, isotope pattern groups, grouped homologue series (HS) peaks and homologue series clusters. Yet other IDs exist for the individual components (see note section of combine).
Here, IDs of homologue series groups are given in the function outputs homol[[1]], homol[[3]] and homol[[6]], with one homologue series constituting one group of interrelated peaks.
Martin Loos
rm.sat
isotopes
peaklist
plothomol