labelRows: Annotate and Remove Report Rows

View source: R/labelRows.R

labelRowsR Documentation

Annotate and Remove Report Rows


This is a method for annotating removable, conflicting, and identity-matched feature pair alignment (FPA) rows in the combinedTable report. Simple thresholds for score, rank, retention time error and delta score can computationally reduce the set of possible FPAs to the most likely feature matches. FPAs falling within some small delta score or mz/rt of the top-ranked pair are organized into subgroups to facilitate inspection. Automated reduction to 1-1 pairs is also possible with this function.

reduceTable behaves identically to labelRows, but with a focus on automated table reduction. Rank threshold defaults in reduceTable are also stricter than in labelRows.


  useID = FALSE,
  minScore = 0.5,
  maxRankX = 3,
  maxRankY = 3,
  delta = 0.1,
  method = c("score", "mzrt"),
  maxRTerr = 10,
  resolveConflicts = FALSE,
  rtOrder = TRUE,
  remove = FALSE,
  balanced = TRUE,
  brackets_ignore = c("(", "[", "{")

  useID = FALSE,
  maxRankX = 2,
  maxRankY = 2,
  minScore = 0.5,
  delta = 0.1,
  method = c("score", "mzrt"),
  maxRTerr = 10,
  rtOrder = TRUE,
  brackets_ignore = c("(", "[", "{")



Either a metabCombiner object or combinedTable


option to annotate identity-matched strings as "IDENTITY"


numeric minimum allowable score (between 0 & 1) for metabolomics feature pair alignments


integer maximum allowable rank for X dataset features.


integer maximum allowable rank for Y dataset features.


numeric score or mz/rt distances used to define subgroups. If method = "score", a value (between 0 & 1) score difference between a pair of conflicting FPAs. If method = "mzrt", a length 4 numeric: (m/z, rt, m/z, rt) tolerances, the first pair for X dataset features and the second pair for Y dataset features.


Conflict detection method. If equal to "score" (default), assigns a conflict subgroup if score of lower-ranking FPA is within some tolerance of higher-ranking FPA. If set to "mzrt", assigns a conflicting subgroup if within a small m/z & rt distance of the top-ranked FPA.


numeric maximum allowable error between model-projected retention time (rtProj) and observed retention time (rty)


logical option to computationally resolve conflicting rows to a final set of 1-1 feature pair alignments


logical. If resolveConflicts set to TRUE, then this imposes retention order consistency on rows deemed "RESOLVED" within subgroups.


Logical. Option to keep or discard rows deemed removable.


Logical. Optional processing of "balanced" groups, defined as groups with an equal number of features from input datasets where all features have a 1-1 match.


character. If useID = TRUE, bracketed identity strings of the types in this argument will be ignored


metabCombiner initially reports all possible feature pairings in the rows of the combinedTable report. Most of these are misalignments that require removal. This function is used to automate this reduction process by labeling rows as removable or conflicting, based on certain conditions, and is performed after computing similarity scores.

A label may take on one of four values:

a) "": No determination made b) "IDENTITY": an alignment with matching identity "idx & idy" strings c) "REMOVE": a row determined to be a misalignment d) "CONFLICT": competing alignments for one or multiple shared features

The labeling rules are as follows:

1) Groups determined to be 'balanced': label rows with rankX > 1 & rankY > 1 "REMOVE" irrespective of delta criteria 2) Rows with a score < minScore: label "REMOVE" 3) Rows with rankX > maxRankX and/or rankY > maxRankY: label "REMOVE" 4) Conflicting subgroup assignment as determined by method & delta arguments. Conflicting alignments following outside delta thresholds: labeled "REMOVE". Otherwise, they are assigned a "CONFLICT" label and subgroup number. 5) If useID argument set to TRUE, rows with matching idx & idy strings are labeled "IDENTITY". These rows are not changed to "REMOVE" or "CONFLICT" irrespective of subsequent criteria.


updated combinedTable or metabCombiner object. The table will have three new columns:


characterization of feature alignments as described


conflicting subgroup number of feature alignments


alternate subgroup for rows in multiple feature pair conflicts


#required steps prior to function use
p30 <- metabData(plasma30, samples = "CHEAR")
p20 <- metabData(plasma20, samples = "Red", rtmax = 17.25)
p.comb <- metabCombiner(xdata = p30, ydata = p20, binGap = 0.0075)
p.comb <- selectAnchors(p.comb, tolmz = 0.003, tolQ = 0.3, windy = 0.02)
p.comb <- fit_gam(p.comb, k = 20, iterFilter = 1)
p.comb <- calcScores(p.comb, A = 90, B = 14, C = 0.5)

##applies labels, but maintains all rows
p.comb <- labelRows(p.comb, maxRankX = 2, maxRankY = 2, maxRTerr = 0.5,
                    delta = 0.1, resolveConflicts = FALSE, remove = FALSE)

##automatically resolve conflicts and filter to 1-1 feature pairs
p.comb.2 <- labelRows(p.comb, resolveConflicts = FALSE, remove = FALSE)

#this is identical to the previous command
p.comb.2 <- reduceTable(p.comb)

p.comb <- labelRows(p.comb, method = "mzrt", delta = c(0.005, 0.5, 0.005,0.3))

##this function may be applied to combinedTable inputs as well
cTable <-, featData(p.comb))

lTable <- labelRows(cTable, maxRankX = 3, maxRankY = 2, minScore = 0.5,
         method = "score", maxRTerr = 0.5, delta = 0.2)

hhabra/Combiner documentation built on June 5, 2024, 10:56 a.m.