calcScores: Compute Feature Similarity Scores
In hhabra/Combiner: Method for Combining LC-MS Metabolomics Feature Measurements

calcScores

R Documentation

Compute Feature Similarity Scores

Description

Calculates a pairwise similarity (between 0 & 1) between all grouped features in metabCombiner object. The similarity score calculation is described in scorePairs.

Usage

calcScores(
  object,
  A = 75,
  B = 10,
  C = 0.25,
  groups = NULL,
  fit = c("gam", "loess"),
  mzshift = FALSE,
  mzfit = mzfitParam(),
  useAdduct = FALSE,
  adduct = 1.25,
  usePPM = FALSE,
  brackets_ignore = c("(", "[", "{")
)

Arguments

`object`	metabCombiner object.
`A`	Numeric weight for penalizing m/z differences.
`B`	Numeric weight for penalizing differences between fitted & observed retention times
`C`	Numeric weight for differences in Q (abundance quantiles).
`groups`	integer. Vector of feature groups to score. If set to NULL (default), will compute scores for all feature groups.
`fit`	Character. Choice of fitted rt model, "gam" or "loess."
`mzshift`	Logical. If TRUE, shifts the m/z values (mzx) before scoring.
`mzfit`	List of parameters for shifting m/z values; see ?mzfitParam
`useAdduct`	logical. Option to penalize mismatches in (non-empty, non-bracketed) adduct column annotations.
`adduct`	numeric. If useAdduct is TRUE, divides score of mismatched, non-empty and non-bracked adduct column labels by this value.
`usePPM`	logical. Option to use relative (as opposed to absolute) m/z differences in score computations.
`brackets_ignore`	If useAdduct = TRUE, bracketed adduct character strings of these types will be ignored according to this argument

Details

This function updates the rtProj, score, rankX, and rankY columns in the combinedTable report. First, using the RT mapping model computed in the previous steps, rtx values are projected onto rty. Then similarity scores are calculated based on m/z, rt (rtProj vs rty), and Q differences, with multiplicative weight penalties A, B, and C.

If the datasets contain representative set of shared identities (idx = idy), evaluateParams provides some guidance on appropriate A, B, and C values to use. In testing, the best values for A should lie between 50 and 120, according to mass accuracy; if using ppm (usePPM = TRUE), the suggested range is between 0.01 and 0.05. B should be between 5 and 15 depending on fitting accuracy (higher if datasets processed under roughly identical conditions) ; C should vary between 0 and 1, depending on sample similarity. See examples below.

Some input datasets exhibit systematic m/z shifts

If using adduct information (useAdduct = TRUE), the score is divided by the numeric adduct argument if non-empty and non-bracketed adduct values do not match. Be sure that adduct annotations are accurate before using this functionality.

Value

metabCombiner object with updated combinedTable. rtProj column will contain fitted retention times determined from previously computed model; score will contain computed pairwise similarity scores of feature pairs; rankX & rankY are the integer ranks of scores for x & y features in descending order.

Examples


data(plasma30)
data(plasma20)

p30 <- metabData(plasma30, samples = "CHEAR")
p20 <- metabData(plasma20, samples = "Red", rtmax = 17.25)
p.comb <- metabCombiner(xdata = p30, ydata = p20, binGap = 0.0075)

p.comb <- selectAnchors(p.comb, tolmz = 0.003, tolQ = 0.3, windy = 0.02)
p.comb <- fit_gam(p.comb, k = 20, iterFilter = 1, family = "gaussian")

#example: moderate m/z deviation, accurate rt fit, high sample similarity
p.comb <- calcScores(p.comb, A = 90, B = 14, C = 0.8, useAdduct = FALSE,
         groups = NULL, fit = "gam", usePPM = FALSE)
cTable <- combinedTable(p.comb)  #to view results


#example 2: high m/z deviation, moderate rt fit, low sample similarity
p.comb <- calcScores(p.comb, A = 50, B = 8, C = 0.2)

#example 3: low m/z deviation, poor rt fit, moderate sample similarity
p.comb <- calcScores(p.comb, A = 120, B = 5, C = 0.5)

#example 4: using ppm for mass deviation; note different A value
p.comb <- calcScores(p.comb, A = 0.05, B = 14, C = 0.5, usePPM = TRUE)

#example 5: limiting to specific m/z groups 1-1000
p.comb <- calcScores(p.comb, A = 90, B = 14, C = 0.5, groups = seq(1,1000))

#example 6: using adduct information
p.comb <- calcScores(p.comb, A = 90, B = 14, C = 0.5, useAdduct = TRUE,
                     adduct = 1.25)

hhabra/Combiner documentation built on June 5, 2024, 10:56 a.m.