annotate_metabolites_mass_dataset: Annotate Metabolites in a mass_dataset Object

View source: R/12_annotate_metabolites_mass_dataset.R

annotate_metabolites_mass_dataset    R Documentation

Annotate Metabolites in a mass_dataset Object

Description

This function performs metabolite annotation for a 'mass_dataset' object based on MS1 and MS2 data. It matches the mass-to-charge ratio (m/z), retention time (RT), and MS2 spectra with a reference database to identify potential metabolites.

Usage

annotate_metabolites_mass_dataset(
  object,
  ms1.match.ppm = 25,
  ms2.match.ppm = 30,
  mz.ppm.thr = 400,
  ms2.match.tol = 0.5,
  fraction.weight = 0.3,
  dp.forward.weight = 0.6,
  dp.reverse.weight = 0.1,
  remove_fragment_intensity_cutoff = 0,
  rt.match.tol = 30,
  polarity = c("positive", "negative"),
  ce = "all",
  column = c("rp", "hilic"),
  ms1.match.weight = 0.25,
  rt.match.weight = 0.25,
  ms2.match.weight = 0.5,
  total.score.tol = 0.5,
  candidate.num = 3,
  database,
  threads = 3
)

Arguments

object

A 'mass_dataset' object that contains MS1 and MS2 data.

ms1.match.ppm

A numeric value specifying the mass accuracy threshold for MS1 matching in parts per million (ppm). Defaults to '25'.

ms2.match.ppm

A numeric value specifying the mass accuracy threshold for MS2 (fragment ion) matching in ppm. Defaults to '30'.

mz.ppm.thr

A numeric value specifying the m/z threshold in ppm for matching MS1 and MS2. Defaults to '400'.

ms2.match.tol

A numeric value specifying the tolerance for MS2 fragment ion matching. Defaults to '0.5'.

fraction.weight

A numeric value specifying the weight for the MS2 fragmentation score. Defaults to '0.3'.

dp.forward.weight

A numeric value specifying the weight for the forward dot product in MS2 matching. Defaults to '0.6'.

dp.reverse.weight

A numeric value specifying the weight for the reverse dot product in MS2 matching. Defaults to '0.1'.

remove_fragment_intensity_cutoff

A numeric value specifying the intensity cutoff for removing fragments in MS2 matching. Defaults to '0'.

rt.match.tol

A numeric value specifying the retention time matching tolerance in seconds. Defaults to '30'.

polarity

A character string specifying the ionization mode. It can be either '"positive"' or '"negative"'. Defaults to '"positive"'.

ce

A character string specifying the collision energy for MS2 matching. Defaults to '"all"'.

column

A character string specifying the chromatographic column type, either '"rp"' (reversed-phase) or '"hilic"'. Defaults to '"rp"'.

ms1.match.weight

A numeric value specifying the weight of MS1 matching in the total score calculation. Defaults to '0.25'.

rt.match.weight

A numeric value specifying the weight of RT matching in the total score calculation. Defaults to '0.25'.

ms2.match.weight

A numeric value specifying the weight of MS2 matching in the total score calculation. Defaults to '0.5'.

total.score.tol

A numeric value specifying the threshold for the total score. Defaults to '0.5'.

candidate.num

A numeric value specifying the number of top candidates to retain per feature. Defaults to '3'.

database

A 'databaseClass' object containing the reference spectral database for annotation.

threads

An integer specifying the number of threads to use for parallel processing. Defaults to '3'.
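The three MS2 weights ('dp.forward.weight', 'dp.reverse.weight', 'fraction.weight') default to values that sum to 1, which suggests they are combined into a single MS2 score. The sketch below shows one plausible way to do that as a weighted sum. The linear form, the helper name 'combine_ms2_score', and the input values are assumptions for illustration; only the argument names and defaults come from this page, and the package's actual computation may differ.

```r
# Hypothetical helper: combine three MS2 sub-scores (each assumed to lie in [0, 1])
# into one value. The weighted-sum form is an assumption, not the package's
# confirmed implementation.
combine_ms2_score <- function(dp_forward,
                              dp_reverse,
                              fraction,
                              dp.forward.weight = 0.6,
                              dp.reverse.weight = 0.1,
                              fraction.weight = 0.3) {
  # The sketch requires the weights to sum to 1 so the result stays in [0, 1]
  weight_sum <- dp.forward.weight + dp.reverse.weight + fraction.weight
  stopifnot(abs(weight_sum - 1) < 1e-8)

  dp.forward.weight * dp_forward +
    dp.reverse.weight * dp_reverse +
    fraction.weight * fraction
}

# Illustrative sub-scores, not taken from real spectra
ms2_score <- combine_ms2_score(dp_forward = 0.8,
                               dp_reverse = 0.9,
                               fraction = 0.5)
ms2_score
```

With the default weights, the forward dot product dominates the result, so a change to 'dp.forward.weight' has the largest effect on this illustrative score.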

Details

This function uses both MS1 and MS2 data (if available) to identify metabolites by matching experimental features against a reference spectral database. If no MS2 data are available, only m/z and RT are used for matching. The matching process is controlled by parameters such as 'ms1.match.ppm', 'ms2.match.ppm', and 'rt.match.tol', together with the weighting factors.
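To make the MS1 and RT tolerances concrete, the following base R sketch filters candidate compounds by relative m/z error (in ppm) and absolute RT difference (in seconds). The helpers 'ppm_error' and 'within_tolerance' and the candidate values are hypothetical. The ppm formula is the conventional one; the package may compute the error differently (for example, by taking 'mz.ppm.thr' into account).

```r
# Hypothetical helper: conventional relative m/z error in parts per million
ppm_error <- function(mz_observed, mz_reference) {
  abs(mz_observed - mz_reference) / mz_reference * 1e6
}

# Hypothetical helper: TRUE where a candidate lies within both tolerances.
# Defaults mirror 'ms1.match.ppm' and 'rt.match.tol' from the Usage section.
within_tolerance <- function(mz_observed,
                             rt_observed,
                             mz_reference,
                             rt_reference,
                             ms1.match.ppm = 25,
                             rt.match.tol = 30) {
  ppm_error(mz_observed, mz_reference) <= ms1.match.ppm &
    abs(rt_observed - rt_reference) <= rt.match.tol
}

# Illustrative candidates, not taken from any real database
candidates <- data.frame(
  name = c("candidate_A", "candidate_B", "candidate_C"),
  mz = c(180.0634, 180.0700, 180.0636),
  rt = c(125, 130, 200),
  stringsAsFactors = FALSE
)

# One measured feature (m/z and RT in seconds)
feature_mz <- 180.0630
feature_rt <- 120

candidates$keep <- within_tolerance(feature_mz, feature_rt,
                                    candidates$mz, candidates$rt)
candidates
```

Here only 'candidate_A' passes both checks: 'candidate_B' is roughly 39 ppm away in m/z, and 'candidate_C' elutes 80 seconds later than the feature.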

The function supports both positive and negative ionization modes and allows for fine-tuning of the matching process with customizable thresholds and weights. The number of top candidates to retain per feature can be controlled with 'candidate.num'.
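As a rough illustration of how the score weights, 'total.score.tol', and 'candidate.num' could interact, the sketch below computes a weighted total score per candidate, drops candidates below the threshold, and keeps the top-ranked ones. The weighted-sum form, the helper 'rank_candidates', and the score table are assumptions for illustration rather than the package's documented algorithm.

```r
# Hypothetical per-candidate scores for one feature, each assumed to lie in [0, 1]
scores <- data.frame(
  name = c("cand_1", "cand_2", "cand_3", "cand_4", "cand_5"),
  ms1_score = c(0.90, 0.80, 0.40, 0.95, 0.60),
  rt_score = c(0.80, 0.60, 0.30, 0.90, 0.80),
  ms2_score = c(0.70, 0.90, 0.20, 0.30, 0.80),
  stringsAsFactors = FALSE
)

# Hypothetical helper: weighted total score, threshold filter, top-N selection.
# Defaults mirror the Usage section; the combination rule is an assumption.
rank_candidates <- function(scores,
                            ms1.match.weight = 0.25,
                            rt.match.weight = 0.25,
                            ms2.match.weight = 0.5,
                            total.score.tol = 0.5,
                            candidate.num = 3) {
  scores$total_score <-
    ms1.match.weight * scores$ms1_score +
    rt.match.weight * scores$rt_score +
    ms2.match.weight * scores$ms2_score

  # Drop candidates below the total score threshold
  kept <- scores[scores$total_score >= total.score.tol, , drop = FALSE]

  # Sort best first and keep at most 'candidate.num' rows
  kept <- kept[order(kept$total_score, decreasing = TRUE), , drop = FALSE]
  head(kept, candidate.num)
}

top <- rank_candidates(scores)
top
```

In this example 'cand_3' falls below the threshold and 'cand_4' is cut by 'candidate.num', so raising 'candidate.num' to 4 would retain it while lowering 'total.score.tol' alone would not.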

Value

A 'mass_dataset' object with an updated annotation table containing the metabolite identification results.

Author(s)

Xiaotao Shen <xiaotao.shen@outlook.com>

Examples

## Not run: 
library(massdataset)
library(metid)
library(magrittr)
library(dplyr)
ms1_data =
  readr::read_csv(file.path(
    system.file("ms1_peak", package = "metid"),
    "ms1.peak.table.csv"
  ))

ms1_data = data.frame(ms1_data, sample1 = 1, sample2 = 2)

expression_data = ms1_data %>%
  dplyr::select(-c(name:rt))

variable_info =
  ms1_data %>%
  dplyr::select(name:rt) %>%
  dplyr::rename(variable_id = name)

sample_info =
  data.frame(
    sample_id = colnames(expression_data),
    injection.order = c(1, 2),
    class = c("Subject", "Subject"),
    group = c("Subject", "Subject")
  )
rownames(expression_data) = variable_info$variable_id

object = create_mass_dataset(
  expression_data = expression_data,
  sample_info = sample_info,
  variable_info = variable_info
)

object

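# Load the example reference database from the metid package
data("snyder_database_rplc0.0.3", package = "metid")

database = snyder_database_rplc0.0.3

# Annotate using the default tolerances and weights
object1 =
  annotate_metabolites_mass_dataset(object = object,
                                    database = database)

# Inspect the first rows of the resulting annotation table
head(extract_annotation_table(object1))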

## End(Not run)


tidymass/metid documentation built on Sept. 14, 2024, 4:43 p.m.