assign_formulas: Molecular Formula Assignment

View source: R/assign_formulas.R

assign_formulas    R Documentation

Molecular Formula Assignment

Description

Assigns molecular formulas to molecular masses using a predefined library. The input peaklist (pl) is checked internally with as_peaklist(), converted to neutral masses with calc_neutral_mass(), and assigned molecular formulas within the mass accuracy (ma_dev) supplied to calc_ma_abs(). The input can be either:

  • A peaklist (data.table) containing m/z values or neutral masses and additional metadata.

  • A numeric vector of m/z values or neutral masses without additional metadata (internally checked and standardized by as_peaklist()).

Usage

assign_formulas(pl, formula_library, verbose = FALSE, ...)

Arguments

pl

Either a peaklist (data.table) with at least columns mz, i_magnitude, and file_id, or a numeric vector of masses. For numeric input, a minimal peaklist is constructed internally.

formula_library

Molecular formula library: a predefined data.table used for assigning molecular formulas to a peak list and for mass calibration. The library requires a fixed format, including mass values for matching. Predefined libraries are available in the R package ume.formulas and further described in Leefmann et al. (2019). A standard library for marine dissolved organic matter is ume.formulas::lib_02. New libraries can be built using ume::create_ume_formula_library().

verbose

logical; if TRUE, show progress messages.

...

Arguments passed on to calc_ma_abs, calc_neutral_mass

m

Measured mass

ma_dev

Mass accuracy in +/- parts per million (ppm)

mz

Numeric vector of m/z values (> 0).

pol

Character: "neg", "pos", or "neutral".
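
As a rough illustration of how pol and ma_dev might enter the calculation, the sketch below converts m/z values to neutral masses and turns a ppm tolerance into an absolute window in Da. The helper names neutral_mass_sketch() and ppm_to_abs_sketch() are hypothetical, and the restriction to singly charged ions formed by loss or gain of one proton is an assumption; calc_neutral_mass() and calc_ma_abs() may treat charge states and adducts differently.

```r
# Hypothetical helpers for illustration only; not part of ume.
# Assumption: singly charged ions, [M-H]- in "neg" and [M+H]+ in "pos" mode.
proton_mass <- 1.007276467  # mass of a proton in Da

neutral_mass_sketch <- function(mz, pol = c("neg", "pos", "neutral")) {
  pol <- match.arg(pol)
  stopifnot(is.numeric(mz), all(mz > 0))
  switch(pol,
         neg     = mz + proton_mass,  # add the lost proton back
         pos     = mz - proton_mass,  # remove the added proton
         neutral = mz)                # already a neutral mass
}

# Convert a relative tolerance (ppm) into an absolute tolerance (Da).
ppm_to_abs_sketch <- function(m, ma_dev) {
  m * ma_dev / 1e6
}

m <- neutral_mass_sketch(c(200, 400), pol = "neg")
ppm_to_abs_sketch(m, ma_dev = 0.5)
```

Because the window scales with mass, a fixed ma_dev in ppm allows a wider absolute tolerance for heavier peaks.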

Details

This function calculates the neutral mass of each peak in pl and compares it to the mass values in formula_library, assigning every molecular formula whose mass lies within the mass accuracy threshold (ma_dev). If the library lacks 13C, 15N, or 34S isotope columns, these columns are added to the output table.
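
The matching step can be pictured with a small base R sketch. It is not the ume implementation: the function match_formulas_sketch(), the three-entry toy library, and the choice to express the ppm error relative to the library mass (measured minus library) are all assumptions made for illustration.

```r
# Hypothetical sketch of tolerance-based matching; not the ume implementation.
# Toy library: formulas with their monoisotopic neutral masses (Da).
lib <- data.frame(
  mf   = c("C6H12O6", "C7H6O2", "C9H8O4"),
  mass = c(180.063388, 122.036779, 180.042259),
  stringsAsFactors = FALSE
)

match_formulas_sketch <- function(m, lib, ma_dev) {
  hits <- lapply(seq_along(m), function(i) {
    # Signed error of the measured mass relative to each library mass, in ppm
    ppm  <- (m[i] - lib$mass) / lib$mass * 1e6
    keep <- abs(ppm) <= ma_dev
    n    <- sum(keep)
    data.frame(peak  = rep(i, n),
               m     = rep(m[i], n),
               mf    = lib$mf[keep],
               m_cal = lib$mass[keep],
               ppm   = ppm[keep],
               stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # peaks without any hit contribute zero rows
}

# Neutral masses of three peaks; the third has no library entry nearby.
m   <- c(180.06340, 122.03678, 150.00000)
res <- match_formulas_sketch(m, lib, ma_dev = 1)
res
```

Comparing every peak with every library entry is fine for a toy example; for a full formula library a sorted lookup or a range join would scale better.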

Value

A data.table where each row represents a molecular formula assigned to a mass peak. The table contains:

  • All columns of the input peaklist pl (e.g. mz, i_magnitude, file_id).

  • All columns of the input formula_library (e.g. mf, element counts).

  • Calculated columns:

    • m — neutral mass.

    • m_cal — exact mass of the assigned formula.

    • del — absolute mass error (Da).

    • ppm — mass error in parts per million.

    • mf_id — unique ID for each (file_id, mf) combination.

  • Added isotope columns (13C, 15N, 34S) if missing in the library.

One peak may receive zero, one, or multiple assigned formulas depending on the mass accuracy threshold.
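
A quick way to see how many formulas each peak received is to count result rows per input peak. The sketch below uses a made-up result table: only the column names mz and mf follow the description above, while the values and the placeholder formula labels are hypothetical.

```r
# Toy data with made-up values; only the column names mz and mf are taken
# from the documentation above.
peaks <- c(180.0561, 122.0295, 150.0000)  # hypothetical input m/z values

res <- data.frame(
  mz = c(180.0561, 180.0561, 122.0295),
  mf = c("formula_a", "formula_b", "formula_c"),  # placeholder labels
  stringsAsFactors = FALSE
)

# Count assigned formulas per input peak. Peaks without an assignment have
# no rows in the result, so the count is taken against the input peaks.
# Exact comparison is safe here because the result values are copies of
# the input values.
n_formulas <- vapply(peaks, function(p) sum(res$mz == p), integer(1))

data.frame(mz = peaks, n_formulas = n_formulas)
```

Peaks with a count above one are ambiguous at the chosen tolerance; lowering ma_dev reduces such cases, at the risk of leaving more peaks without an assignment.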

Author(s)

Boris P. Koch

Examples

# Example using demo data
assign_formulas(pl = peaklist_demo,
                formula_library = ume::lib_demo,
                pol = "neg",
                ma_dev = 0.2,
                verbose = FALSE)

ume documentation built on Dec. 13, 2025, 1:06 a.m.