PreprocessMassSpectra: Pre-process mass spectra

View source: R/mass_spectra_preprocessing.R

PreprocessMassSpectra    R Documentation

Pre-process mass spectra

Description

Pre-process mass spectra. Pre-processing includes rounding/binning, sorting, and normalization.

Usage

PreprocessMassSpectra(
  msp_objs,
  bin_boundary = 0.649,
  remove_zeros = TRUE,
  max_intst = 999
)

Arguments

msp_objs

A list of nested lists. Each nested list is a mass spectrum and must contain at least three elements: (1) the name element (a string) - compound name (or short description); (2) the mz element (a numeric/integer vector) - m/z values of mass spectral peaks; (3) the intst element (a numeric/integer vector) - intensities of mass spectral peaks.

bin_boundary

A numeric value. The position of a bin boundary (it can be thought of as a 'rounding point'). The bin_boundary argument must satisfy 0.01 <= bin_boundary <= 0.99. The default value is 0.649. This value is used in the AMDIS software and is close to the optimal rounding rule proposed in our research (Khrisanfov, M.; Samokhin, A. A General Procedure for Rounding m/z Values in Low‐resolution Mass Spectra. Rapid Comm Mass Spectrometry 2022, 36 (11), e9294. https://doi.org/10.1002/rcm.9294).
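The effect of a bin boundary can be illustrated with a minimal base R sketch. It is not the package's code: the helper name round_mz_sketch is hypothetical, and the sketch assumes that an m/z value whose fractional part is at or above bin_boundary is assigned to the next integer (how the package treats a value lying exactly on the boundary is not stated here).

```r
# Hypothetical helper (not part of mssearchr): assign m/z values to nominal
# masses using a bin boundary instead of the usual 0.5 rounding point.
# Assumption: a fractional part >= bin_boundary rounds up, otherwise down.
round_mz_sketch <- function(mz, bin_boundary = 0.649) {
  stopifnot(is.numeric(mz), bin_boundary >= 0.01, bin_boundary <= 0.99)
  floor(mz + (1 - bin_boundary))
}

round_mz_sketch(c(34.96885, 36.96590))  # chlorine isotopes: 35 37
round_mz_sketch(c(500.40, 500.70))      # 500 501
round(500.60)                           # ordinary rounding gives 501
round_mz_sketch(500.60)                 # the 0.649 boundary keeps 500
```

The last two calls show where the two rules disagree: a fractional part between 0.5 and the bin boundary is rounded down rather than up.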

remove_zeros

A logical value. If TRUE, all m/z values with zero intensity are excluded from mass spectra. Note that the MS Search software treats every zero-intensity peak present in a mass spectrum as a 'trace peak'. As a result, the presence/absence of such peaks can influence the value of the match factor.

max_intst

A numeric value. The maximum intensity (i.e., intensity of the base peak) after normalization. The default value is 999 because it is used in some electron ionization mass spectral databases including NIST.

Details

Pre-processing includes the following steps:

  • Calculating a nominal mass spectrum. All floating point m/z values are rounded to integers, with the value of the bin_boundary argument used as the rounding point in place of 0.5. Intensities of peaks with identical m/z values are summed.

  • Intensities of mass spectral peaks are normalized to max_intst.

  • Intensities of mass spectral peaks are rounded to the nearest integer.

  • If the remove_zeros argument is TRUE, all zero-intensity peaks are removed from the mass spectrum.

  • The preprocessed attribute is added and set to TRUE for the respective mass spectrum.
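The steps above can be sketched for a single spectrum in base R. This is a hypothetical re-implementation for illustration only, not the mssearchr source: the function name preprocess_one_sketch is invented, the rounding rule (fractional part at or above bin_boundary rounds up) is an assumption, and storing the flag with attr() is one plausible reading of "the preprocessed attribute". The sketch also assumes the spectrum has at least one non-zero intensity.

```r
# Hypothetical sketch of the documented steps for ONE spectrum.
# Not the package's implementation; edge-case behaviour may differ.
preprocess_one_sketch <- function(msp, bin_boundary = 0.649,
                                  remove_zeros = TRUE, max_intst = 999) {
  # Step 1: nominal m/z (assumed rule: fractional part >= bin_boundary
  # rounds up), then sum intensities that share a nominal m/z.
  nominal_mz <- floor(msp$mz + (1 - bin_boundary))
  # tapply() groups by the sorted unique values, so peaks end up ordered by m/z.
  summed <- tapply(msp$intst, nominal_mz, sum)
  mz <- as.numeric(names(summed))
  intst <- as.numeric(summed)

  # Steps 2-3: scale the base peak to max_intst, then round to integers.
  intst <- round(intst / max(intst) * max_intst)

  # Step 4: optionally drop zero-intensity peaks.
  if (remove_zeros) {
    keep <- intst > 0
    mz <- mz[keep]
    intst <- intst[keep]
  }

  # Only mz and intst are replaced; other elements (e.g. name) are kept.
  msp$mz <- mz
  msp$intst <- intst

  # Step 5: mark the spectrum as pre-processed (assumed to be an R attribute).
  attr(msp, "preprocessed") <- TRUE
  msp
}

methane <- list(name = "Methane",
                mz = c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19),
                intst = c(0, 0, 25, 75, 155, 830, 999, 10, 0, 0))
res <- preprocess_one_sketch(methane)
res$mz  # 12 13 14 15 16 17
```

With remove_zeros = FALSE the same call would keep all ten m/z values, including the zero-intensity ones that MS Search would treat as trace peaks.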

Value

A list of nested lists. Each nested list is a mass spectrum. Only the mz and intst elements of each nested list are modified during the pre-processing step.

Examples

# Original mass spectra of chlorine and methane
msp_objs <- list(
  list(name = "Chlorine",
       mz = c(34.96885, 36.96590, 69.93771, 71.93476, 73.93181),
       intst = c(0.83 * c(100, 32), c(100, 63.99, 10.24))),
  list(name = "Methane",
       mz = c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19),
       intst = c(0, 0, 25, 75, 155, 830, 999, 10, 0, 0))
)
matrix(c(msp_objs[[1]]$mz, msp_objs[[1]]$intst), ncol = 2) # Chlorine
matrix(c(msp_objs[[2]]$mz, msp_objs[[2]]$intst), ncol = 2) # Methane

# Pre-processed mass spectra of chlorine and methane
pp_msp_objs <- PreprocessMassSpectra(msp_objs, remove_zeros = TRUE)
matrix(c(pp_msp_objs[[1]]$mz, pp_msp_objs[[1]]$intst), ncol = 2) # Chlorine
matrix(c(pp_msp_objs[[2]]$mz, pp_msp_objs[[2]]$intst), ncol = 2) # Methane


mssearchr documentation built on April 3, 2025, 8:28 p.m.