find_bench_peaks: find_bench_peaks

View source: R/LazyPeakIntegration_peaks.R

find_bench_peaks R Documentation

find_bench_peaks

Description

Takes the output of get_ROIs, then detects and filters peak candidates.

Usage

find_bench_peaks(
  files,
  Grps,
  CompCol_all,
  Min.PointsperPeak = 10,
  peak.spotting.factor = 0.001,
  Integration_baseL_factor = 0.1,
  plan = "multiprocess",
  Min.cor.w.main_adduct = 0.8,
  Min.cor.w.M0 = 0.85,
  Min.iso.count = 2,
  remove_isoab_outliers = TRUE,
  return_unsuc_searches = FALSE,
  max.rt.diff_sec = 20,
  max.mz.diff_ppm = 5,
  max_bias_area = 35,
  max_bias_height = 30,
  area_height_bias_diff = 30
)

Arguments

files

vector with file paths

Grps

data frame with two columns: one for file names without .mzML (sample_name) and one for their respective sample group affiliations (sample_group).

CompCol_all

output from function get_ROIs

Min.PointsperPeak

minimum number of points per peak for a peak to be considered

peak.spotting.factor

Height relative to the highest point of the EIC above which points are considered during peak detection, e.g. 0.001 corresponds to 0.1% of the maximum. This parameter is ignored when user.rtmin/user.rtmax are given in the CompCol_all table.

Integration_baseL_factor

relative peak height factor above which points are considered to be part of the peak. 0.1 corresponds to 10% of the peak maximum.

plan

see plan

Min.cor.w.main_adduct

Minimum Pearson correlation coefficient between main_adduct and other adducts for the other adducts to be retained.

Min.cor.w.M0

Minimum Pearson correlation coefficient between the most abundant isotopologue and lower isotopologues for the lower isotopologues to be retained.

Min.iso.count

Minimum number of isotopologues per compound to be kept in the final output. Has to be more than one.

remove_isoab_outliers

Should isotopologues be removed if they differ from predicted values by more than 35% (TRUE/FALSE)

return_unsuc_searches

Should unsuccessful searches be returned (TRUE/FALSE)

max.rt.diff_sec

maximum difference between user.rt and the position of the peak maximum, in seconds

max.mz.diff_ppm

maximum difference between intensity weighted mz of a peak and the calculated mz of the expected ion species in ppm

max_bias_area

maximal allowed area bias for isotopologues to be accepted

max_bias_height

maximal allowed height bias for isotopologues to be accepted

area_height_bias_diff

maximal allowed difference between height and area bias
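
As a rough illustration of the inputs described above, the following base R sketch builds a Grps data frame from a vector of file paths and shows what the two relative-height factors mean on a toy EIC. The file paths, group labels and intensity values are invented, and the thresholding lines only illustrate the stated defaults; they are not taken from the package's internal code.

```r
# Hypothetical file paths; replace with real .mzML files.
files <- c("data/blank_01.mzML",
           "data/sample_A_01.mzML",
           "data/sample_A_02.mzML")

# Grps: one row per file, with sample_name (file name without .mzML)
# and sample_group (group labels here are made up for illustration).
Grps <- data.frame(
  sample_name  = tools::file_path_sans_ext(basename(files)),
  sample_group = c("blank", "A", "A"),
  stringsAsFactors = FALSE
)

# Toy EIC intensities (invented) to illustrate the relative-height factors.
intensity <- c(0, 5, 50, 400, 1000, 600, 80, 8, 0)

# peak.spotting.factor = 0.001: points above 0.1% of the EIC maximum.
spotted <- intensity > 0.001 * max(intensity)

# Integration_baseL_factor = 0.1: points above 10% of the peak maximum.
in_peak <- intensity > 0.1 * max(intensity)

print(Grps)
print(which(spotted))
print(which(in_peak))
```

With these toy values, the lower factor keeps every non-zero point as a candidate, while the integration factor restricts the peak to the three points around the apex.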

Details

molecule: name of molecule

adduct: adduct type

isoab: theoretic relative abundance as predicted via enviPat

FileName: sample name

eic_mzmin: lowest mz value in extracted ROI

eic_mzmax: highest mz value in extracted ROI

formula: molecular formula

charge: ion charge

mz_ex: exact mass as predicted via enviPat

Grp: sample group

peaks.rtmin: peak start time (s)

peaks.rtmax: peak end time (s)

peaks.PpP: scans per peak

peaks.mz_accurate: peak mz calculated as intensity weighted average

peaks.mz_accuracy_abs: absolute mz accuracy as compared to mz_ex

peaks.mz_accuracy_ppm: relative mz accuracy as compared to mz_ex

peaks.mz_span_abs: absolute difference between the highest and lowest mz value recorded over chromatographic peak

peaks.mz_span_ppm: relative difference between the highest and lowest mz value recorded over chromatographic peak

peaks.mz_min: lowest mz value recorded over chromatographic peak

peaks.mz_max: highest mz value recorded over chromatographic peak

peaks.FW25M: chromatographic peak width at 25% of the peak maximum

peaks.FW50M: chromatographic peak width at 50% of the peak maximum

peaks.FW75M: chromatographic peak width at 75% of the peak maximum

peaks.data_rate: mean differences between scans (s)

peaks.rt_raw: position of the highest intensity (s)

peaks.zigZag_IDX: peak zigzag index as calculated in Zhang, W., Zhao, P.X. Quality evaluation of extracted ion chromatograms and chromatographic peaks in liquid chromatography/mass spectrometry-based metabolomics data. BMC Bioinformatics 15, S5 (2014). 10.1186/1471-2105-15-S11-S5 (R function taken from Chetnik,K. et al. (2020) MetaClean: a machine learning-based classifier for reduced false positive peak detection in untargeted LC–MS metabolomics data. Metabolomics, 16, 117.)

peaks.sharpness: peak sharpness as calculated in Zhang, W., Zhao, P.X. Quality evaluation of extracted ion chromatograms and chromatographic peaks in liquid chromatography/mass spectrometry-based metabolomics data. BMC Bioinformatics 15, S5 (2014). 10.1186/1471-2105-15-S11-S5 (R function taken from Chetnik,K. et al. (2020) MetaClean: a machine learning-based classifier for reduced false positive peak detection in untargeted LC–MS metabolomics data. Metabolomics, 16, 117.)

peaks.jaggedness: peak jaggedness as calculated in Eshghi,S.T. et al. Quality assessment and interference detection in targeted mass spectrometry data using machine learning. Clinical Proteomics. (R function taken from Chetnik,K. et al. (2020) MetaClean: a machine learning-based classifier for reduced false positive peak detection in untargeted LC–MS metabolomics data. Metabolomics, 16, 117.)

peaks.symmetry: peak symmetry as calculated in Eshghi,S.T. et al. Quality assessment and interference detection in targeted mass spectrometry data using machine learning. Clinical Proteomics. (R function taken from Chetnik,K. et al. (2020) MetaClean: a machine learning-based classifier for reduced false positive peak detection in untargeted LC–MS metabolomics data. Metabolomics, 16, 117.)

peaks.rt_neighbors: reports whether extending the integration boundaries to the left or right in the RT dimension increases the calculated chromatographic peak area by more than 20%

peaks.mz_neighbors: ratio between the chromatographic peak area and the chromatographic peak area calculated after increasing the width of the mz-extraction window by 4 * max.mz.diff_ppm (worst allowed mass accuracy)

peaks.height: highest intensity of the peak

peaks.area: chromatographic peak area

peaks.cor_w_M0: Pearson correlation coefficient between most abundant isotopologue (isoab = 100) and lower isotopologues

peaks.cor_w_main_add: Pearson correlation coefficient between most abundant isotopologue of main_adduct (isoab = 100) and most abundant isotopologues of other adducts of the same compound

peaks.manual_int: True if user.rtmin and user.rtmax were provided

Intensities.v: intensity vector of extracted chromatogram

RT.v: retention time vector of extracted chromatogram

ExpectedArea: predicted chromatographic peak area for lower isotopologues as calculated from most abundant isotopologue of the same molecule and adduct

ErrorRel_A: relative difference between ExpectedArea and peaks.area

ErrorAbs_A: absolute difference between ExpectedArea and peaks.area

ExpectedHeight: predicted chromatographic peak height for lower isotopologues as calculated from most abundant isotopologue of the same molecule and adduct

ErrorRel_H: relative difference between ExpectedHeight and peaks.height

ErrorAbs_H: absolute difference between ExpectedHeight and peaks.height

isoab_ol: true if isotopologue abundance is considered to be too far off as compared to predicted isoab

Iso_count: isotopologue count per file, molecule and adduct

samples_per_group: number of samples per group

iso_id: id for specific isotopologue

rt_raw_span: Max RT difference within a given isotopologue of a given molecule and adduct
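
To make a few of the peak variables above more concrete, here is a minimal base R sketch that computes analogues of peaks.mz_accurate, peaks.mz_accuracy_ppm, peaks.FW50M, peaks.area, peaks.data_rate and peaks.rt_raw for a toy peak. All numbers are invented. The helper width_at, the trapezoidal integration, the use of abs() for the accuracy, and the width taken between the outermost points above the threshold (without interpolation) are assumptions made for illustration and may differ from how the package computes these values.

```r
# Toy chromatographic peak (all values invented for illustration).
rt        <- c(100, 101, 102, 103, 104, 105, 106)   # retention time (s)
intensity <- c(0, 200, 600, 1000, 600, 200, 0)
mz        <- c(200.1001, 200.1002, 200.1000, 200.1001,
               200.1000, 200.1002, 200.1001)
mz_ex     <- 200.1000                               # hypothetical expected m/z

# Intensity-weighted m/z and its deviation from mz_ex.
# Taking the absolute value is an assumption about the sign convention.
mz_accurate     <- weighted.mean(mz, w = intensity)
mz_accuracy_abs <- abs(mz_accurate - mz_ex)
mz_accuracy_ppm <- mz_accuracy_abs / mz_ex * 1e6

# Hypothetical helper: width between the outermost points at or above
# a given fraction of the peak maximum (no interpolation).
width_at <- function(rt, intensity, frac) {
  above <- which(intensity >= frac * max(intensity))
  rt[max(above)] - rt[min(above)]
}
fw50m <- width_at(rt, intensity, 0.50)

# Peak area via the trapezoidal rule (assumed integration method).
area <- sum(diff(rt) * (head(intensity, -1) + tail(intensity, -1)) / 2)

# Mean time between scans and position of the highest intensity.
data_rate <- mean(diff(rt))
rt_raw    <- rt[which.max(intensity)]

print(c(mz_accuracy_ppm = mz_accuracy_ppm, fw50m = fw50m, area = area,
        data_rate = data_rate, rt_raw = rt_raw))
```

A peak like this one would pass the default max.mz.diff_ppm = 5, since its intensity-weighted m/z deviates from mz_ex by well under 1 ppm.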

Value

data table with peak variables extracted from found peaks.
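
Assuming the returned table carries the columns listed under Details, the sketch below shows on a mock data frame how ExpectedArea, ErrorAbs_A, ErrorRel_A, isoab_ol and Iso_count relate to each other. The rows are invented, expressing the relative error in percent is an assumption, and the order in which the package applies its filters is not reproduced here.

```r
# Mock rows standing in for the returned table (all values invented).
peaks <- data.frame(
  molecule   = c("M1", "M1", "M1", "M2"),
  adduct     = c("M+H", "M+H", "M+H", "M+H"),
  FileName   = c("s1", "s1", "s1", "s1"),
  isoab      = c(100, 10, 1, 100),
  peaks.area = c(1e6, 1.1e5, 2e4, 5e5),
  stringsAsFactors = FALSE
)

# One group per file, molecule and adduct.
key <- interaction(peaks$FileName, peaks$molecule, peaks$adduct, drop = TRUE)

# Area of the most abundant isotopologue (isoab = 100) within each group.
area_M0 <- ave(peaks$peaks.area * (peaks$isoab == 100), key, FUN = max)

# Expected area of each isotopologue, scaled by its predicted abundance.
peaks$ExpectedArea <- area_M0 * peaks$isoab / 100
peaks$ErrorAbs_A   <- peaks$peaks.area - peaks$ExpectedArea
peaks$ErrorRel_A   <- 100 * peaks$ErrorAbs_A / peaks$ExpectedArea  # percent (assumed)

# Flag isotopologues deviating by more than 35% from the prediction.
peaks$isoab_ol <- abs(peaks$ErrorRel_A) > 35

# Isotopologue count per file, molecule and adduct.
peaks$Iso_count <- ave(rep(1, nrow(peaks)), key, FUN = sum)

# Keep only compounds with at least Min.iso.count = 2 isotopologues.
kept <- peaks[peaks$Iso_count >= 2, ]
print(kept)
```

In this mock table the third isotopologue of M1 is flagged as an outlier because its area is twice the predicted value, and M2 is dropped because only one isotopologue was found.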


YasinEl/mzRAPP documentation built on Feb. 18, 2024, 11:49 a.m.