View source: R/preprocess_data.R
preprocess_spectra | R Documentation |
Performs smoothing, baseline removal and peak detection on MALDI samples. Isotopic peaks for a list of peptides are then extracted from the detected peaks.
preprocess_spectra(
  indir,
  metadata,
  make_plots = FALSE,
  peptides_user = NULL,
  smooth_wma_hws = 4,
  smooth_sg_hws = 6,
  iterations = 50,
  halfWindowSize = 20,
  snr = 2,
  k = 0L,
  threshold = 0.33,
  local_bg = FALSE,
  mass_range = 100,
  bg_cutoff = 0.5,
  l_cutoff = 1e-08,
  tolerance = 0.4,
  ppm = 50,
  n_isopeaks = 5,
  min_isopeaks = 4,
  ncores = NULL,
  chunk_size = 40
)
indir |
Folder containing spectra in mzML format. |
metadata |
Data frame with spectra metadata with at least |
smooth_wma_hws |
Half-window size for WeightedMovingAverage smoothing method |
smooth_sg_hws |
Half-window size for SavitzkyGolay smoothing method |
iterations |
Iterations parameter for baseline detection. |
halfWindowSize |
Half-window size parameter for local maximum detection. |
snr |
Signal-to-noise threshold above which peaks are considered |
k |
k parameter for |
threshold |
threshold parameter for |
local_bg |
Whether to further clean the peak lists by modelling the local
background noise. See MALDIzooMS::peaks_local_bg.
Ideally should work with a |
mass_range |
Mass window on both sides of a peak to be considered for background modelling |
bg_cutoff |
The peaks within the mass range with intensity below the |
l_cutoff |
Likelihood threshold or p-value. Peaks with a probability of being modelled as background noise higher than this are filtered out. |
tolerance |
Mass tolerance in Da between |
ppm |
Parts-per-million added to tolerance. See MsCoreUtils::closest |
n_isopeaks |
Number of isotopic peaks to pick. Default is 5, which is also the maximum permitted. |
min_isopeaks |
If fewer than min_isopeaks consecutive (about 1 Da apart) isotopic peaks are detected, the whole isotopic envelope is discarded. Default is 4. |
ncores |
Number of cores used by the Spectra::MsBackendMzR backend in Spectra::peaksData |
mono_masses |
Array with the peptides' monoisotopic masses |
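To illustrate how tolerance and ppm might combine when detected peaks are matched to theoretical masses, here is a minimal base R sketch. It assumes the accepted window is tolerance + ppm * mass / 1e6, following the "added to tolerance" wording above. The helpers match_window and match_peaks are hypothetical and are not part of MALDIzooMS or MsCoreUtils, and the masses are made-up values.

```r
# Hypothetical helper: absolute tolerance (Da) plus a mass-dependent ppm term.
match_window <- function(mass, tolerance = 0.4, ppm = 50) {
  tolerance + ppm * mass / 1e6
}

# Hypothetical helper: for each theoretical mass, return the index of the
# closest observed peak, or NA if no peak lies within the window.
match_peaks <- function(theoretical, observed, tolerance = 0.4, ppm = 50) {
  vapply(theoretical, function(m) {
    d <- abs(observed - m)
    i <- which.min(d)
    if (length(i) == 1L && d[i] <= match_window(m, tolerance, ppm)) {
      i
    } else {
      NA_integer_
    }
  }, integer(1))
}

theoretical <- c(1105.6, 2000.0)          # made-up theoretical masses
observed    <- c(1105.9, 1500.2, 2000.6)  # made-up detected peak m/z values
idx <- match_peaks(theoretical, observed)
```

With the defaults, the first mass is matched (0.3 Da away, inside the window), while the second is not, because 0.6 Da exceeds the window of 0.4 Da plus 50 ppm of 2000.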
The default peptides are the ones from Nair et al. (2022). The paper contains the details on the preprocessing procedure.
A list of data frames, one per sample. Each data frame has 3 columns (m/z, intensity and signal-to-noise ratio) for each of the n_isopeaks of each peptide. Missing peaks are NAs.
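As a rough illustration of the return value and of the min_isopeaks rule, the base R sketch below builds one made-up isotopic envelope and counts its longest run of consecutive detected peaks. The column names (mz, intensity, snr) and the helper longest_run are assumptions for illustration only; the actual column names and the exact consecutiveness check used by the package may differ.

```r
# Made-up isotopic envelope for one peptide with n_isopeaks = 5.
# The fourth peak was not detected, so its row is NA.
envelope <- data.frame(
  mz        = c(1105.58, 1106.58, 1107.59, NA, 1109.59),
  intensity = c(1200, 950, 480, NA, 60),
  snr       = c(35.1, 27.8, 14.0, NA, 2.3)
)

# Hypothetical helper: length of the longest run of consecutive detected peaks.
longest_run <- function(mz) {
  runs <- rle(!is.na(mz))
  if (!any(runs$values)) return(0L)
  max(runs$lengths[runs$values])
}

min_isopeaks <- 4
keep <- longest_run(envelope$mz) >= min_isopeaks

# Only 3 consecutive peaks are present here, so under this reading of the
# rule the whole envelope is discarded, i.e. every entry becomes NA.
if (!keep) envelope[] <- NA_real_
```

Treating a discarded envelope as all-NA rows is one plausible reading of "discarded" combined with "Missing peaks are NAs"; it is not confirmed by the documentation above.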
Nair, B. et al. (2022) ‘Parchment Glutamine Index (PQI): A novel method to estimate glutamine deamidation levels in parchment collagen obtained from low-quality MALDI-TOF data’, bioRxiv. doi:10.1101/2022.03.13.483627.