View source: R/preprocess_data.R
preprocess_spectra | R Documentation |
Performs smoothing, baseline removal and peak detection on MALDI samples. Isotopic peaks for a list of peptides are then extracted from the detected peaks.
preprocess_spectra(
indir = NULL,
metadata = NULL,
mzml_files = NULL,
spectrum_name_file = FALSE,
sps_mzr = NULL,
make_plots = FALSE,
peptides_user = NULL,
smooth_wma_hws = 4,
smooth_sg_hws = 6,
iterations = 50,
halfWindowSize = 20,
snr = 2,
k = 0L,
threshold = 0.33,
local_bg = FALSE,
mass_range = 100,
bg_cutoff = 0.5,
l_cutoff = 1e-08,
tolerance = 0.4,
ppm = 50,
n_isopeaks = 5,
min_isopeaks = 4,
norm_func = NULL,
q2e = NULL,
ncores = NULL,
chunk_size = 40,
verbose = FALSE
)
indir |
Folder containing spectra in mzML format. |
metadata |
Data frame with spectra metadata with at least |
mzml_files |
Paths to mzML files |
sps_mzr |
Spectra object |
smooth_wma_hws |
Half-window size for WeightedMovingAverage smoothing method |
smooth_sg_hws |
Half-window size for SavitzkyGolay smoothing method |
iterations |
Iterations parameter for baseline detection. |
halfWindowSize |
Half-window size parameter for local maximum detection. |
snr |
Signal-to-noise threshold above which peaks are considered |
k |
k parameter for |
threshold |
threshold parameter for |
local_bg |
Whether to further clean the peak lists by modelling the local
background noise. See MALDIzooMS::peaks_local_bg.
Ideally should work with a |
mass_range |
Mass window on both sides of a peak to be considered for background modelling |
bg_cutoff |
The peaks within the mass range with intensity below the |
l_cutoff |
Likelihood threshold or p-value. Peaks with a probability of being modelled as background noise higher than this are filtered out. |
tolerance |
Mass tolerance in Da between |
ppm |
Parts-per-million added to tolerance. See MsCoreUtils::closest |
n_isopeaks |
Number of isotopic peaks to pick. The default is 5, which is also the maximum permitted. |
min_isopeaks |
If fewer than min_isopeaks consecutive (about 1 Da apart) isotopic peaks are detected, the whole isotopic envelope is discarded. Default is 4. |
norm_func |
Function to normalize the isotopic distribution |
q2e |
If provided, it adds the theoretical isotopic distribution of peptides with this extent of deamidation |
ncores |
Number of cores used by the Spectra::MsBackendMzR backend in Spectra::peaksData |
spectrum_name_file |
If mzml_files are provided, whether to use the file names as spectra names. Otherwise, it is assumed that the spectra IDs are in the mzML files' headers. |
mono_masses |
Array with the peptides' monoisotopic masses |
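The tolerance and ppm arguments together define how far a detected peak may sit from a theoretical mass. The following base R sketch assumes the window is simply tolerance plus the ppm fraction of the mass, which matches the description of ppm as being added to tolerance. The helper match_window and the example masses are hypothetical and are not part of the package.

```r
# Hypothetical helper: half-width (in Da) of the matching window around a mass.
# Assumes window = tolerance + mz * ppm * 1e-6.
match_window <- function(mz, tolerance = 0.4, ppm = 50) {
  tolerance + mz * ppm * 1e-6
}

# Hypothetical theoretical mass and detected peak positions (m/z).
target <- 1105.6
peaks <- c(1104.9, 1105.9, 1106.3)

# A detected peak is a candidate match if it falls inside the window.
window <- match_window(target)
is_match <- abs(peaks - target) <= window
```

With the default values the absolute tolerance dominates: at m/z 1105.6 the ppm term adds only about 0.055 Da to the 0.4 Da tolerance.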
Provide the input data either using metadata and indir, or provide paths with mzml_files. You can also provide a Spectra object directly in sps_mzr.
If data is provided using more than one of these options, sps_mzr is used first, and then mzml_files.
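As a minimal sketch of the mzml_files route, the call below uses only arguments listed in the usage section. It assumes that preprocess_spectra is exported by the MALDIzooMS package; the folder path is a placeholder, and the call is skipped when the package is not installed or no files are found.

```r
# Hypothetical folder; replace with a directory that holds your mzML files.
mzml_dir <- "path/to/mzml"
files <- list.files(mzml_dir, pattern = "\\.mzML$", full.names = TRUE)

# Only run when the package is available and files were actually found.
if (requireNamespace("MALDIzooMS", quietly = TRUE) && length(files) > 0) {
  peaks <- MALDIzooMS::preprocess_spectra(
    mzml_files = files,
    spectrum_name_file = TRUE,  # use file names as spectra names
    n_isopeaks = 5,
    min_isopeaks = 4,
    ncores = 1
  )
  length(peaks)  # one element per sample
}
```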
The default peptides are the ones from Nair et al. (2022). The paper contains the details on the preprocessing procedure.
A list of data frames, one per sample. Each data frame has 3 columns (m/z, intensity and signal-to-noise ratio) for each of the n_isopeaks from each peptide. Missing peaks are NAs.
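To illustrate how missing peaks and min_isopeaks interact, here is a base R sketch on made-up data for a single peptide. The column names, the values and the helper longest_run are assumptions for illustration only. The sketch treats "consecutive" as the longest unbroken run of detected peaks, which is one possible reading of the rule and not necessarily how the package implements it.

```r
# Hypothetical envelope for one peptide: 5 isotopic positions, NA = not detected.
# Column names are assumptions, not taken from the package.
envelope <- data.frame(
  mz = c(1105.6, 1106.6, 1107.6, NA, 1109.6),
  intensity = c(900, 650, 300, NA, 40),
  snr = c(30, 22, 10, NA, 2.5)
)

# Hypothetical helper: length of the longest run of consecutive detected peaks.
longest_run <- function(detected) {
  runs <- rle(detected)
  if (!any(runs$values)) return(0L)
  max(runs$lengths[runs$values])
}

detected <- !is.na(envelope$mz)
n_consecutive <- longest_run(detected)

# With min_isopeaks = 4 this envelope would be discarded under this reading.
keep <- n_consecutive >= 4
```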
Nair, B. et al. (2022) ‘Parchment Glutamine Index (PQI): A novel method to estimate glutamine deamidation levels in parchment collagen obtained from low-quality MALDI-TOF data’, bioRxiv. doi:10.1101/2022.03.13.483627.