screenSpectra: Identification of potentially low-quality raw mass spectra


screenSpectra    R Documentation

Identification of potentially low-quality raw mass spectra

Description

This function implements a quality-control check that helps identify potentially faulty, low-quality raw mass spectra. It computes an atypicality score and labels suspicious profiles for further inspection and filtering.

Usage

screenSpectra(x, meta = NULL, threshold = 1.5, estimator = c("Q", "MAD"),
                 method = c("adj.boxplot", "boxplot", "ESD", "Hampel", "RC"),
                 nd = 1, lambda = 0.5, ...)

Arguments

x

A list of MassSpectrum objects.

meta

(optional) Matrix or vector containing metadata associated with x. Typically a data matrix including spectrum ID, biotype, replicate number, etc. for each element of x.

threshold

Multiplicative factor used in computing the upper and lower fences that determine passes and failures. Its appropriate value depends on the method used to compute the fences (see method). Typically, threshold = 1.5 (default value) for the boxplot rules, and threshold = 3 for the others.

estimator

Robust scale estimator used:

Q: robust location-free scale estimate (default, see Qn function in robustbase package). More efficient than MAD and adequate for non-symmetric distributions.

MAD: median absolute deviation scale estimate. Very robust and preferred for fairly symmetric distributions.
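To illustrate why a robust scale estimator is used here, the following minimal sketch compares the classical standard deviation with the MAD on made-up data containing a few gross outliers. The data and variable names are assumptions for illustration only. The Qn estimate is computed only if the robustbase package happens to be installed.

```r
# Illustrative data (made up): 50 well-behaved values plus 3 gross outliers.
set.seed(1)
clean <- rnorm(50, mean = 10, sd = 1)
contaminated <- c(clean, 40, 45, 50)

# Classical scale estimate: strongly inflated by the outliers.
sd_clean <- sd(clean)
sd_cont  <- sd(contaminated)

# MAD via stats::mad (scaled for consistency with sd under normality).
mad_clean <- mad(clean)
mad_cont  <- mad(contaminated)

# Qn lives in the robustbase package; skip it if the package is missing.
if (requireNamespace("robustbase", quietly = TRUE)) {
  qn_cont <- robustbase::Qn(contaminated)
  print(c(sd = sd_cont, MAD = mad_cont, Qn = qn_cont))
} else {
  print(c(sd = sd_cont, MAD = mad_cont))
}
```

The robust estimates should stay close to the scale of the clean values, whereas the standard deviation is dominated by the three outliers.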

method

Method used to compute upper and lower fences for the identification of atypical mass spectra.

boxplot: standard boxplot rule based on the first and third quartiles and the interquartile range.

adj.boxplot: extension of the boxplot rule for strongly asymmetric data (default).

ESD: extreme studentized deviate method. Based on the mean and the standard deviation of the data. Typically used with threshold = 3 (three-sigma rule).

Hampel: robust version of the ESD method based on the median and the median absolute deviation estimate (MAD).

RC: as Hampel, but with the MAD replaced by the Qn scale estimate of Rousseeuw & Croux (1993).
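The fence rules listed above can be sketched in base R as follows. This is a textbook illustration, not the package's internal code: the helper name fences and the example scores are hypothetical, and the adj.boxplot and RC rules are left out because they need estimators from outside base R.

```r
# Hypothetical helper: lower/upper fences for a numeric vector of scores.
# Textbook versions of the rules; screenSpectra's internals may differ.
fences <- function(a, method = c("boxplot", "ESD", "Hampel"), threshold = 1.5) {
  method <- match.arg(method)
  switch(method,
    boxplot = {
      q <- quantile(a, c(0.25, 0.75), names = FALSE)
      iqr <- q[2] - q[1]
      c(lower = q[1] - threshold * iqr, upper = q[2] + threshold * iqr)
    },
    ESD = c(lower = mean(a) - threshold * sd(a),
            upper = mean(a) + threshold * sd(a)),
    Hampel = c(lower = median(a) - threshold * mad(a),
               upper = median(a) + threshold * mad(a))
  )
}

# Made-up scores: nine ordinary values and one clearly atypical value.
scores <- c(1.0, 1.1, 0.9, 1.2, 1.05, 0.95, 1.15, 0.85, 1.0, 6.0)

# Flag values falling outside the fences of a given rule.
flag <- function(a, ...) {
  f <- fences(a, ...)
  a < f["lower"] | a > f["upper"]
}

flag_boxplot <- flag(scores, method = "boxplot", threshold = 1.5)
flag_esd     <- flag(scores, method = "ESD", threshold = 3)
flag_hampel  <- flag(scores, method = "Hampel", threshold = 3)
```

In this small example the extreme value inflates the mean and standard deviation enough that the ESD rule does not flag it, while the boxplot and Hampel rules do. This masking effect is one reason to prefer the robust rules.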

nd

Order of the derivative of the mass spectra (default = 1).

lambda

Weight given to each component of the atypicality score (values in [0, 1], default = 0.5, see details below).

...

Other arguments.

Details

The procedure computes an atypicality score (A score) as a weighted function of two components: (1) a robust scale estimate (Q or MAD) of the nd-th order derivative of the scaled mass spectra, computed using a Savitzky-Golay smoothing filter, and (2) the median intensity of the signals. Given a method for determining tolerance fences, a mass spectrum is labelled as potentially faulty or low-quality according to the magnitude of its A score. By default, the adj.boxplot method is used with the Q scale estimator and equal weights for both components (lambda = 0.5). The greater the value of lambda, the higher the weight given to the scale estimator in the A score. The function produces summaries and a list of mass spectra and (if given) the associated metadata, with the identified cases filtered out.
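The role of lambda can be illustrated with a toy score. The sketch below is not the formula used by screenSpectra, which is not stated on this page. It assumes a plain convex combination of the two components, replaces the Savitzky-Golay derivative with first differences, and uses the MAD as the scale estimator. All names (make_spectrum, toy_score, spectra_toy) and the simulated profiles are hypothetical.

```r
set.seed(42)

# Hypothetical intensity profiles: a single peak on a flat baseline,
# with additive noise of a chosen standard deviation.
mz <- seq(0, 1, length.out = 200)
make_spectrum <- function(noise) {
  100 * exp(-((mz - 0.5) / 0.1)^2) + 10 + rnorm(length(mz), sd = noise)
}

# Nine low-noise profiles and one very noisy profile (the last one).
spectra_toy <- c(replicate(9, make_spectrum(0.5), simplify = FALSE),
                 list(make_spectrum(8)))

# Toy score (an assumption, not the package's formula):
# lambda weights the robust scale of the first differences,
# (1 - lambda) weights the median of the scaled intensities.
toy_score <- function(intensity, lambda = 0.5) {
  scaled    <- intensity / max(intensity)  # scale the profile to a maximum of 1
  roughness <- mad(diff(scaled))           # stand-in for the derivative's scale
  level     <- median(scaled)              # median intensity component
  lambda * roughness + (1 - lambda) * level
}

a_equal <- sapply(spectra_toy, toy_score, lambda = 0.5)
a_scale <- sapply(spectra_toy, toy_score, lambda = 1)
```

With lambda = 1 the toy score depends only on the roughness component, and with lambda = 0 only on the median intensity. Intermediate values trade the two off.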

Value

An object of class scSpectra with elements:

fspectra

List of mass spectra (MassSpectrum class) with potential low-quality cases filtered out.

fmeta

Associated filtered metadata (data.frame object).

est.table

Results table showing the mass spectra ID, A score and label (pass/failure).

...

Other details (see method summary.scSpectra for scSpectra objects).

See Also

See methods summary.scSpectra and plot.scSpectra for scSpectra objects.

Examples


# Load the package and example data

library(MALDIrppa)

data(spectra) # list of MassSpectrum objects
data(type)    # metadata

# Results using different settings

sc.results <- screenSpectra(spectra)
sc.results <- screenSpectra(spectra, type)
sc.results <- screenSpectra(spectra, type, method = "RC")
sc.results <- screenSpectra(spectra, type, threshold = 3, estimator = "MAD", method = "Hampel")

# Numerical and graphical summary

summary(sc.results)
plot(sc.results)

# Save filtered data for further pre-processing

filtered.spectra <- sc.results$fspectra
filtered.type <- sc.results$fmeta

MALDIrppa documentation built on March 29, 2022, 1:05 a.m.