filterFeatures | R Documentation |
When dealing with metabolomics results, it is often necessary to filter
features based on certain criteria. These criteria are typically derived
from statistical formulas applied to full rows of data, where each row
represents a feature and its abundance of signal in each samples.
The filterFeatures
function filters features based on these conventional
quality assessment criteria. Multiple types of filtering are implemented and
can be defined by the filter
argument.
Supported filter
arguments are:
RsdFilter
: Calculates the relative standard deviation
(i.e. coefficient of variation) in abundance for each feature in QC
(Quality Control) samples and filters them in the input object according to
a provided threshold.
DratioFilter
: Computes the D-ratio or dispersion ratio, defined as
the standard deviation in abundance for QC samples divided by the standard
deviation for biological test samples, for each feature and filters them
according to a provided threshold.
PercentMissingFilter
: Determines the percentage of missing values for
each feature in the various sample groups and filters them according to a
provided threshold.
BlankFlag
: Identifies features where the mean abundance in test samples
is lower than a specified multiple of the mean abundance of blank samples.
This can be used to flag features that result from contamination in the
solvent of the samples. A new column possible_contaminants
is added to the
featureDefinitions
(XcmsExperiment
object) or rowData
(SummarizedExperiment
object) reflecting this.
For specific examples, see the help pages of the individual parameter classes listed above.
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
assay |
For filtering of |
... |
Optional parameters. For |
Philippine Louail
Broadhurst D, Goodacre R, Reinke SN, Kuligowski J, Wilson ID, Lewis MR, Dunn WB. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics. 2018;14(6):72. doi: 10.1007/s11306-018-1367-3. Epub 2018 May 18. PMID: 29805336; PMCID: PMC5960010.
## See the vignettes for more detailed examples
library(MsExperiment)
## Load a test data set with features defined.
test_xcms <- loadXcmsData()
## Set up parameter to filter based on coefficient of variation. By setting
## the filter such as below, features that have a coefficient of variation
## superior to 0.3 in QC samples will be removed from the object `test_xcms`
## when calling the `filterFeatures` function.
rsd_filter <- RsdFilter(threshold = 0.3,
qcIndex = sampleData(test_xcms)$sample_type == "QC")
filtered_data_rsd <- filterFeatures(object = test_xcms, filter = rsd_filter)
## Set up parameter to filter based on D-ratio. By setting the filter such
## as below, features that have a D-ratio computed based on their abundance
## between QC and study samples superior to 0.5 will be removed from the
## object `test_xcms`.
dratio_filter <- DratioFilter(threshold = 0.5,
qcIndex = sampleData(test_xcms)$sample_type == "QC",
studyIndex = sampleData(test_xcms)$sample_type == "study")
filtered_data_dratio <- filterFeatures(object = test_xcms,
filter = dratio_filter)
## Set up parameter to filter based on the percent of missing data.
## Parameter f should represent the sample group of samples, for which the
## percentage of missing values will be evaluated. As the setting is defined
## bellow, if a feature as less (or equal) to 30% missing values in one
## sample group, it will be kept in the `test_xcms` object.
missing_data_filter <- PercentMissingFilter(threshold = 30,
f = sampleData(test_xcms)$sample_type)
filtered_data_missing <- filterFeatures(object = test_xcms,
filter = missing_data_filter)
## Set up parameter to flag possible contaminants based on blank samples'
## abundance. By setting the filter such as below, features that have mean
## abundance ratio between blank(here use study as an example) and QC
## samples less than 2 will be marked as `TRUE` in an extra column named
## `possible_contaminants` in the `featureDefinitions` table of the object
## `test_xcms`.
filter <- BlankFlag(threshold = 2,
qcIndex = sampleData(test_xcms)$sample_type == "QC",
blankIndex = sampleData(test_xcms)$sample_type == "study")
filtered_xmse <- filterFeatures(test_xcms, filter)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.