filter_samples_by_mv: Filter samples by missing values

View source: R/filters.R

filter_samples_by_mv    R Documentation



Missing values are common in mass spectrometry metabolomics datasets and can arise for both technical and biological reasons. For robust conclusions to be drawn from downstream statistical tests, missing values must first be addressed. This tool removes samples whose percentage of missing values exceeds a user-defined maximum.


filter_samples_by_mv(df, max_perc_mv, classes = NULL, remove_samples = TRUE)



df: A matrix-like (e.g. an ordinary matrix, a data frame) or RangedSummarizedExperiment-class object in which all values are of class numeric() or integer() and represent peak intensities, areas or another quantitative characteristic.


max_perc_mv: numeric(1), threshold between 0 and 1 for the maximum percentage of missing values allowed in a sample.


classes: character(), vector of class labels. Must be the same length as the number of samples in the input peak table. If the input is a SummarizedExperiment object, use SummarizedExperiment_object$meta_data_column_name.


remove_samples: logical(1), whether to remove the flagged samples from the peak matrix.
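
To make the roles of the threshold and the removal switch concrete, the following is a minimal base R sketch of the idea, not the pmp implementation. The toy matrix, the variable names, the samples-in-columns layout and the inclusive comparison (`<=`) are all assumptions made for illustration.

```r
# Illustrative sketch only; this is not how pmp implements the filter.
# Toy peak matrix: 4 features (rows) x 3 samples (columns).
# matrix() fills column by column, so each group of 4 values is one sample.
peaks <- matrix(
    c(10, NA, 12, 11,    # sample s1: 1 of 4 values missing
      NA, NA, NA,  9,    # sample s2: 3 of 4 values missing
       8,  7,  6,  5),   # sample s3: no missing values
    nrow = 4,
    dimnames = list(paste0("f", 1:4), c("s1", "s2", "s3"))
)

max_perc_mv <- 0.5       # hypothetical threshold, expressed as a fraction
remove_samples <- TRUE   # hypothetical switch mirroring the argument above

# Fraction of missing values in each sample (column).
perc_mv <- colMeans(is.na(peaks))   # s1 = 0.25, s2 = 0.75, s3 = 0

# 1 = keep, 0 = flagged for removal (assumes an inclusive threshold).
flags <- as.integer(perc_mv <= max_perc_mv)

# Drop flagged samples only when removal is requested.
filtered <- if (remove_samples) peaks[, flags == 1, drop = FALSE] else peaks
```

With these toy values only `s2` exceeds the threshold, so `filtered` keeps `s1` and `s3`. Setting `remove_samples <- FALSE` would leave the matrix unchanged while still computing the flags.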


Object of class SummarizedExperiment. If the input is a matrix-like object (e.g. an ordinary matrix, a data frame), the function returns a numeric() matrix-like object containing the filtered data set. The flags are stored in the object attributes as a DataFrame-class object with two columns. For SummarizedExperiment input, the same DataFrame of flags is added to the colData() element; if colData() already exists, the flags are appended to the existing values.

Columns in colData() or flags element contain:
perc_mv numeric(), fraction of missing values per sample;
flags integer(), if 0 the sample is flagged to be removed.


library(pmp)
# MTBLS79 is the example data set used throughout the pmp documentation
df <- MTBLS79
out <- filter_samples_by_mv(df=df, max_perc_mv=0.8)
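
As a possible follow-up to the example, the sketch below shows one way to check what the filter did when the input is a SummarizedExperiment. It assumes the pmp and SummarizedExperiment packages are installed. It prints the whole sample metadata and does not index columns by name, because the exact names of the appended flag columns are not stated here.

```r
library(pmp)
library(SummarizedExperiment)

df <- MTBLS79
out <- filter_samples_by_mv(df = df, max_perc_mv = 0.8)

# Compare the number of samples (columns) before and after filtering.
ncol(df)
ncol(out)

# Per-sample flags are appended to the existing sample metadata.
colData(out)
```

Calling the function with `remove_samples = FALSE` should keep every sample and only add the flags, which can be useful for reviewing which samples would be dropped before committing to the filter.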

computational-metabolomics/pmp documentation built on April 30, 2022, 4:28 a.m.