msPeakFilterMA: Basic peak filtering


Description

The purpose of data filtering is to identify and remove variables that are unlikely to be of use when modeling the data. No phenotype information is used in the filtering process, so the result can be used with any downstream analysis. This step is strongly recommended for untargeted metabolomics datasets (e.g. spectral binning data, peak lists) that contain a large number of variables, many of which come from baseline noise. Filtering usually improves the results.

Usage

## S4 method for signature 'MSdata'
msPeakFilterMA(msdata, method = "none")

Arguments

msdata

MSdata-class object to be filtered

method

Filtering method, one of:
"none" - no filtering applied
"iqr" - interquartile range (IQR)
"sd" - standard deviation (SD)
"mad" - median absolute deviation (MAD)
"rsd" - relative standard deviation (RSD = SD/mean)
"nprsd" - non-parametric relative standard deviation (MAD/median)
"mean" - mean intensity value
"median" - median intensity value
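
As an illustration of what each option measures, the statistics above can be computed per peak in base R. This is a minimal sketch, not the package's implementation: the helper name `peak_filter_stat` is hypothetical, and the input is assumed to be a plain numeric matrix with one row per peak and one column per sample rather than an MSdata object. Note also that base R's `mad()` scales its result by 1.4826 by default, which may or may not match what the package does.

```r
# Hypothetical helper, not part of the MSdata package.
# Assumption: `x` is a numeric matrix, one row per peak, one column per sample.
peak_filter_stat <- function(x,
                             method = c("iqr", "sd", "mad", "rsd",
                                        "nprsd", "mean", "median")) {
  method <- match.arg(method)
  # Apply a summary function to every row (peak), ignoring missing values.
  per_peak <- function(f) apply(x, 1, f, na.rm = TRUE)
  switch(method,
    iqr    = per_peak(IQR),
    sd     = per_peak(sd),
    mad    = per_peak(mad),                      # base R default scaling (1.4826)
    rsd    = per_peak(sd) / per_peak(mean),      # RSD = SD / mean
    nprsd  = per_peak(mad) / per_peak(median),   # MAD / median
    mean   = per_peak(mean),
    median = per_peak(median)
  )
}

# Toy data: one flat (noise-like) peak and one varying peak.
x <- rbind(
  noise  = c(1, 1, 1, 1),
  signal = c(10, 20, 30, 40)
)
peak_filter_stat(x, "sd")
peak_filter_stat(x, "rsd")
```

Peaks with a low value of the chosen statistic are the candidates for removal: the flat peak in the toy data has zero spread, while the varying peak does not.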

Details

Non-informative variables can be characterized in two groups:

The following empirical rules are applied during data filtering:

Note that the "none" option is honored only for datasets with fewer than 2000 features. Above that, the IQR filter is applied even if "none" is selected.

The maximum allowed number of variables is 5000. If more than 5000 variables remain after filtering, only the top 5000 are used in the subsequent analysis.
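
The two size rules can be sketched as follows. Both function names are hypothetical and this is not the package's code. Two details are assumptions, because the text does not specify them: that "none" stops being honored at exactly 2000 features, and that "top" means the largest values of the chosen filter statistic.

```r
# Hypothetical sketch of the two size rules; not the package's actual code.

# Decide which method is effectively used for a dataset of a given size.
resolve_method <- function(method, n_features, none_limit = 2000) {
  # Assumption: "none" is honored strictly below the limit.
  if (method == "none" && n_features >= none_limit) "iqr" else method
}

# Return the indices of the features kept under the 5000-variable cap.
cap_features <- function(stat, max_features = 5000) {
  # Assumption: "top" means the largest values of the filter statistic.
  n_keep <- min(length(stat), max_features)
  keep <- order(stat, decreasing = TRUE)[seq_len(n_keep)]
  sort(keep)  # report kept indices in their original order
}

resolve_method("none", 1500)   # small dataset: "none" is kept
resolve_method("none", 3000)   # large dataset: falls back to "iqr"
cap_features(c(5, 1, 9, 3), max_features = 2)
```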

Value

MSdata-class object with the filtered-out peaks removed


flajole/MSdata documentation built on May 16, 2019, 1:17 p.m.