msPeakFilterMA: Basic peak filtering
In flajole/MSdata: Mass-spectrometry data analysis package


The purpose of data filtering is to identify and remove variables that are unlikely to be of use when modeling the data. No phenotype information is used in the filtering process, so the result can be used with any downstream analysis. This step is strongly recommended for untargeted metabolomics datasets (i.e. spectral binning data, peak lists) with a large number of variables, many of which come from baseline noise. Filtering can usually improve the results.

## S4 method for signature 'MSdata'
msPeakFilterMA(msdata, method = "none")

`msdata`	`MSdata-class` object to be filtered
`method`	Method of filtering, one of:
`"none"` - no filtering applied;
`"iqr"` - interquartile range (IQR);
`"sd"` - standard deviation (SD);
`"mad"` - median absolute deviation (MAD);
`"rsd"` - relative standard deviation (RSD = SD/mean);
`"nprsd"` - non-parametric relative standard deviation (MAD/median);
`"mean"` - mean intensity value;
`"median"` - median intensity value.

Non-informative variables fall into two groups:

variables with very small values - can be detected using the mean or median;
variables that are near-constant across the experimental conditions - can be detected using one of the variance measures (IQR, SD, MAD, RSD, non-parametric RSD).
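As a rough illustration of how the statistics named by `method` separate these two groups, here is a minimal base R sketch. The helper `filter_stat` and the toy matrix are hypothetical and not part of the MSdata package. The sketch assumes variables are stored in rows and samples in columns, and it relies on the defaults of `stats::mad` (which scales by 1.4826) and `stats::IQR`, which may differ from what `msPeakFilterMA` does internally.

```r
# Hypothetical helper, not part of the MSdata package: computes one filter
# statistic per variable (row) of a plain numeric intensity matrix.
filter_stat <- function(x, method = c("iqr", "sd", "mad", "rsd",
                                      "nprsd", "mean", "median")) {
  method <- match.arg(method)
  switch(method,
    iqr    = apply(x, 1, IQR, na.rm = TRUE),
    sd     = apply(x, 1, sd, na.rm = TRUE),
    mad    = apply(x, 1, mad, na.rm = TRUE),   # default scaling constant 1.4826
    rsd    = apply(x, 1, sd, na.rm = TRUE) / rowMeans(x, na.rm = TRUE),
    nprsd  = apply(x, 1, mad, na.rm = TRUE) / apply(x, 1, median, na.rm = TRUE),
    mean   = rowMeans(x, na.rm = TRUE),
    median = apply(x, 1, median, na.rm = TRUE)
  )
}

# Toy data (assumed layout): three peaks in rows, four samples in columns.
x <- rbind(
  noise    = c(0.1, 0.2, 0.1, 0.2),  # very small values
  constant = c(50, 50, 51, 50),      # near-constant across samples
  variable = c(10, 80, 35, 120)      # varies across samples
)

filter_stat(x, "mean")  # the "noise" row has the smallest mean
filter_stat(x, "rsd")   # the "constant" row has the smallest RSD
```

In this toy example the mean singles out the low-intensity row, while the relative standard deviation singles out the near-constant row.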

The following empirical rules are applied during data filtering:

fewer than 250 variables: 5% will be filtered;
between 250 and 500 variables: 10% will be filtered;
between 500 and 1000 variables: 25% will be filtered;
over 1000 variables: 40% will be filtered.
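The rules above can be sketched as a small lookup function. `filter_fraction` is a hypothetical name, not a function exported by MSdata. The documentation does not say which tier the boundary values 250, 500 and 1000 belong to, so assigning each boundary to the higher tier is an assumption of this sketch.

```r
# Hypothetical helper: fraction of variables removed for a given variable count.
# Assumption: boundary values (250, 500, 1000) fall into the higher tier.
filter_fraction <- function(n_vars) {
  if (n_vars < 250) {
    0.05
  } else if (n_vars < 500) {
    0.10
  } else if (n_vars < 1000) {
    0.25
  } else {
    0.40
  }
}

# Example: a dataset with 800 variables loses 25% of them.
n_vars <- 800
n_removed <- n_vars * filter_fraction(n_vars)
n_removed  # 200
```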

Please note that the "none" option is only honored for datasets with fewer than 2000 features. Above that, the IQR filter is still applied even if you choose "none".

The maximum allowed number of variables is 5000. If more than 5000 variables remain after filtering, only the top 5000 are used in the subsequent analysis.
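A minimal sketch of these two safeguards follows. Both helpers are hypothetical and only illustrate the rules as described. The sketch assumes that a dataset with exactly 2000 features is already treated as "over" the limit, and that "top" means the variables with the largest value of the chosen filter statistic. Neither detail is stated in the documentation.

```r
# Hypothetical helper: the method that is effectively used.
# Assumption: "none" is honored only when n_vars < 2000.
effective_method <- function(method, n_vars) {
  if (method == "none" && n_vars >= 2000) "iqr" else method
}

# Hypothetical helper: indices of the variables kept under the 5000 cap.
# Assumption: "top" means the largest values of the filter statistic.
cap_variables <- function(stat, max_vars = 5000) {
  ranked <- order(stat, decreasing = TRUE)
  sort(head(ranked, max_vars))
}

effective_method("none", 1500)  # "none"
effective_method("none", 2500)  # "iqr"

set.seed(1)
stat <- runif(6000)             # simulated filter statistic for 6000 variables
kept <- cap_variables(stat)
length(kept)                    # 5000
```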

Returns an `MSdata-class` object with the filtered-out peaks removed.

flajole/MSdata documentation built on May 16, 2019, 1:17 p.m.