featurefilter: A function for filtering features


View source: R/featurefilter.R

Description

This function filters features by variability. Which metric is most appropriate depends on the data. Simple variance (var) is suitable when variance does not tend to increase with the mean. The median absolute deviation (MAD) is a more robust metric than variance and is preferable. The coefficient of variation (A) and its second-order alternative (A2) (Kvålseth, 2017), which standardise the standard deviation with respect to the mean, are also included. It is best to examine the mean-variance relationship of the data manually, for example by plotting the results from this function with the qplot function from ggplot2.
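To make the metrics above concrete, here is a minimal base R sketch that computes them per feature on simulated data. It is not the package's implementation: the `toy` data frame, the `feature_stats` table and the `mean_var_cor` check are all illustrative names, and A is assumed to be the textbook coefficient of variation (standard deviation divided by the mean). A2 is left out because its exact form should be taken from Kvålseth (2017).

```r
# Simulated stand-in for `mydata`: 100 features (rows) x 12 samples (columns).
set.seed(1)
toy <- as.data.frame(matrix(rexp(100 * 12, rate = 0.1), nrow = 100,
                            dimnames = list(paste0("feature", 1:100),
                                            paste0("sample", 1:12))))

# Per-feature (row-wise) summary statistics.
m <- as.matrix(toy)
feature_stats <- data.frame(
  feature = rownames(toy),
  mean    = rowMeans(m),
  var     = apply(m, 1, var),
  MAD     = apply(m, 1, mad),  # stats::mad, scaled by 1.4826 by default
  stringsAsFactors = FALSE
)

# Assumed definition of A: standard deviation relative to the mean.
feature_stats$A <- sqrt(feature_stats$var) / feature_stats$mean

# Rank correlation between mean and variance across features. A value
# well above 0 suggests variance grows with the mean, in which case plain
# variance may be a poor filtering metric.
mean_var_cor <- cor(feature_stats$mean, feature_stats$var, method = "spearman")
```

Plotting `feature_stats$mean` against `feature_stats$var` gives the same information visually.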

Usage

featurefilter(mydata, percentile = 10, method = "MAD", topN = 20)

Arguments

mydata

Data frame: samples as columns and features as rows

percentile

Numerical value: the percentage of most variable features to keep (the top X percent)

method

Character string: the filtering metric, one of variance (var), coefficient of variation (A), second-order A (A2), or median absolute deviation (MAD)

topN

Numerical value: the number of most variable features to display

Value

A list containing: 1) the filtered data; 2) statistics for each feature, ordered according to the chosen filtering metric
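The shape of such a return value can be illustrated with a small stand-alone sketch. The helper `filter_top_percent` and the element names `filtered_data` and `statistics` are hypothetical and chosen for illustration only; they are not taken from the package, and only MAD-based filtering is shown.

```r
# Hypothetical helper, not the M3C implementation: keep the top `percentile`
# percent of features ranked by median absolute deviation.
filter_top_percent <- function(data, percentile = 10) {
  m <- as.matrix(data)
  stats <- data.frame(feature = rownames(data),
                      mean    = rowMeans(m),
                      MAD     = apply(m, 1, mad),
                      stringsAsFactors = FALSE)
  # Order features from most to least variable.
  stats <- stats[order(stats$MAD, decreasing = TRUE), ]
  # Number of features in the top `percentile` percent (at least one).
  n_keep <- max(1, ceiling(nrow(stats) * percentile / 100))
  keep <- stats$feature[seq_len(n_keep)]
  list(filtered_data = data[keep, , drop = FALSE], statistics = stats)
}

# Simulated input: 50 features (rows) x 6 samples (columns).
set.seed(42)
toy <- as.data.frame(matrix(rnorm(50 * 6), nrow = 50,
                            dimnames = list(paste0("f", 1:50),
                                            paste0("s", 1:6))))

res <- filter_top_percent(toy, percentile = 10)
dim(res$filtered_data)   # dimensions of the retained subset
head(res$statistics, 3)  # the three most variable features
```

Returning the full ordered statistics alongside the filtered data makes it possible to inspect the ranking and choose a different cutoff without recomputing anything.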

References

Kvålseth, Tarald O. "Coefficient of variation: the second-order alternative." Journal of Applied Statistics 44.3 (2017): 402-415.

Examples

filtered <- featurefilter(mydata, percentile = 10)

crj32/M3C documentation built on Feb. 19, 2020, 11:39 p.m.