varFilter: Variation-based Filtering of Features (CpG sites) in a...


Description

The function varFilter removes features exhibiting little variation across samples. Such non-specific filtering can be advantageous for downstream data analysis.

Usage

varFilter(eset, var.func=IQR, var.cutoff=0.5, filterByQuantile=TRUE, ...)

Arguments

eset

A MethyLumiSet or MethyLumiM object.

var.func

The function used to compute the per-feature filtering statistic.

var.cutoff

A numeric value giving the cutoff for variation. If filterByQuantile is TRUE, features whose var.func value is less than the var.cutoff quantile of all var.func values will be removed. If FALSE, features whose var.func values are less than var.cutoff will be removed.

filterByQuantile

A logical indicating whether var.cutoff is to be interpreted as a quantile of all var.func values (the default), or as an absolute value.

...

Unused, but available for specializing methods.
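
The two interpretations of var.cutoff can be sketched in base R on a plain matrix. This is only an illustration of the rule described above, not the package's implementation: the toy matrix x and the names vars, keep_quantile and keep_absolute are hypothetical, and keeping features that tie exactly with the cutoff is an assumption based on the wording "less than ... will be removed".

```r
# Hypothetical toy data: 6 features (rows) x 5 samples (columns).
set.seed(1)
x <- matrix(rnorm(30), nrow = 6,
            dimnames = list(paste0("cg", 1:6), paste0("s", 1:5)))
x[1, ] <- 0.5  # a constant feature, so its IQR is 0

# Per-feature statistic, analogous to var.func = IQR.
vars <- apply(x, 1, IQR)

var.cutoff <- 0.5

# filterByQuantile = TRUE: the cutoff is the var.cutoff quantile of all
# per-feature statistics (assumption: ties with the cutoff are kept).
keep_quantile <- vars >= quantile(vars, probs = var.cutoff)

# filterByQuantile = FALSE: the cutoff is used as an absolute value.
keep_absolute <- vars >= var.cutoff

x_quantile <- x[keep_quantile, , drop = FALSE]
x_absolute <- x[keep_absolute, , drop = FALSE]
```

With the quantile interpretation the fraction of features removed is fixed by var.cutoff, whereas with the absolute interpretation it depends on the scale of the data.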

Details

This function is a counterpart of the functions nsFilter and varFilter available from the genefilter package. See R. Bourgon et al. (2010) and nsFilter for details.

Non-specific filtering, whose criterion does not depend on sample class, has been shown to increase the number of discoveries (Bourgon 2010). An inappropriate choice of test statistic, however, can have an adverse effect. limma's moderated t-statistic, for example, is based on an empirical Bayes approach that models the conjugate prior of the gene-level variance with an inverse χ^2 distribution scaled by the observed global variance. Because variance-based filtering removes the genes with low variance, the scaled inverse χ^2 no longer provides a good fit to the data passing the filter, causing the limma algorithm to produce a posterior degrees of freedom of infinity (Bourgon 2010). This has two consequences: (i) the gene-level variance estimates will be ignored, and (ii) the p-values will be overly optimistic (Bourgon 2010).
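
To illustrate what "does not depend on sample class" means, the base R sketch below checks that a per-feature IQR is unchanged when the samples (columns) are reordered, so it cannot carry information about which sample belongs to which class. The matrix x and the permutation perm are hypothetical toy objects, not part of the package.

```r
# Hypothetical toy data: 4 features (rows) x 10 samples (columns).
set.seed(2)
x <- matrix(rnorm(40), nrow = 4)

# Per-feature IQR with the samples in their original order.
iqr_original <- apply(x, 1, IQR)

# Reorder the samples, as if the class labels had been shuffled,
# and recompute the same statistic.
perm <- sample(ncol(x))
iqr_shuffled <- apply(x[, perm, drop = FALSE], 1, IQR)

# The statistic ignores sample order, hence also sample class.
same <- isTRUE(all.equal(iqr_original, iqr_shuffled))
```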

Value

The function varFilter returns a list consisting of:

eset

The filtered MethyLumiSet or MethyLumiM object.

filter.log

Shows how many low-variance features were removed.

Author(s)

Chao-Jen Wong cwon2@fhcrc.org

References

R. Bourgon, R. Gentleman, W. Huber, Independent filtering increases power for detecting differentially expressed genes, PNAS, vol. 107, no. 21, pp. 9546-9551, 2010.

See Also

nsFilter

Examples

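  library(methylumi)
  data(mldat)
  ## keep the top 75 percent: remove features whose IQR falls below
  ## the 0.25 quantile of all per-feature IQRs
  filt <- varFilter(mldat, var.cutoff=0.25)
  ## number of low-variance features removed
  filt$filter.log
  ## dimensions of the filtered object
  dim(filt$eset)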

methylumi documentation built on Nov. 8, 2020, 6:26 p.m.