expression_filter: Takes a data.frame, replaces any missing value(s) with zeros,...

View source: R/OmicsAnalyst.R

expression_filter    R Documentation

Description

Takes a data.frame, replaces any missing values with zeros, optionally performs counts per million (CPM) normalization, calculates the row-wise statistic given by FilterFUN, and then either keeps the features whose statistic is above FilterThreshold or keeps only the top n features, with n given by RankThreshold. The output is a two-element list. 'final' is a data frame holding the filtered (and, if specified, transformed) data plus a column with the statistic the user chose to filter by, so that you can see which features had which filter/rank statistic. 'dat' is the raw data (not CPM normalized). If DGEList is specified, 'dat' will be a DGEList.
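The steps described above can be sketched in base R. This is a hypothetical illustration, not the package's source: the function name sketch_filter, the column name FilterStat, and the toy counts are all assumptions, and the DGEList and RankThreshold paths are left out.

```r
# Hypothetical sketch of the described steps; NOT the actual expression_filter code.
sketch_filter <- function(dat, CPM = TRUE, FilterFUN = mean, FilterThreshold = NULL) {
  raw <- dat
  raw[is.na(raw)] <- 0                     # replace missing values with zeros
  mat <- as.matrix(raw)
  if (CPM) {
    # counts per million: divide each column by its total, then scale by 1e6
    mat <- sweep(mat, 2, colSums(mat), "/") * 1e6
  }
  stat <- apply(mat, 1, FilterFUN)         # row-wise filter statistic
  keep <- if (is.null(FilterThreshold)) {
    rep(TRUE, nrow(mat))
  } else {
    stat > FilterThreshold                 # keep features strictly above the threshold
  }
  # 'FilterStat' is an assumed column name for the statistic
  final <- data.frame(mat[keep, , drop = FALSE], FilterStat = stat[keep])
  list(final = final, dat = raw)           # 'dat' stays un-normalized
}

# Toy counts (made up for illustration)
counts <- data.frame(
  s1 = c(500, 0, NA, 499500),
  s2 = c(1000, 10, 5, 998985),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)
res <- sketch_filter(counts, FilterThreshold = 100)
rownames(res$final)   # features whose mean CPM exceeds 100
```

In this sketch the threshold is applied to the normalized values, while the 'dat' element keeps the raw counts with missing values already set to zero.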

Usage

expression_filter(
  dat,
  DGEList = FALSE,
  CPM = TRUE,
  CPH = FALSE,
  FilterFUN = mean,
  FilterThreshold = NULL,
  RankThreshold = NULL
)

Arguments

dat

A data.frame

DGEList

Logical. Is the input data a DGEList? If TRUE, the input is handled as a list containing a data.matrix/data.frame rather than as a bare data.frame. CPM normalization is also performed whenever DGEList is set to TRUE.

CPM

Logical. Whether or not to normalize columns by CPM.

CPH

Logical. If there are very few mapped reads, your sequencing depth may be in the hundreds or thousands rather than in the millions. In that case CPM normalization will inflate expression values. CPH normalization, which scales by a factor of hundreds instead of millions, may give better results.
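A small worked example of why the scaling factor matters at low depth. This is a sketch under an assumption: CPH is read here as "counts per hundred" (scaling by 1e2), which the text above suggests but does not state outright. The counts are made up.

```r
# Toy library with only 2,000 mapped reads (hypothetical numbers)
counts <- c(geneA = 20, geneB = 180, geneC = 1800)
depth <- sum(counts)

cpm <- counts / depth * 1e6   # counts per million
cph <- counts / depth * 1e2   # ASSUMED meaning of CPH: counts per hundred

cpm
cph
```

With a depth of 2,000, the 20 reads for geneA become about 10,000 CPM but only about 1 CPH, so the CPH values stay on a scale closer to the observed counts.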

FilterFUN

Row-wise function to use for filtering (mean, max, median, IQR, etc.).

FilterThreshold

Threshold value. Features with FilterFUN output higher than this value will be kept.

RankThreshold

Threshold value. Features are ranked by their FilterFUN output, and only the top n features are kept, where n is the value of RankThreshold.
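Rank-based filtering can be illustrated in base R as follows. The helper top_n and the toy matrix are hypothetical and only mimic the behavior described above; the point is that the choice of FilterFUN changes which features survive.

```r
# Toy expression matrix (made-up values): 4 features x 3 samples
expr <- matrix(
  c( 1,  2,  3,
    10,  0, 50,
     7,  8, 12,
    30, 30, 30),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("f1", "f2", "f3", "f4"), c("s1", "s2", "s3"))
)

# Hypothetical helper: rank features by a row-wise statistic, keep the top n
top_n <- function(mat, FUN, n) {
  stat <- apply(mat, 1, FUN)
  names(sort(stat, decreasing = TRUE))[seq_len(min(n, length(stat)))]
}

top_n(expr, mean, 2)   # highest average expression
top_n(expr, IQR, 2)    # most variable features
```

Here f4 ranks first by mean but drops out under IQR because it does not vary across samples.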


hsaxe/Omics-Analyst documentation built on Sept. 7, 2024, 12:15 p.m.