Consensus_FS: Consensus Feature Selection

Description

Runs seven feature selection methods and computes a consensus ranking of the features from their results.

Usage

Consensus_FS(counts, norm=NA, is.spike=rep(FALSE, times=nrow(counts)), pcs=c(2,3), include_cors=TRUE)

Arguments

counts

raw count matrix, rows=genes, cols=cells

norm

normalized but not log-transformed gene expression matrix, rows=genes, cols=cells

is.spike

logical vector indicating whether each gene is a spike-in

pcs

which principal components to use to score genes

include_cors

logical, whether to perform gene-gene correlation feature selection, which is much slower than the other methods.

Details

Performs:
NBumiFeatureSelectionCombinedDrop (aka: DANB_drop)
NBumiFeatureSelectionHighVar (aka: DANB_var)
BrenneckeGetVariableGenes (aka: HVG)
M3DropFeatureSelection (aka: M3Drop)
giniFS
irlbaPcaFS (with provided PCs)
corFS ("both" direction)

Genes are ranked by each method and the consensus (Cons) is calculated as the average of those ranks.
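
The rank-averaging step can be illustrated with a minimal base-R sketch. The scores, the method names (methodA, methodB, methodC) and the gene names are hypothetical, and the handling of ties and the final sort are assumptions made for illustration; they are not taken from the Consensus_FS source.

```r
# Hypothetical scores from three methods for five genes
# (higher score = more informative gene).
scores <- data.frame(
  methodA = c(0.9, 0.1, 0.5, 0.7, 0.3),
  methodB = c(10, 2, 8, 9, 1),
  methodC = c(0.2, 0.4, 0.9, 0.8, 0.1),
  row.names = paste0("gene", 1:5)
)

# Rank genes within each method; rank 1 is the top-scoring gene.
ranks <- as.data.frame(lapply(scores, function(s) rank(-s)))
rownames(ranks) <- rownames(scores)

# Consensus is the mean of the per-method ranks.
ranks$Cons <- rowMeans(ranks)

# Sort by consensus rank, best first (sorting is for illustration only).
ranks <- ranks[order(ranks$Cons), ]
print(ranks)
```

A gene that is ranked highly by most methods ends up with a low Cons value, so a single method that disagrees has limited influence on the final ordering.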

Invariant genes are removed automatically. If only raw counts are provided, counts-per-million normalization (scaled to the median library size) is applied for the methods that require normalized data.
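
As a rough sketch of what this preprocessing might look like, the base-R snippet below scales each cell to the median library size and drops genes with zero variance. The toy matrix, the variable names (lib_size, norm_all, keep) and the variance-based filter are assumptions for illustration; the function's internal implementation may differ, for example in how spike-ins are treated.

```r
# Toy raw count matrix: rows = genes, cols = cells (hypothetical values).
counts <- matrix(c(10, 0, 5,
                   20, 4, 6,
                   30, 0, 9,
                    0, 0, 0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Library size = total counts per cell.
lib_size <- colSums(counts)

# Divide each cell by its library size, then rescale to the median library size.
norm_all <- t(t(counts) / lib_size) * median(lib_size)

# Assumed filter: drop invariant genes (zero variance across cells).
keep <- apply(norm_all, 1, var) > 0
norm <- norm_all[keep, , drop = FALSE]
print(norm)
```

After scaling, every cell has the same total, so differences between cells reflect relative expression and not sequencing depth.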

Value

Table giving the rank of each gene under each method, including the consensus (Cons). Columns are feature selection methods named using the shorter aliases (see: Details).

Examples

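library(M3DExampleData)
# Use the first 500 genes of the example data and drop all-zero genes
norm <- as.matrix(Mmus_example_list$data[1:500, ])
norm <- norm[rowSums(norm) > 0, ]
counts <- NBumiConvertToInteger(norm)
# Mark 50 randomly chosen genes as spike-ins, as a logical vector over all genes
spikes <- sample(1:nrow(counts), 50)
spikes <- rownames(norm)[spikes]
spikes <- rownames(norm) %in% spikes
Features_consensus <- Consensus_FS(counts, norm, is.spike=spikes)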

tallulandrews/M3D documentation built on May 31, 2019, 2:55 a.m.