Consensus_FS: Consensus Feature Selection
In tallulandrews/M3D: Michaelis-Menten Modelling of Dropouts in single-cell RNASeq


Runs seven different feature selection methods, then calculates a consensus ranking of features from their results.

Consensus_FS(counts, norm=NA, is.spike=rep(FALSE, times=nrow(counts)), pcs=c(2,3), include_cors=TRUE)

`counts`	raw count matrix, rows=genes, cols=cells
`norm`	normalized but not log-transformed gene expression matrix, rows=genes, cols=cells
`is.spike`	logical vector indicating whether each gene is a spike-in
`pcs`	which principal components to use to score genes
`include_cors`	logical, whether to perform gene-gene correlation feature selection, which is much slower than all the other methods.
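As an illustration of the `is.spike` argument, the sketch below builds a logical vector from gene names. It assumes spike-ins follow the common "ERCC-" naming prefix; the gene names are made up and your data may label spike-ins differently.

```r
# Hypothetical gene names; in practice these would be rownames(counts).
gene_names <- c("Actb", "ERCC-00002", "Gapdh", "ERCC-00130", "Sox2")

# TRUE for every gene whose name starts with the assumed spike-in prefix.
is_spike <- grepl("^ERCC-", gene_names)

print(is_spike)
```

The resulting vector has one entry per gene, in the same order as the rows of the count matrix, which is the shape the default `rep(FALSE, times=nrow(counts))` also has.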

Performs:
NBumiFeatureSelectionCombinedDrop (aka: DANB_drop)
NBumiFeatureSelectionHighVar (aka: DANB_var)
BrenneckeGetVariableGenes (aka: HVG)
M3DropFeatureSelection (aka: M3Drop)
giniFS
irlbaPcaFS (with provided PCs)
corFS ("both" direction)

Genes are ranked by each method and the consensus (Cons) is calculated as the average of those ranks.
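The averaging step can be illustrated with a toy example. This is a minimal sketch in base R, not the package's internal code: the score matrix, the method names, and the convention that a higher score is better and rank 1 is best are all assumptions made for illustration.

```r
# Hypothetical scores for 5 genes from 3 methods (higher = more informative).
scores <- matrix(
  c(0.9, 0.1, 0.5, 0.7, 0.3,
    12,  3,   8,   10,  1,
    0.2, 0.8, 0.6, 0.9, 0.1),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), c("methodA", "methodB", "methodC"))
)

# Rank genes within each method; negating the scores makes rank 1 the best.
ranks <- apply(-scores, 2, rank, ties.method = "average")

# Consensus = mean rank across methods; sort the table by it.
ranks <- cbind(ranks, Cons = rowMeans(ranks))
ranks <- ranks[order(ranks[, "Cons"]), ]

print(ranks)
```

Averaging ranks rather than raw scores sidesteps the fact that each method reports its scores on a different scale.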

Invariant genes are removed automatically. If only raw counts are provided, counts-per-million normalization (scaled to the median library size rather than to one million) is applied for those methods which require normalized data.
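The two preprocessing steps described above might look roughly like this in base R. It is a sketch under assumptions, not the function's actual implementation: the toy matrix is made up, "invariant" is taken to mean zero variance across cells, and filtering before computing library sizes is an arbitrary choice.

```r
# Toy raw count matrix: rows = genes, cols = cells (values are made up).
counts <- matrix(
  c(10, 0, 5,  5,
    20, 0, 0,  5,
    30, 0, 15, 5),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Drop invariant genes (zero variance across cells).
keep <- apply(counts, 1, var) > 0
counts <- counts[keep, , drop = FALSE]

# Divide each cell by its library size, then rescale to the median
# library size instead of to one million.
lib_size <- colSums(counts)
norm <- t(t(counts) / lib_size) * median(lib_size)

print(norm)
```

After this rescaling every cell has the same total, equal to the median library size, so the normalized values stay on a scale comparable to the raw counts.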

A table giving the rank of each gene under each method, plus the consensus rank (Cons). Columns are feature selection methods, named using the shorter aliases (see: Details).
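One way such a table could be used downstream is to pick the top-ranked genes by consensus. The sketch below works on a mock table, not real output: the values are invented, only four of the method columns are shown, and it assumes rows are named by gene, that a lower rank is better, and that the table may not already be sorted by `Cons`.

```r
# Mock ranking table shaped like the description above (values are made up).
res <- data.frame(
  DANB_drop = c(1, 3, 2, 4),
  DANB_var  = c(2, 1, 3, 4),
  HVG       = c(1, 2, 3, 4),
  M3Drop    = c(2, 4, 1, 3),
  row.names = paste0("gene", 1:4)
)
res$Cons <- rowMeans(res)

# Keep the n genes with the best (lowest) consensus rank.
n_top <- 2
top_genes <- rownames(res)[order(res$Cons)][seq_len(n_top)]

print(top_genes)
```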

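library(M3DExampleData)
# Normalized expression for the first 500 genes; drop genes with no expression.
norm <- as.matrix(Mmus_example_list$data[1:500,])
norm <- norm[rowSums(norm) > 0,]
# Convert the normalized values to an integer count matrix.
counts <- NBumiConvertToInteger(norm)
# Mark 50 randomly chosen genes as spike-ins (for demonstration only).
spikes <- sample(1:nrow(counts), 50)
spikes <- rownames(norm)[spikes]
# is.spike must be a logical vector with one entry per gene.
spikes <- rownames(norm) %in% spikes
Features_consensus <- Consensus_FS(counts, norm, is.spike=spikes)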