ExtractTopFeatures: Extracting top driving genes of GoM clusters


View source: R/ExtractTopFeatures.R

Description

This function uses the relative gene expression profiles of the GoM clusters and applies a KL-divergence-based method to obtain a list of the top features that drive each cluster.

Usage

ExtractTopFeatures(theta, top_features = 10, method = c("poisson",
  "bernoulli"), options = c("min", "max"), shared = FALSE)

Arguments

theta

The theta matrix: the relative gene expression profiles of the GoM clusters (cluster probability distributions) from the GoM model fit. This is a G x K matrix, where G is the number of features and K is the number of topics.

top_features

The number of top features to select for each cluster k. Features are chosen by their ability to distinguish cluster k from every other cluster l, for all l ≠ k with l in 1, …, K. Default: 10.

method

The underlying model assumed for the KL divergence measurement. The two choices are "bernoulli" and "poisson". Default: "poisson".
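
As a rough illustration of what the two choices could mean, the sketch below writes out the standard KL divergence between two Poisson rates and between two Bernoulli probabilities, applied feature by feature. The helper names kl_poisson and kl_bernoulli are hypothetical and not part of CountClust, and the package's internal computation may differ in detail.

```r
# Hypothetical helpers (not part of CountClust): feature-wise KL divergence
# of cluster profile p from cluster profile q under two simple models.

# Poisson with rates p and q: p * log(p / q) - p + q
kl_poisson <- function(p, q) {
  p * log(p / q) - p + q
}

# Bernoulli with success probabilities p and q
kl_bernoulli <- function(p, q) {
  p * log(p / q) + (1 - p) * log((1 - p) / (1 - q))
}

# Toy profiles for three features in two clusters (each sums to 1)
p <- c(0.7, 0.2, 0.1)
q <- c(0.1, 0.2, 0.7)

kl_poisson(p, q)    # zero for the second feature, where p equals q
kl_bernoulli(p, q)
```

Both divergences are zero where the two clusters agree and grow as the profiles separate, which is the property the feature ranking relies on.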

options

If "min", for each cluster k we select the features that maximize the minimum, taken over all other clusters, of the feature's KL divergence between cluster k and that cluster. If "max", we instead select the features that maximize the maximum of these divergences.
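
The difference between the two options can be sketched on a toy theta matrix. This is a minimal illustration, assuming a Poisson-type divergence computed feature by feature; the helper rank_features and the toy data are hypothetical and are not the package's implementation.

```r
# Toy theta: 5 features (rows) x 3 clusters (columns); columns sum to 1.
theta <- cbind(
  c(0.60, 0.10, 0.10, 0.10, 0.10),
  c(0.10, 0.60, 0.10, 0.10, 0.10),
  c(0.10, 0.10, 0.30, 0.30, 0.20)
)

# Assumed Poisson-type divergence, computed feature by feature
kl_poisson <- function(p, q) p * log(p / q) - p + q

# Hypothetical helper: rank features for cluster k by the minimum or the
# maximum divergence of cluster k against every other cluster.
rank_features <- function(theta, k, option = c("min", "max"), top = 2) {
  option <- match.arg(option)
  others <- setdiff(seq_len(ncol(theta)), k)
  # G x (K - 1) matrix: one column of divergences per competing cluster
  div <- sapply(others, function(l) kl_poisson(theta[, k], theta[, l]))
  score <- apply(div, 1, if (option == "min") min else max)
  order(score, decreasing = TRUE)[seq_len(top)]
}

rank_features(theta, k = 3, option = "min")
rank_features(theta, k = 3, option = "max")
```

Under "min" a feature scores well only if it separates cluster k from every other cluster, whereas under "max" it is enough to separate cluster k from a single other cluster, so the two options can return different features for the same cluster.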

shared

If TRUE, genes that are highly expressed in more than one cluster may be reported. If FALSE (the default), only genes whose highest expression occurs in the given cluster are reported.
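
One plausible reading of the restriction applied when shared = FALSE is sketched below on toy data: each feature is assigned to the cluster where its relative expression is highest, and only those features are candidates for that cluster. The variable names are hypothetical and the package's actual filtering may differ.

```r
# Toy theta: 4 features (rows) x 2 clusters (columns)
theta <- cbind(c(0.5, 0.3, 0.1, 0.1),
               c(0.1, 0.2, 0.4, 0.3))

# Cluster in which each feature attains its highest relative expression
top_cluster <- apply(theta, 1, which.max)

# Candidate features for cluster 1 when restricted to its own top features
candidates_k1 <- which(top_cluster == 1)
candidates_k1
```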

Value

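The top feature indices are tabulated in a K x top_features matrix whose k-th row lists the indices of the top features driving cluster k. In the example below, this matrix is accessed as the indices component of the returned object, alongside a scores component.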
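
Because the result holds row indices into theta, a common follow-up step is to map them back to feature names. The sketch below does this on toy data, assuming theta carries feature names as row names; the index matrix here is made up for illustration and is not output from ExtractTopFeatures.

```r
# Toy theta with feature names as row names (4 features x 3 clusters)
theta <- matrix(c(0.7, 0.1, 0.1, 0.1,
                  0.1, 0.7, 0.1, 0.1,
                  0.1, 0.1, 0.4, 0.4),
                nrow = 4,
                dimnames = list(c("geneA", "geneB", "geneC", "geneD"), NULL))

# Hypothetical index matrix in the documented shape:
# K rows, top_features columns
indices <- rbind(c(1, 2),
                 c(2, 1),
                 c(3, 4))

# Replace each index by the corresponding row name of theta
top_genes <- t(apply(indices, 1, function(idx) rownames(theta)[idx]))
top_genes
```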

Examples

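library(CountClust)
data("MouseDeng2014.FitGoM")
# theta matrix of the GoM fit stored under clust_6
theta_mat <- MouseDeng2014.FitGoM$clust_6$theta
# top 100 features per cluster, Poisson model, "min" option
top_features <- ExtractTopFeatures(theta_mat, top_features = 100,
                                   method = "poisson", options = "min")
top_features$indices  # feature indices for each cluster
top_features$scores   # corresponding scores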

kkdey/CountClust documentation built on Jan. 17, 2021, 5:32 p.m.