feDP: feDP

View source: R/classify.dd.R

feDP    R Documentation

feDP

Description

Function to identify additional DP (differential proportion) genes. Clustering can be consistent across conditions while the proportion of cells in each mode still differs between them. The Bayes factor score also tends to be small when the number of clusters is not correctly detected; in that case, differential proportion manifests as a mean shift.

Usage

feDP(pe_mat, condition, sig_genes, oa, c1, c2, log.nonzero = TRUE,
  testZeroes = FALSE, adjust.perms = FALSE, min.size = 3)

Arguments

pe_mat

Matrix with genes in rows and samples in columns. Column names indicate condition.

condition

Vector of condition indicators (with two possible values).

sig_genes

Vector of the indices of significantly DD genes (each index is a row number of pe_mat).

oa

List with one item per gene, where the first element of each item contains the cluster membership of each nonzero sample in the overall (pooled) fit.

c1

List with one item per gene, where the first element of each item contains the cluster membership of each nonzero sample in the condition 1 only fit.

c2

List with one item per gene, where the first element of each item contains the cluster membership of each nonzero sample in the condition 2 only fit.

log.nonzero

Logical indicating whether to perform log transformation of nonzero values.

testZeroes

Logical indicating whether or not to test for a difference in the proportion of zeroes. This will only be done for genes that have at least one zero value (genes where all cells have a nonzero value will have a 'zero.pvalue' of NA).

adjust.perms

Logical indicating whether or not to adjust the permutation tests for the sample detection rate (proportion of nonzero values). If true, the residuals of a linear model adjusted for detection rate are permuted, and new fitted values are obtained using these residuals.
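The residual permutation described for adjust.perms can be sketched for a single gene as follows. This is a minimal illustration on simulated data, not the package's internal code; the names cdr, y, and y_perm are hypothetical, and the use of lm() with a single detection-rate covariate is an assumption.

```r
# Toy data for one gene (all values simulated for illustration only).
set.seed(42)
n <- 30
cdr <- runif(n, 0.2, 0.8)                 # hypothetical per-sample detection rate
y <- 2 + 1.5 * cdr + rnorm(n, sd = 0.5)   # hypothetical log expression

# Fit a linear model of expression on detection rate.
fit <- lm(y ~ cdr)
res <- unname(residuals(fit))
fitted_vals <- unname(fitted(fit))

# One permutation: shuffle the residuals and add them back to the
# fitted values to obtain a new, detection-rate-adjusted response.
y_perm <- fitted_vals + sample(res)
```

Permuting residuals rather than raw values keeps the relationship between expression and detection rate intact in every permuted data set.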

min.size

A positive integer that specifies the minimum size of a cluster (number of cells) for it to be used during the classification step. Any cluster containing fewer than min.size cells is considered an outlier cluster and is ignored by the classification algorithm. The default value is 3.
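The effect of min.size can be illustrated with a small sketch that drops clusters below the threshold. The membership vector and helper names here are hypothetical, and this is not the package's own filtering code.

```r
min.size <- 3

# Hypothetical cluster labels for the nonzero cells of one gene;
# cluster 3 contains a single cell.
membership <- c(1, 1, 1, 1, 2, 2, 2, 3)

# Count cells per cluster and keep only clusters with enough cells.
sizes <- table(membership)
keep_clusters <- as.integer(names(sizes)[sizes >= min.size])

# Cells in undersized clusters are treated as outliers and dropped.
keep <- membership %in% keep_clusters
membership_used <- membership[keep]
```

With these labels, clusters 1 and 2 are retained and the single cell in cluster 3 is excluded.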

Details

Fisher's exact test is used to test for independence of condition membership and cluster membership when the clustering within each condition matches the overall clustering and is multimodal. When the clustering within a condition is not multimodal, or differs across conditions (most often the case), an FDR-adjusted t-test is performed to detect overall mean shifts.
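Both tests can be sketched with base R on simulated data. This is an illustration under stated assumptions, not the code in R/classify.dd.R: the data are made up, the variable names are hypothetical, and Benjamini-Hochberg is assumed as the FDR adjustment.

```r
set.seed(1)
condition <- rep(c(1, 2), each = 20)

# Case 1: same multimodal clustering in both conditions.
# Hypothetical cluster labels with different mode proportions per condition.
clusters <- c(rep(c(1, 2), times = c(15, 5)),
              rep(c(1, 2), times = c(5, 15)))
tab <- table(condition, clusters)
fe_p <- fisher.test(tab)$p.value   # independence of condition and cluster

# Case 2: clustering is unimodal or inconsistent, so test for a mean shift.
# Five hypothetical genes; gene 1 is shifted in condition 2.
expr <- matrix(rnorm(5 * 40), nrow = 5)
expr[1, condition == 2] <- expr[1, condition == 2] + 3
t_p <- apply(expr, 1, function(y) {
  t.test(y[condition == 1], y[condition == 2])$p.value
})
t_padj <- p.adjust(t_p, method = "BH")   # assumed FDR method
```

A small fe_p suggests that the proportion of cells in each mode depends on condition; a small t_padj for a gene suggests an overall mean shift.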

Value

cat Character vector of the same length as sig_genes that indicates which of the genes that are nonsignificant by the permutation test belong to the DP category.


kdkorthauer/scDD documentation built on March 27, 2022, 5:11 a.m.