descendMultiPop: DESCEND applied to two or more cell populations


View source: R/deTest.R

Description

Use this function when two or more cell populations are compared with each other; it is the first step of differential testing between any two of the populations. The true expression distribution is deconvolved for each cell population separately, while Z0 is centered to have mean 0 across all populations combined, so that the Z0 adjusted nonzero fraction is meaningful. For deconvolution of a single cell population, see runDescend. For model details, see deconvG.

Depending on the number of cell types, the number of cells, and the dimensions of Z and Z0, this function can take a very long time to run, even on a cluster, and the DESCEND results can occupy a large amount of memory (there is one DESCEND object for each cell type and each gene). In that case, we suggest running runDescend and saving the result for each cell type separately, then following the code inside this function to normalize Z0 and calculate the Z0 adjusted nonzero fraction.
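The centering of Z0 across all populations can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the toy labels and Z0 values are made up, and the variable names (Z0.centered, Z0.by.pop) are hypothetical.

```r
set.seed(1)

# Toy data (assumption): 10 cells in two populations, one Z0 covariate
labels <- factor(rep(c("popA", "popB"), times = c(4, 6)))
Z0 <- matrix(rnorm(10, mean = 5), ncol = 1)

# Center each Z0 column using the mean over ALL cells,
# not the mean within each population
Z0.centered <- scale(Z0, center = TRUE, scale = FALSE)

# Split the centered covariate by population, e.g. to pass each piece
# to a separate per-population run
idx <- split(seq_along(labels), labels)
Z0.by.pop <- lapply(idx, function(i) Z0.centered[i, , drop = FALSE])

colMeans(Z0.centered)       # numerically zero over all cells
sapply(Z0.by.pop, mean)     # generally nonzero within a population
```

Because the mean is taken over all cells together, the per-population pieces keep their relative offsets, which is what makes the adjusted nonzero fraction comparable between populations.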

Usage

descendMultiPop(count.matrix, labels, ercc.matrix = NULL,
  scaling.consts = NULL, Z = NULL, Z0 = NULL, n.cores = 1,
  cl = NULL, type = "FORK", do.LRT.test = F, family = c("Poisson",
  "Negative Binomial"), NB.size = 100, show.message = T, verbose = T,
  ercc.trueMol = NULL, center.Z0 = T, control = list())

Arguments

count.matrix

the observed UMI count matrix. It should be an R object of class matrix or dgeMatrix. Each row is a gene and each column is a cell. The column sums (which should be the library sizes) are used as the input for scaling.consts when both ercc.matrix and scaling.consts are NULL.

labels

a vector of factors or characters, indicating the cell population label of each cell. The length of labels should be the same as the number of columns of count.matrix.

ercc.matrix

the ERCC spike-in count matrix, used to compute the cell-specific efficiency constants as scaling.consts when scaling.consts is NULL. Each row is a spike-in gene and each column is a cell. The number and order of the columns should be the same as the number and order of the columns of count.matrix.

scaling.consts

a vector of cell-specific scaling constants, either the cell efficiency or the library size.

Z

covariates for nonzero mean. Default is NULL.

Z0

covariates for nonzero fraction. Used only when zeroInflate is TRUE. Default is NULL.

n.cores

the number of cores used for parallel computing. Default is 1. Used only when parallel computing is done on a single machine. To use cores across multiple machines, assign cl explicitly. If verbose is TRUE, a separate file is created to store the progress of each worker core.

cl

an object of class "cluster". See more details in makeCluster

type

Default is "FORK" to save memory. Change it to "PSOCK" if you are using Windows and cl is NULL. For more details, see makeCluster.

do.LRT.test

whether to perform the likelihood ratio test (LRT) on the coefficients and the nonzero fraction. Default is FALSE.

family

family of the noise distribution; supports either "Poisson" or "Negative Binomial" with a known tuning parameter.

NB.size

over-dispersion parameter when the family is Negative Binomial: var = mu + mu^2/size

show.message

whether to show messages about computing progress. Default is TRUE.

verbose

whether to print details of the estimation and testing procedures. Default is TRUE.

ercc.trueMol

the true input number of molecules of the ERCC spike-ins when ercc.matrix is not NULL.

center.Z0

whether to center Z0 to make Z0 adjusted nonzero fraction more meaningful. Default is TRUE. Set it to FALSE if Z0 has already been properly centered

control

control settings; see DESCEND.control.
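Some of the argument conventions above can be checked with a few lines of base R. The sketch below is an illustration only: the toy count matrix and labels are made up, the variable names are hypothetical, and it does not call any DESCEND function. It shows the library-size fallback for scaling.consts, the length requirement on labels, and the Negative Binomial variance implied by NB.size, assuming the same mean/size parametrization as R's rnbinom.

```r
set.seed(1)

# Toy UMI count matrix (assumption): 5 genes (rows) x 6 cells (columns)
count.matrix <- matrix(rpois(30, lambda = 3), nrow = 5,
                       dimnames = list(paste0("gene", 1:5),
                                       paste0("cell", 1:6)))
labels <- rep(c("popA", "popB"), each = 3)

# labels needs one entry per cell, i.e. per column of count.matrix
stopifnot(length(labels) == ncol(count.matrix))

# Fallback when ercc.matrix and scaling.consts are both NULL:
# the column sums (library sizes)
library.size <- colSums(count.matrix)

# Negative Binomial noise with mean mu and size NB.size:
# variance = mu + mu^2 / size
mu <- 10
NB.size <- 100
nb.variance <- mu + mu^2 / NB.size

# Compare with the empirical variance of simulated draws
x <- rnbinom(1e5, size = NB.size, mu = mu)
c(theoretical = nb.variance, empirical = var(x))
```

A larger NB.size moves the variance closer to mu, so the noise approaches the Poisson case.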

Value

a list with elements

descend.list.list

a list of DESCEND object lists. Each element is a DESCEND object list for one of the cell populations computed from runDescend.

model

model parameters, including the actual scaling.consts, Z, the rescaled Z0, control, family and NB.size

Examples

## Not run: 
data(zeisel)
 set.seed(1)
 ## On a Windows machine, add the argument
 ## type = "PSOCK" to each function that needs parallelization.
 result.multi <- descendMultiPop(zeisel$count.matrix.small,
                                 labels = zeisel$labels,
                                 scaling.consts = zeisel$library.size,
                                 Z0 = log(zeisel$cell.size), verbose = FALSE, show.message = FALSE,
                                 n.cores = 3)
 ## try 100 null genes first
 detest.result <- deTest(result.multi, c("endothelial-mural", "pyramidal CA1"),
                         zeisel$count.matrix.small, zeisel$labels,
                         verbose = FALSE, show.message = FALSE,
                         N.genes.null = 100, n.cores = 3)
 
 ## 100 null genes may not get small enough p-values
 detest.result <- deTest.more(result.multi, detest.result, 
                              c("endothelial-mural", "pyramidal CA1"),
                              zeisel$count.matrix.small, labels = zeisel$labels, 
                              N.more.genes = 200, verbose = FALSE, 
                              n.cores = 3)
 
 layout(matrix(1:4, nrow = 2))
 de.scores1 <- plotDeTest(result.multi, c("endothelial-mural", "pyramidal CA1"),
                         detest.result, measurement.name = "Gini", alpha = 0.05)
 de.scores2 <- plotDeTest(result.multi, c("endothelial-mural", "pyramidal CA1"),
                         detest.result, measurement.name = "Nonzero Mean", 
                         alpha = 0.05, log = "xy")
 de.scores3 <- plotDeTest(result.multi, c("endothelial-mural", "pyramidal CA1"),
                         detest.result, measurement.name = "Nonzero Fraction", alpha = 0.1)
 de.scores4 <- plotDeTest(result.multi, c("endothelial-mural", "pyramidal CA1"),
                         detest.result, measurement.name = "Adjusted Nonzero Fraction", alpha = 0.1)

## End(Not run)
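To keep a script portable between Windows and other systems, the cluster type can be chosen at run time. This is a minimal sketch using only base R; the variable name cluster.type is hypothetical, and passing it on via the type argument follows the note in the example above.

```r
# "FORK" clusters are not available on Windows, so use "PSOCK" there
cluster.type <- if (.Platform$OS.type == "windows") "PSOCK" else "FORK"
cluster.type

# Then pass type = cluster.type to each function that runs in parallel,
# for example descendMultiPop(..., n.cores = 3, type = cluster.type)
```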

jingshuw/descend documentation built on Nov. 2, 2021, 4:23 p.m.