Calculate the conditional probability of belonging to each cluster in a Poisson mixture model

Description

This function computes the conditional probabilities t_{ik} that observation i arises from the kth mixture component, given the current values of the mixture parameters.
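As a reading aid, the quantity t_{ik} can be written out with Bayes' rule for a finite mixture. The first line below is the standard mixture-model identity. The Poisson mean parametrisation on the last line (with w_i the total count of observation i and c(j) the condition of variable j) is an assumption based on the model described in the references listed on this page, not something stated here.

```latex
t_{ik} = \frac{\pi_k \, f_k(\mathbf{y}_i)}{\sum_{m=1}^{g} \pi_m \, f_m(\mathbf{y}_i)},
\qquad
f_k(\mathbf{y}_i) = \prod_{j=1}^{q} \frac{e^{-\mu_{ijk}} \, \mu_{ijk}^{\,y_{ij}}}{y_{ij}!},
\qquad
\mu_{ijk} = w_i \, s_j \, \lambda_{c(j)k} \quad \text{(assumed)}
```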

Usage

probaPost(y, g, conds, pi, s, lambda)

Arguments

y

(n x q) matrix of observed counts for n observations and q variables

g

Number of clusters

conds

Vector of length q defining the condition (treatment group) for each variable (column) in y

pi

Vector of length g containing the current estimate of π

s

Vector of length q containing the estimates for the normalized library size parameters for each of the q variables in y

lambda

(d x g) matrix containing the current estimate λ, where d is the number of conditions (treatment groups)

Value

t

(n x g) matrix made up of the conditional probability of each observation belonging to each of the g clusters
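To make the shape and meaning of the returned matrix concrete, here is a minimal base-R sketch of how such probabilities could be computed. It is not the package source: `proba_post_sketch` is a hypothetical name, the Poisson mean `w_i * s_j * lambda[c(j), k]` with `w_i` the row total is an assumption, and the rows of `lambda` are assumed to follow the sorted condition labels. The smoothing described in the Note is not reproduced.

```r
# Hypothetical re-implementation for illustration only; not the HTSCluster source.
proba_post_sketch <- function(y, conds, pi, s, lambda) {
  n <- nrow(y)
  g <- length(pi)
  w <- rowSums(y)                         # assumed per-observation offset
  cond.idx <- as.integer(factor(conds))   # assumed: rows of lambda follow sorted condition labels
  log.t <- matrix(0, nrow = n, ncol = g)
  for (k in seq_len(g)) {
    # (n x q) matrix of assumed Poisson means for cluster k
    mu <- outer(w, s * lambda[cond.idx, k])
    log.dens <- matrix(dpois(y, mu, log = TRUE), nrow = n)
    log.t[, k] <- log(pi[k]) + rowSums(log.dens)
  }
  # Normalise each row on the log scale to limit numerical underflow
  row.max <- apply(log.t, 1, max)
  t <- exp(log.t - row.max)
  t / rowSums(t)
}

# Toy input: 5 observations, 4 variables, d = 2 conditions, g = 2 clusters
set.seed(1)
y.toy <- matrix(rpois(20, 10), nrow = 5)
conds.toy <- c(1, 1, 2, 2)
s.toy <- colSums(y.toy) / sum(y.toy)
lambda.toy <- matrix(c(1.0, 1.0,
                       0.5, 1.5), nrow = 2, byrow = TRUE)
t.toy <- proba_post_sketch(y.toy, conds.toy, pi = c(0.5, 0.5),
                           s = s.toy, lambda = lambda.toy)
```

Each row of `t.toy` corresponds to one observation and sums to one across the g columns.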

Note

If all values of t_{ik} for an observation i are 0 (or nearly zero), the observation is assigned with probability one to the cluster with the closest mean (in terms of Euclidean distance from the observation). To avoid numerical difficulties, extreme values of t_{ik} are smoothed: values smaller than 1e-10 are set to 1e-10, and values larger than 1-1e-10 are set to 1-1e-10.
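The smoothing step in this note can be sketched in a few lines of base R. `smooth_proba` is a hypothetical helper written for illustration. Whether the package renormalises the rows afterwards is not stated here, so the sketch does not do so.

```r
# Hypothetical helper, not part of HTSCluster: clamp extreme probabilities
# to the interval [eps, 1 - eps].
smooth_proba <- function(t, eps = 1e-10) {
  t[t < eps] <- eps
  t[t > 1 - eps] <- 1 - eps
  t
}

# Toy (2 x 2) matrix: the first row contains exact 0 and 1 values
t.raw <- matrix(c(1.0, 0.0,
                  0.3, 0.7), nrow = 2, byrow = TRUE)
t.smooth <- smooth_proba(t.raw)
```

After clamping, no entry is exactly 0 or 1, so later computations that take `log(t)` or `log(1 - t)` stay finite.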

Author(s)

Andrea Rau <andrea.rau@jouy.inra.fr>

References

Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux, G. (2015). Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9):1420-1427.

Rau, A., Celeux, G., Martin-Magniette, M.-L., Maugis-Rabusseau, C. (2011). Clustering high-throughput sequencing data with Poisson mixture models. Inria Research Report 7786. Available at http://hal.inria.fr/inria-00638082.

See Also

PoisMixClus for Poisson mixture model estimation and model selection; PoisMixMean to calculate the conditional per-cluster mean of each observation

Examples

set.seed(12345)

## Simulate data as shown in Rau et al. (2011)
## Library size setting "A", high cluster separation
## n = 200 observations

simulate <- PoisMixSim(n = 200, libsize = "A", separation = "high")
y <- simulate$y
conds <- simulate$conditions
s <- colSums(y) / sum(y)     ## TC estimate of lib size

## Run the PMM-II model for g = 3
## "TC" library size estimate, EM algorithm

run <- PoisMixClus(y, g = 3, norm = "TC",
                   conds = conds)
pi.est <- run$pi
lambda.est <- run$lambda

## Calculate the conditional probability of belonging to each cluster
proba <- probaPost(y, g = 3, conds = conds, pi = pi.est, s = s,
                   lambda = lambda.est)

## head(round(proba,2))
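
A common follow-up is to turn the probabilities into hard cluster labels by picking, for each observation, the cluster with the largest conditional probability. The sketch below uses a small made-up matrix (`proba.toy`, a hypothetical stand-in for the output of probaPost) so that it runs without the package.

```r
## Toy (3 x 2) matrix standing in for the output of probaPost()
proba.toy <- matrix(c(0.90, 0.10,
                      0.20, 0.80,
                      0.55, 0.45), ncol = 2, byrow = TRUE)

## Index of the most probable cluster for each observation
labels <- apply(proba.toy, 1, which.max)

## Largest conditional probability per observation, a rough measure of
## how confidently each observation is assigned
max.proba <- apply(proba.toy, 1, max)
```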