clusterCellFrequencies: Clustering of cellular frequency probability distributions
In expands: Expanding Ploidy and Allele-Frequency on Nested Subpopulations


Calculates overrepresented cell frequencies using a two-step approach. Based on the assumption that passenger mutations occur within a cell prior to the driver event that initiates the expansion, each clonal expansion should be marked by multiple mutations. Thus, mutations and copy number variations that took place in a cell prior to a clonal expansion should be present in a similar fraction of cells and leave a similar "frequency trace" during their propagation.

clusterCellFrequencies(densities, p, nrep=30, min_CF=0.1, verbose = T)

`densities`	Matrix as obtained by `computeCellFrequencyDistributions`. Each row corresponds to a mutation and each column corresponds to a cellular frequency. Each value densities[i,j] represents the probability that mutation i is present in a fraction f of cells, where f is given by colnames(densities)[j].
`p`	Precision with which subpopulation size is predicted. A small value reflects a high resolution and can lead to a higher number of predicted subpopulations.
`nrep`	Positive integer indicating the number of algorithm repetitions (default: 30).
`min_CF`	Lower threshold for the prevalence of a mutated cell (default: 0.1).
`verbose`	Give a more verbose output.
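
To make the expected shape of `densities` concrete, here is a minimal sketch in base R that builds a toy matrix by hand. The frequency grid, the mutation names, and the bell-shaped rows are assumptions for illustration only; in practice the matrix would come from `computeCellFrequencyDistributions`.

```r
# Toy stand-in for a `densities` matrix (hypothetical values, not package output).
freqs <- seq(0.1, 1.0, by = 0.1)                 # assumed grid of cellular frequencies
peaks <- c(mutA = 0.3, mutB = 0.3, mutC = 0.8)   # hypothetical mutations and their peaks

# One row per mutation: a bell-shaped curve around its peak, normalised to sum to 1.
densities <- t(sapply(peaks, function(mu) {
  d <- dnorm(freqs, mean = mu, sd = 0.1)
  d / sum(d)
}))
colnames(densities) <- as.character(freqs)

# Probability that mutation i is present in a fraction f of cells,
# with f read from the column names.
i <- 1
j <- 3
f <- as.numeric(colnames(densities)[j])
prob <- densities[i, j]
```

Here `f` is the cellular frequency of column `j` and `prob` is the corresponding entry for mutation `i`.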

In the first step, mutations with similar cellular frequencies are grouped together by hierarchical cluster analysis of the probability distributions, using the Kullback-Leibler divergence as a distance measure. The cell frequency at each cluster maximum denotes the size of the subpopulation that harbors the clustered mutations. In the second step, each cluster is extended by members with similar distributions in an interval around the cluster maximum.
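
The first step can be illustrated with a small base R sketch. This is not the package's implementation: the symmetrised form of the divergence, the pseudocount `eps`, the average-linkage method, the hand-picked number of clusters, and the toy data are all assumptions made for this example. The second step (extending clusters) is not shown.

```r
# Symmetrised Kullback-Leibler divergence between two discrete distributions.
# The symmetrisation and the pseudocount `eps` are assumptions of this sketch.
kl_sym <- function(p, q, eps = 1e-9) {
  p <- (p + eps) / sum(p + eps)
  q <- (q + eps) / sum(q + eps)
  sum(p * log(p / q)) + sum(q * log(q / p))
}

# Toy densities: rows are mutations, columns are cellular frequencies.
freqs <- seq(0.1, 1.0, by = 0.1)
peaks <- c(m1 = 0.30, m2 = 0.32, m3 = 0.80, m4 = 0.78)   # hypothetical mutations
densities <- t(sapply(peaks, function(mu) {
  d <- dnorm(freqs, mean = mu, sd = 0.1)
  d / sum(d)
}))
colnames(densities) <- as.character(freqs)

# Pairwise distances between mutations, then hierarchical clustering.
n <- nrow(densities)
D <- matrix(0, n, n, dimnames = list(rownames(densities), rownames(densities)))
for (a in seq_len(n)) {
  for (b in seq_len(n)) {
    D[a, b] <- kl_sym(densities[a, ], densities[b, ])
  }
}
hc <- hclust(as.dist(D), method = "average")
clusters <- cutree(hc, k = 2)   # k = 2 is chosen by hand for the toy data

# Cluster maximum: frequency at which the averaged distribution of a cluster peaks.
sp_size <- sapply(split(seq_len(n), clusters), function(idx) {
  avg <- colMeans(densities[idx, , drop = FALSE])
  as.numeric(names(which.max(avg)))
})
```

In this toy setting, `clusters` groups the mutations by the similarity of their distributions, and `sp_size` holds one frequency per cluster, read off at the peak of the cluster's averaged distribution.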

SPs

Matrix of predicted subpopulations. Each row corresponds to a subpopulation and each column contains information about that subpopulation, such as its size in the sequenced tumor bulk (column Mean Weighted) and the noise score at which the subpopulation was detected (column score; lower values indicate higher detection confidence).
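
As a rough illustration of how such a matrix might be inspected, the sketch below ranks subpopulations by detection confidence. The matrix is a hand-made stand-in with hypothetical values and only the two columns named above; the real return value may contain further columns.

```r
# Toy stand-in for the returned matrix; all values are hypothetical.
SPs <- matrix(
  c(0.80, 0.05,
    0.30, 0.40,
    0.15, 0.90),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, c("Mean Weighted", "score"))
)

# Order subpopulations from most to least confidently detected (lowest score first).
ranked <- SPs[order(SPs[, "score"]), , drop = FALSE]

# Size of the most confidently detected subpopulation in the tumor bulk.
top_size <- ranked[1, "Mean Weighted"]
```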

Noemi Andor

Noemi Andor, Julie Harness, Sabine Mueller, Hans Werner Mewes and Claudia Petritsch. (2013) ExPANdS: Expanding Ploidy and Allele Frequency on Nested Subpopulations. Bioinformatics.

expands documentation built on Sept. 5, 2021, 5:18 p.m.