calculatePAC: The calculatePAC function
In mpru/ConsensusClustering: An R Package for Consensus Clustering


Internal function. Calculates the Proportion of Ambiguous Clustering (PAC) for each K, following Senbabaoglu et al. (2014).

calculatePAC(results, lowerLim = 0.1, upperLim = 0.9)

`results`	output from getFinalPartition function.
`lowerLim`	lower limit for the interval of ambiguous clustering used for calculating PAC score, belongs to the interval (0, 1).
`upperLim`	upper limit for the interval of ambiguous clustering used for calculating PAC score, belongs to the interval (0, 1).

The CDF plot shows, for each K (indicated by color), the empirical cumulative distribution function (CDF) of the consensus index values over all pairs of samples. The x-axis holds the consensus index values and the y-axis holds the CDF values, where CDF(c) is the fraction of sample pairs with a consensus index less than or equal to c. The lower left portion of a CDF curve represents sample pairs that are rarely clustered together, the upper right portion represents pairs that are almost always clustered together, and the middle portion represents pairs that are only occasionally co-assigned across clustering runs. For the true K, the CDF curve shows a flat middle segment, suggesting that very few sample pairs are ambiguous when K is correctly inferred. The PAC score quantifies this characteristic. The Proportion of Ambiguous Clustering (PAC) is the fraction of sample pairs whose consensus index falls within a given sub-interval (x1, x2) of [0, 1] (usually x1 = 0.1 and x2 = 0.9, the defaults of lowerLim and upperLim). It is calculated as PAC = CDF(x2) - CDF(x1). The optimal K should present a low PAC score.
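
The calculation described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's internal code: the helper name pac_from_consensus and the toy consensus matrix are hypothetical, and the sketch assumes a symmetric matrix of consensus index values in which each sample pair should be counted once.

```r
# Hypothetical helper (not part of the package): PAC from one consensus matrix.
pac_from_consensus <- function(consensus, lowerLim = 0.1, upperLim = 0.9) {
  # Count each sample pair once by taking the strict upper triangle.
  pair_values <- consensus[upper.tri(consensus)]
  # Empirical CDF of the consensus index values.
  cdf <- stats::ecdf(pair_values)
  # PAC = CDF(x2) - CDF(x1)
  cdf(upperLim) - cdf(lowerLim)
}

# Toy 4-sample consensus matrix (made-up values, for illustration only).
consensus <- matrix(c(
  1.00, 0.95, 0.05, 0.50,
  0.95, 1.00, 0.00, 0.40,
  0.05, 0.00, 1.00, 1.00,
  0.50, 0.40, 1.00, 1.00
), nrow = 4, byrow = TRUE)

pac <- pac_from_consensus(consensus)
print(pac)
```

In this toy matrix, two of the six sample pairs (consensus index 0.40 and 0.50) fall inside the interval, so the PAC is 2/6.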

A data frame with PAC scores for each K.
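
Since a low PAC score indicates the preferred K, the returned data frame can be used to pick a candidate K. The sketch below is a hedged example only: the column names K and PAC and the score values are assumptions made for illustration, and the actual column names of the returned data frame may differ.

```r
# Hypothetical PAC table; column names and values are assumed, not package output.
pac_scores <- data.frame(
  K   = 2:5,
  PAC = c(0.31, 0.08, 0.19, 0.27)
)

# Candidate K: the row with the lowest PAC score.
best_k <- pac_scores$K[which.min(pac_scores$PAC)]
print(best_k)
```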

Senbabaoglu, Y. et al. (2014). Critical limitations of consensus clustering in class discovery. Scientific Reports, 4, Article number 6207.

mpru/ConsensusClustering documentation built on May 9, 2019, 5:54 a.m.