
Robustness of BioTIP

Figure S1. Robustness of BioTIP. a, Results for the hESC data run with different clustering methods and varying key parameters (y-labels). The number of cell clusters defined is listed in parentheses. The red star indicates the published clusters for this dataset; the BioTIP predictions based on these clusters serve as the gold standard (GS). Left: Green bars show the Jaccard scores quantifying CT detection against the GS, that is, the established CT at primitive streak (PS) and the repeatedly detected CT at cardiac progenitor (CP) (detected by three tools: BioTIP, MuTrans, and QuanTC). Green squares are checked when a prediction includes a transition state. Blue bars show the normalized F1 scores of the CTS identified for each CT state. Red squares are checked when a CTS contains a previously evaluated transition marker at day 2.5 (around the PS state), for each run. Right: ROC plot comparing five clustering methods (with optimal parameters) that detected both established markers as CTS members at PS, using nine consistently identified CTS member genes as a proxy gold standard (PGS). AUC scores are given in parentheses.
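The two set-overlap scores named in the caption can be illustrated with a minimal Python sketch. It is not taken from the BioTIP package: the function names, cluster labels, and gene names below are hypothetical, and the normalization applied to the F1 scores in the figure is not described in the caption, so the sketch computes the plain F1 score only.

```python
def jaccard(predicted, gold):
    """Jaccard score: |intersection| / |union| of two label sets."""
    predicted, gold = set(predicted), set(gold)
    union = predicted | gold
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(predicted & gold) / len(union)


def f1_score(predicted, gold):
    """Plain F1: harmonic mean of precision and recall over set members."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical CT detection: clusters flagged as transition states in one run
gold_ct = {"PS", "CP"}
predicted_ct = {"PS", "CP", "C7"}  # "C7" is a made-up extra cluster
ct_jaccard = jaccard(predicted_ct, gold_ct)  # 2 shared / 3 in union

# Hypothetical CTS identification: member genes of one CTS
gold_cts = {"geneA", "geneB", "geneC", "geneD"}
predicted_cts = {"geneA", "geneB", "geneE"}
cts_f1 = f1_score(predicted_cts, gold_cts)  # precision 2/3, recall 2/4
```

In this reading, the Jaccard score penalizes both missed and spurious transition clusters, while F1 balances the fraction of predicted CTS genes that are in the reference set against the fraction of reference genes recovered.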

b-e, Similar to the left panel of a, but using a proxy gold standard (PGS) and showing the results for four independent datasets, respectively. None of these datasets has evaluated transition marker genes. Therefore, we infer two types of PGS for each dataset. (1) For CT detection, the PGS is the known transition state in the system and/or the one predicted by both BioTIP and QuanTC. An exception is panel c, in which the HP state is additionally considered because BioTIP detected a significant and stable CTS for it from downsized data (Fig S7d). The CT clusters serving as PGS are listed at the top right of the green bars. (2) For CTS identification, the PGS consists of the genes consistently predicted to indicate a CT state, specified atop the blue bars. The BioTIP results shown in the main Figures are highlighted with a red x-label for each dataset. Soft.wo.TC: the stable states defined by the soft clustering approach of the QuanTC pipeline after excluding the transition cells. Abbreviations: hESC: human embryonic stem cells; EB: embryoid body; EMT: epithelial-to-mesenchymal transition; SNNGraph: nearest-neighbor graph clustering; TC: transition cell; QuanTC: a model-free method to detect transition cells; E16.5: embryonic day 16.5; eHEP: early haemato-endothelial progenitor; lHRP: later haemato-endothelial progenitor; EP: endothelial progenitor; HP: haematopoietic progenitor; FLK1: FLK1-expressing (FLK1+) mesoderm; eMeso: early mesoderm; CP: cardiomyocyte progenitor.
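One way to picture a gene-level PGS built from "consistently predicted genes" is as the set of genes that recur across runs. The Python sketch below assumes this reading; the caption does not state the actual consistency rule, so the `min_runs` threshold, the function name, and the gene names are all hypothetical.

```python
from collections import Counter


def consensus_genes(runs, min_runs):
    """Return genes that appear in at least `min_runs` of the per-run gene sets."""
    # set(genes) ensures a gene is counted once per run even if listed twice
    counts = Counter(gene for genes in runs for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_runs}


# Hypothetical CTS member genes reported for one CT state by three runs
runs = [
    {"geneA", "geneB", "geneC"},
    {"geneA", "geneB"},
    {"geneA", "geneD"},
]
pgs = consensus_genes(runs, min_runs=2)  # genes seen in two or more runs
```

Raising `min_runs` gives a smaller, stricter reference set; lowering it admits genes that only a minority of clustering configurations recover.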



xyang2uchicago/BioTIP documentation built on June 30, 2024, 10:14 p.m.