Canopy: Accessing Intra-Tumor Heterogeneity and Tracking Longitudinal and Spatial Clonal Evolutionary History by Next-Generation Sequencing

Canopy is a statistical framework and computational procedure for identifying the subpopulations within a tumor, determining the mutation profile of each subpopulation, and inferring the tumor's phylogenetic history. The inputs are variant allele frequencies (VAFs) of somatic single nucleotide alterations (SNAs), along with allele-specific coverage ratios between the tumor and matched normal sample for somatic copy number alterations (CNAs). These quantities can be taken directly from the output of existing software. Canopy provides a general mathematical framework for pooling data across samples and sites to infer the underlying parameters. For SNAs that fall within CNA regions, Canopy infers their temporal ordering and resolves their phase. When multiple evolutionary configurations are consistent with the data, Canopy outputs all of them, each with a confidence assessment.
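
As a rough illustration of the two kinds of input described above, the sketch below computes VAFs from mutant and total read depths, and a plain tumor-to-normal coverage ratio per allele. It is written in Python for illustration only and is not Canopy's own R code. The function names, the SNA-by-sample matrix layout, the toy numbers, and the assumption that tumor and normal were sequenced to comparable depth are all hypothetical.

```python
import numpy as np


def variant_allele_frequency(mutant_reads, total_reads):
    """VAF per SNA (rows) and sample (columns): mutant reads / total reads."""
    mutant = np.asarray(mutant_reads, dtype=float)
    total = np.asarray(total_reads, dtype=float)
    # Positions with zero coverage are reported as NaN rather than dividing by zero.
    return np.divide(mutant, total,
                     out=np.full(mutant.shape, np.nan),
                     where=total > 0)


def coverage_ratio(tumor_allele_depth, normal_allele_depth):
    """Allele-specific tumor/normal depth ratio (assumes comparable sequencing depth)."""
    tumor = np.asarray(tumor_allele_depth, dtype=float)
    normal = np.asarray(normal_allele_depth, dtype=float)
    return np.divide(tumor, normal,
                     out=np.full(tumor.shape, np.nan),
                     where=normal > 0)


# Toy data (made up): 2 SNAs x 2 samples.
mutant = [[12, 30], [0, 45]]
total = [[100, 120], [80, 0]]
vaf = variant_allele_frequency(mutant, total)

# Toy data (made up): major and minor allele depth in one CNA region.
ratios = coverage_ratio([90, 30], [30, 30])
```

In this toy case the second SNA has no coverage in the second sample, so its VAF is left undefined rather than set to zero, which keeps missing data distinguishable from a true absence of the mutation.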

Author: Yuchao Jiang, Nancy R. Zhang
Date of publication: 2017-04-08 20:30:16 UTC
Maintainer: Yuchao Jiang

addsamptree: To determine whether the sampled tree will be accepted

AML43: SNA input for primary tumor and relapse genome of leukemia...

canopy.BIC: To get BIC as a model selection criterion

canopy.cluster: EM algorithm for multivariate clustering of SNAs

canopy.cluster.Estep: E-step of EM algorithm for multivariate clustering of SNAs

canopy.cluster.Mstep: M-step of EM algorithm for multivariate clustering of SNAs

canopy.output: To generate a posterior tree

canopy.plottree: To plot tree inferred by Canopy

canopy.post: Posterior evaluation of MCMC sampled trees

canopy.sample: MCMC sampling in tree space

canopy.sample.cluster: MCMC sampling in tree space with pre-clustering of SNAs

canopy.sample.cluster.nocna: MCMC sampling in tree space with pre-clustering of SNAs (SNA input only, no CNAs)

canopy.sample.nocna: MCMC sampling in tree space (SNA input only, no CNAs)

getclonalcomposition: To get clonal composition

getCMCm: To get major and minor copy per clone

getCZ: To get CNA genotyping matrix CZ

getlikelihood: To get likelihood of the tree

getlikelihood.sna: To get SNA likelihood of the tree

getQ: To get SNA-CNA genotyping matrix

getVAF: To get variant allele frequency (VAF)

getZ: To get SNA genotyping matrix Z

initialcna: To initialize positions of CNAs

initialcnacopy: To initialize major and minor copies of CNAs

initialP: To initialize clonal frequency matrix

initialsna: To initialize positions of SNAs

MDA231: Dataset for project MDA231

MDA231_sampchain: List of pre-sampled trees

MDA231_tree: Most likely tree from project MDA231

sampcna: To sample CNA positions

sampcnacopy: To sample major and minor copies of CNAs

sampP: To sample clonal frequency

sampsna: To sample SNA positions

sampsna.cluster: To sample positions of SNA clusters

sortcna: To sort identified overlapping CNAs

toy: Toy dataset for Canopy

toy2: Toy dataset 2 for Canopy

toy3: Toy dataset 3 for Canopy


