Multiple primary tumors (MPT) are a special and rare cancer type, defined as two or more primary tumors present at diagnosis in a single patient. The molecular characteristics and tumorigenesis of MPT remain unclear because suitable analysis approaches are lacking.
Here, we present `MPTevol`, a practical computational framework for comprehensively exploring MPT from multiregional sequencing (MRS) experiments. `MPTevol` facilitates comparison of genomic profiles across multiple primary tumor samples, detection of clonal evolutionary history and metastatic routes in MPT, and quantification of metastatic history. The package incorporates multiple cancer evolution analyses to provide a one-stop solution for MPT analysis.
Workflow figure legend: blue circles are the input data; green circles are the functions of `MPTevol`; purple circles are the functions inherited from `MesKit`.
You can install the development version of `MPTevol` from GitHub with:
```r
# install.packages("remotes")
remotes::install_github("qingjian1991/MPTevol")
```
If you are using `MPTevol` in academic research, please cite the following paper:
Chen, Q., Wu, Q.-N., Rong, Y.-M., Wang, S., Zuo, Z., Bai, L., . . . Zhao, Q. (2022). Deciphering clonal dynamics and metastatic routines in a rare patient of synchronous triple-primary tumors and multiple metastases with MPTevol. Briefings in Bioinformatics. doi:10.1093/bib/bbac175
`MPTevol` takes SNV and CNV information as input. The format is compatible with `MesKit`.
To analyze with `MPTevol`, you need to provide:
- For mutation data: a MAF file (`*.maf` / `*.maf.gz`). Required.
- For CNA data: a segmentation file (see the segmentation file format below).

Note: `Tumor_Sample_Barcode` should be consistent in all input files.
Mutation Annotation Format (MAF) files are tab-delimited text files with aggregated mutation information from VCF files. The input MAF file can be gzip-compressed, and the allowed values of the `Variant_Classification` column can be found on the Mutation Annotation Format page.
The following fields are required in the MAF file:
`Hugo_Symbol`, `Chromosome`, `Start_Position`, `End_Position`, `Variant_Classification`, `Variant_Type`, `Reference_Allele`, `Tumor_Seq_Allele2`, `Ref_allele_depth`, `Alt_allele_depth`, `VAF`, `Tumor_Sample_Barcode`
Note:
- The `Tumor_Sample_Barcode` of each sample should be unique.
- `VAF` (variant allele frequency) ranges from 0-1 or 0-100.

Example MAF file
```r
MAFtable <- read.table(system.file("extdata", "CRC_HZ.maf", package = "MesKit"), header = TRUE)
# Show three rows taken from different parts of the table
extractLines <- rbind(MAFtable[1, ], MAFtable[6600, ])
extractLines <- rbind(extractLines, MAFtable[15000, ])
data.frame(extractLines, row.names = NULL)
```
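Before calling `readMaf`, it can help to confirm that a MAF table carries every required field. The following is a minimal sketch in base R; `check_maf` and `toy_maf` are hypothetical names introduced here for illustration and are not part of `MPTevol` or `MesKit`.

```r
# Hypothetical helper (not part of MPTevol or MesKit): report required MAF
# columns that are missing and check that VAF lies within 0-100.
required_maf_cols <- c(
  "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position",
  "Variant_Classification", "Variant_Type", "Reference_Allele",
  "Tumor_Seq_Allele2", "Ref_allele_depth", "Alt_allele_depth",
  "VAF", "Tumor_Sample_Barcode"
)

check_maf <- function(maf_df) {
  missing_cols <- setdiff(required_maf_cols, colnames(maf_df))
  vaf_ok <- "VAF" %in% colnames(maf_df) &&
    all(maf_df$VAF >= 0 & maf_df$VAF <= 100, na.rm = TRUE)
  list(missing = missing_cols, vaf_in_range = vaf_ok)
}

# Toy one-row table with the two depth columns deliberately left out
toy_maf <- data.frame(
  Hugo_Symbol = "APC", Chromosome = "5",
  Start_Position = 112170777, End_Position = 112170780,
  Variant_Classification = "Frame_Shift_Del", Variant_Type = "DEL",
  Reference_Allele = "CAGA", Tumor_Seq_Allele2 = "-",
  VAF = 0.31, Tumor_Sample_Barcode = "READ_1"
)
maf_check <- check_maf(toy_maf)
maf_check$missing
```

An empty `missing` vector means all required fields are present; here the two depth columns are reported.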
The clinical data file contains clinical information about each patient and their tumor samples. The mandatory fields are `Tumor_Sample_Barcode`, `Tumor_ID`, `Patient_ID`, and `Tumor_Sample_Label`.
Example clinical data file
```r
ClinInfo <- read.table(system.file("extdata", "CRC_HZ.clin.txt", package = "MesKit"), header = TRUE)
ClinInfo[1:5, ]
```
By default, there are six mandatory fields in the input CCF file: `Patient_ID`, `Tumor_Sample_Barcode`, `Chromosome`, `Start_Position`, `CCF` and `CCF_Std`/`CCF_CI_High` (required when identifying clonal/subclonal mutations). The `Chromosome` field of your MAF file and CCF file should be in the same format (both plain numbers or both starting with "chr"). Notably, `Reference_Allele` and `Tumor_Seq_Allele2` are also required if you want to include INDELs in the CCF file.
Optionally, the `cluster` field can be included in this file. `cluster` denotes the mutation cluster IDs.
Example CCF file
```r
ccfInfo <- read.table(system.file("extdata", "CRC_HZ.ccf.tsv", package = "MesKit"), header = TRUE)
ccfInfo[1:5, ]
```
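The requirement that the `Chromosome` fields of the MAF and CCF files share one naming style can be checked up front. This is a small sketch in base R; `same_chr_style` is a hypothetical helper written for this example, not a function of either package.

```r
# Hypothetical helper: TRUE when both chromosome columns use the same naming
# style (all values with a "chr" prefix, or all values without it).
same_chr_style <- function(maf_chr, ccf_chr) {
  maf_prefixed <- grepl("^chr", as.character(maf_chr))
  ccf_prefixed <- grepl("^chr", as.character(ccf_chr))
  (all(maf_prefixed) && all(ccf_prefixed)) ||
    (!any(maf_prefixed) && !any(ccf_prefixed))
}

# Both files without prefix: consistent
same_chr_style(c("1", "5", "X"), c("1", "17"))
# MAF with prefix, CCF without: inconsistent
same_chr_style(c("chr1", "chr5"), c("1", "17"))
```

In practice the two arguments would be the `Chromosome` columns of the tables read from the MAF and CCF files.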
The segmentation file is a tab-delimited file with the following columns:
- `Patient_ID` - patient ID
- `Tumor_Sample_Barcode` - tumor sample barcode
- `Chromosome` - chromosome name or ID
- `Start_Position` - genomic start position of segments (1-indexed)
- `End_Position` - genomic end position of segments (1-indexed)
- `SegmentMean/CopyNumber` - segment mean value or absolute integer copy number
- `Minor_CN` - copy number of minor allele
- `Major_CN` - copy number of major allele
- `Tumor_Sample_Label` - the specific label of each tumor sample

Note: Positions are in base pair units. The `Minor_CN` and `Major_CN` fields are optional for `MesKit` but required for `MPTevol`.
Example Segmentation file
```r
segInfo <- read.table(system.file("extdata", "CRC_HZ.seg.txt", package = "MesKit"), header = TRUE)
segInfo[1:5, ]
```
The `readMaf` function creates Maf/MafList objects by reading MAF files, clinical files and cancer cell fraction (CCF) data (optional but recommended). The `refBuild` parameter sets the version of the Homo sapiens reference genome (`"hg18"`, `"hg19"` or `"hg38"`). Set `use.indel.ccf = TRUE` when your `ccfFile` contains INDELs in addition to SNVs.
```r
library(tidyverse)
library(MesKit)
library(MPTevol)
```
The example data come from a rare patient with three primary tumors: a breast cancer (BRCA), a colorectal cancer (READ) and a lung cancer (LNET). Multi-region sequencing (MRS) was performed on each primary tumor. Three metastatic tumors, most of which derive from READ, were also sequenced: OvaryLM, OvaryRM and UterusM.
```r
# For split data, the tumors are divided according to their histological types.
data.type <- "split"
maf <- readMaf(
  mafFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.mutation.txt", data.type)),
  ccfFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.CCF.txt", data.type)),
  clinicalFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.clinical.txt", data.type)),
  refBuild = "hg19",
  ccf.conf.level = 0.95
)
```
To explore the genomic alterations during cancer progression with the multi-region sequencing approach, we provide the `classifyMut()` function to categorize mutations. The classification is based on the sharing pattern or the clonal status (CCF data is required) of mutations, which can be specified with the `class` option. Additionally, the `classByTumor` option can be used to reveal the mutational profile within tumors.
```r
# Driver genes of the relevant cancer types from the bundled IntOGen list
driverGene <- read.delim(
  system.file(package = "MPTevol", "extdata", "IntOGen-Drivers-Cancer_Genes.tsv"),
  header = TRUE
) %>%
  filter(CANCER_TYPE %in% c("BRCA", "COREAD", "LUAD", "LUSC")) %>%
  pull(SYMBOL) %>%
  unique()

mut.class <- classifyMut(maf, class = "SP", patient.id = "BRCA")
head(mut.class)
```
The MesKit `plotMutProfile` function can visualize the mutational profile of tumor samples.
```r
plotMutProfile(maf,
  class = "SP", geneList = driverGene,
  use.tumorSampleLabel = TRUE, removeEmptyCols = FALSE
)
```
From the driver mutational landscape, the three primary tumors in this MPT case, `BRCA`, `LNET` and `READ`, were characterized by distinct driver genes, supporting that they were derived from different tumor ancestors. Two metastatic tumors, `OvaryLM` and `OvaryRM`, were descendants of `READ`, whereas only 2 out of 7 samples in `UterusM` (`UterusM_1` and `UterusM_3`) were derived from `READ`.
The `plotCNA` function can characterize the CNA landscape across samples based on copy number data from segmentation algorithms.
```{R CNAs, fig.align='left', fig.height=6.5, fig.width=11, message=FALSE }
segCN = system.file("extdata", "meskit.sequenza.CNAs.txt", package = "MPTevol")
seg = readSegment(segFile = segCN)
plotCNA(seg, chrSilent = "X")
```
```r
# Define a new column: copy numbers 0-6 (values above 5 are capped at 6),
# plus "cnLOH" for copy-neutral LOH (total copy number 2, minor allele lost)
Copy_cutoff = function(seg) {
  seg %>%
    mutate(CopyNumber1 = ifelse(CopyNumber > 5, 6, CopyNumber)) %>%
    mutate(CopyNumber1 = ifelse(CopyNumber == 2 & Minor_CN == 0, "cnLOH", CopyNumber1))
}
seg1 = lapply(seg, Copy_cutoff)

# Plot the total copy number states
plotCNA(seg1,
  Type.name = "CopyNumber1",
  Type.colors = setNames(
    c("#7D8BCD", "#B1B9E7", "#F6F7F7", "#E4A8B5", "#CB7185", "#B03D5E", "#99143C", "#91BAA7"),
    nm = c(seq(0, 6), "cnLOH")
  ),
  showRownames = TRUE, rect.patients.size = 0, chrSilent = "X"
)

# Plot the minor allele copy numbers
plotCNA(seg1,
  Type.name = "Minor_CN",
  Type.colors = setNames(
    c("#7D8BCD", "#F6F7F7", "#E4A8B5", "#CB7185", "#B03D5E"),
    nm = 0:4
  ),
  showRownames = TRUE, rect.patients.size = 0, chrSilent = "X"
)
```
To quantify the genetic divergence of ITH between regions or tumors, we introduce two classical metrics derived from population genetics: Wright's fixation index (FST) and Nei's genetic distance.
The fixation index (FST) is a measure of population differentiation due to genetic structure. It is frequently estimated from genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites.
FST is the proportion of the total genetic variance contained in a subpopulation (the S subscript) relative to the total genetic variance (the T subscript). Values range from 0 to 1. A high FST implies a considerable degree of differentiation among populations. A particularly simple estimator applicable to DNA sequence data is
$F_{ST} = \frac{T-S}{T}$
where `T` is the total genetic variance and `S` is the genetic variance within the subpopulation. Here, we use the MesKit `calFst` function to calculate the FST between tumor regions.
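To make the simple estimator concrete, the sketch below evaluates it on made-up variance values. `fst_simple` is a hypothetical helper for illustration only; it is not how `calFst` computes its scores.

```r
# Simple estimator from the text: (T - S) / T
# total_var = total genetic variance (T), sub_var = within-subpopulation variance (S)
fst_simple <- function(total_var, sub_var) (total_var - sub_var) / total_var

# Toy values chosen for illustration only
fst_mid  <- fst_simple(total_var = 0.40, sub_var = 0.10)  # most variance lies between groups
fst_none <- fst_simple(total_var = 0.40, sub_var = 0.40)  # no differentiation
fst_full <- fst_simple(total_var = 0.40, sub_var = 0.00)  # complete differentiation
c(fst_mid, fst_none, fst_full)
```

The three cases give 0.75, 0 and 1, which brackets the 0 to 1 range described above.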
```{R message=FALSE, fig.align='left'}
library(ggpubr)
library(rstatix)

# One color per tumor
cols_samples = setNames(
  set.colors(6),
  nm = c("BRCA", "READ", "LNET", "OvaryLM", "OvaryRM", "UterusM")
)

scores.Fst = calFst(maf,
  plot = TRUE, use.tumorSampleLabel = TRUE,
  withinTumor = FALSE, number.cex = 10
)

# Collect the pairwise Fst values of each tumor into one data frame
Fst.data = list()
for (i in names(scores.Fst)) {
  Fst = scores.Fst[[i]]$Fst.pair
  Fst = Fst[lower.tri(Fst)]
  Fst.data[[i]] = data.frame(Fst = Fst, tumor = i)
}
Fst.data = purrr::reduce(Fst.data, rbind)

# Pairwise Wilcoxon tests between tumors.
# NOTE: the labels used in filter() must match the values of Fst.data$tumor.
# "Coad" and "Lung" do not appear in cols_samples above; adjust them to your labels.
stat.test = Fst.data %>%
  wilcox_test(Fst ~ tumor) %>%
  add_significance("p") %>%
  add_xy_position(fun = "mean") %>%
  dplyr::filter(group1 == "Coad" & group2 != "Lung")

p1 = Fst.data %>%
  ggplot(aes(x = tumor, y = Fst)) +
  theme_classic2() +
  geom_boxplot(aes(col = tumor, shape = tumor), width = 0.6) +
  geom_jitter(aes(col = tumor, shape = tumor), width = 0.1, size = 2) +
  labs(x = NULL, y = "Fst") +
  scale_color_manual(values = cols_samples) +
  stat_pvalue_manual(stat.test, label = "p.signif") +
  theme(
    legend.position = "none",
    axis.text = element_text(size = 15),
    axis.text.x = element_text(hjust = 1, angle = 45)
  )

p1
```

### 4.4 Search the clinically targetable sites

`getClinSites()` searches for clinically targetable sites. Briefly, we match the driver genes between the MAF file and the clinical sites in OncoKB. Only the gene names are matched; the cancer types and the specific gene alterations are ignored. The main targetable alterations include (1) gene fusions (like the BCR-ABL1 fusion), (2) oncogenic mutations, (3) exon deletions/insertions, (4) amplifications, (5) deletions and (6) single-nucleotide mutations (BRAF V600E). Because only gene names are matched, please check the mutation status manually. See [OncoKB](https://www.oncokb.org) for more information.

```r
# get the sites for all cancers
sites <- getClinSites(maf)

# get the sites for BRCA only
sites <- getClinSites(maf, Patient_ID = "BRCA")

# DT::datatable(sites)
sites[c(1, 2, 4:6), c(
  "Hugo_Symbol", "Variant_Classification", "Tumor_Sample_Barcode",
  "Clonal_Status", "Level", "Alterations", "Cancer.Types", "Drugs"
)]
```
We found that `BRCA` contains one targetable site in `PIK3CA`. The `PIK3CA` mutation in `BRCA` is clonal (shared by all sampling sites) and the corresponding drug is `Alpelisib` + `Fulvestrant` at `Level 1`.
Natural selection may drive the multiple primary tumors in MPT under different selective advantages. The Ka/Ks ratio (the non-synonymous substitution rate relative to the synonymous rate), also known as dN/dS, can be used to measure the pressure of natural selection: dN/dS > 1 indicates positive selection and dN/dS < 1 indicates negative selection.
The function `calKaKs` estimates the selective pressure in different sets of mutations, according to the mutation classes defined by `classifyMut`. Ka/Ks is calculated with dndscv.
```r
kaks <- calKaKs(maf, patient.id = "BRCA", class = "SP", parallel = TRUE, vaf.cutoff = 0.05)
```
Alternatively, selective pressure can be estimated simply by the ratio of driver mutations to total mutations. Like `calKaKs`, the function `calPropDriver` works on the sets of mutations defined by `classifyMut` and reports the proportion of driver mutations in each set.
```r
prop <- calPropDriver(maf, patient.id = "BRCA", class = "SP", driverGene = driverGene)

# Show the Ka/Ks and driver-proportion plots side by side
cowplot::plot_grid(
  kaks$BRCA$plot + ggpubr::theme_pubr(),
  prop$BRCA$plot + ggpubr::theme_pubr()
)
```
The mutations in `BRCA` were divided into `Public` (shared by all sampling sites), `Shared` (shared by two or more but not all sampling sites), and `Private` (found in a single sampling site). The `Public` mutations were under positive selection (`dN/dS >= 1` and a high proportion of driver mutations), consistent with the expectation that ancestral mutations contribute to cancer development.
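The Public/Shared/Private split described above can be illustrated on a toy presence/absence matrix. This is a sketch of the definitions only, under the assumption that every mutation is present in at least one sample; `classify_presence` is a hypothetical helper and does not reproduce the internals of `classifyMut`.

```r
# Toy binary matrix: rows are mutations, columns are sampling sites
presence <- matrix(
  c(1, 1, 1,
    1, 1, 0,
    0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mutA", "mutB", "mutC"), c("BRCA_1", "BRCA_2", "BRCA_3"))
)

# Public: present in all sites; Shared: in two or more but not all;
# Private: in exactly one site (rows are assumed to be non-empty)
classify_presence <- function(m) {
  n_present <- rowSums(m > 0)
  ifelse(n_present == ncol(m), "Public",
    ifelse(n_present >= 2, "Shared", "Private")
  )
}

mut_class_toy <- classify_presence(presence)
mut_class_toy
```

Here `mutA` is Public, `mutB` is Shared and `mutC` is Private.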
For MPT analysis, the core purpose is to dissect the evolutionary relationships between multiple primary and metastatic tumors. The phylogenetic sample tree delineates the relationships between samples using the genetic summary of each sample, including somatic mutations and allele-specific CNAs. In `MPTevol`, the mutation-based and CNA-based trees are built by the functions `plotMutTree` and `plotCNAtree`, respectively.
With `MPTevol`, construction of the phylogenetic mutation tree is based on the binary presence/absence matrix of mutations across all tumor regions.
Based on the Maf object, the `phyloTree()` function reconstructs the phylogenetic tree with one of several methods, including "NJ" (Neighbor-Joining), "MP" (maximum parsimony), "ML" (maximum likelihood), "FASTME.ols" and "FASTME.bal", selected with the `method` parameter.
With the `group` and `group.colors` parameters, the samples in the tree can be colored by their histological types.
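To show what a binary presence/absence matrix looks like and how it yields distances between samples, here is a minimal base R sketch on made-up data. It uses average-linkage hierarchical clustering purely as a stand-in; it is not the NJ/MP/ML reconstruction that `phyloTree()` or `plotMutTree` performs.

```r
# Toy presence/absence matrix: rows are samples, columns are mutations
mut_mat <- rbind(
  READ_1    = c(1, 1, 1, 0, 0),
  READ_2    = c(1, 1, 1, 1, 0),
  OvaryLM_1 = c(1, 1, 0, 0, 1),
  OvaryLM_2 = c(1, 1, 0, 1, 1)
)

# Distance between samples: among mutations present in at least one of the
# two samples, the fraction present in only one (base R "binary" method)
d <- dist(mut_mat, method = "binary")
round(as.matrix(d), 2)

# Average-linkage clustering, used here only as a simple stand-in for a
# phylogenetic method
hc <- hclust(d, method = "average")
hc$labels[hc$order]
```

Samples that share more mutations end up closer together, which is the signal a mutation-based sample tree is built on.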
```r
# For split1, the READ tumor and its metastases are treated as one patient.
data.type <- "split1"
maf1 <- readMaf(
  mafFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.mutation.txt", data.type)),
  ccfFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.CCF.txt", data.type)),
  clinicalFile = system.file(package = "MPTevol", "extdata", sprintf("meskit.%s.clinical.txt", data.type)),
  refBuild = "hg19",
  ccf.conf.level = 0.95
)
```
```r
# Set the group information
group <- list(
  Coad = paste0("READ_", 1:5),
  OveryLM = paste0("OvaryLM_", 1:5),
  OveryRM = paste0("OvaryRM_", 1:6),
  UterusM = paste0("UterusM_", c(1, 3))
)

# Set the group colors
group.colors <- setNames(c("#7570B3", "#E6AB02", "#003C30", "#666666"), nm = names(group))

mutTrees <- plotMutTree(maf1,
  patient.id = "Met1", group = group,
  group.colors = group.colors, title = "mutation-based Tree"
)
mutTrees$plot
```
`MPTevol` uses `MEDICC` to infer CNA-based sample trees. `MEDICC` calculates the minimal number of copy number events from the ancestor (major = 1, minor = 1) to the mutant samples. The estimates are made per chromosome, and the final genetic distance is the sum of the distances over all chromosomes.
First, we obtain the genomic regions shared across samples with `splitSegment`, which generates the input format required by MEDICC. In the current development version, the input is based on the allele-specific CNAs called by `Sequenza`.
```r
# Running MEDICC
folder <- "/data1/qingjian/Rproject/Three/medicc/Seg.new1"
segfiles <- list.files(folder, pattern = ".txt", full.names = TRUE)
sampleid <- list.files(folder, pattern = ".txt") %>%
  stringr::str_remove("_segments.txt")

# Running for Breast
library(IRanges)
seg <- splitSegment(
  segfiles = segfiles[1:5],
  sampleid = sampleid[1:5],
  project.names = "Breast",
  out.dir = "medicc/Breast",
  N.baf = 30,
  cnv_min_length = 1e+05,
  max_CNt = 15,
  minLength = 1e+05,
  maxCNV = 4
)
```
We can check the shared genomic regions with heatmaps that show the CNA changes (major or minor) across all samples.
```r
library(ComplexHeatmap) # provides Heatmap()
library(grid)           # provides gpar()

plotMediccSeg <- function(file) {
  major <- read.table(file, header = TRUE)
  # Use "chromosome_start_end" as row names and keep only the sample columns
  major <- major %>%
    mutate(seq = str_c(seqnames, start, end, sep = "_")) %>%
    column_to_rownames(var = "seq") %>%
    mutate(seqnames = NULL, start = NULL, end = NULL, width = NULL)
  Heatmap(major, row_names_gp = gpar(fontsize = 6))
}

# See Major and Minor
plotMediccSeg("medicc/Breast/Breast.major.txt")
plotMediccSeg("medicc/Breast/Breast.minor.txt")
```
Then we run `MEDICC` in the shell. A simple run command is as follows:
```{sh eval=FALSE}
medicc=/soft/medicc/medicc.py
python=/anaconda3/envs/pyclone/bin/python
$python $medicc medicc/Breast/Breast.descr.txt medicc/Breast.out -v > medicc/Breast.runinfo.txt
```
After running `MEDICC`, we use the distance file (`tree_final.dist`, generated by `MEDICC`) to plot the CNA-based trees. `plotCNAtree()` first estimates the bootstrap values of the CNA-based trees by replacing tree distances and then visualizes the trees.
```r
# Read the sample distances.
# This dist file is the output of MEDICC
dist <- system.file(package = "MPTevol", "extdata", "tree_final.dist")

# Set the group information
group <- list(
  Coad = paste0("READ_", 1:5),
  OveryLM = paste0("OvaryLM_", 1:5),
  OveryRM = paste0("OvaryRM_", 1:6),
  UterusM = paste0("UterusM_", c(1, 3))
)

# Set the group colors
group.colors <- setNames(c("#7570B3", "#E6AB02", "#003C30", "#666666"), nm = names(group))

# Build the trees
cnaTree <- plotCNAtree(
  dist = dist,
  group = group,
  group.colors = group.colors,
  title = "CNA-based Trees"
)
cnaTree$plot
```
We compare the mutation-based and CNA-based trees to see where their phylogenetic relationships differ.
```r
library(patchwork)
(mutTrees$plot + theme(legend.position = "none")) + (cnaTree$plot + theme(legend.position = "none"))
```
`READ_5`, `UterusM_1` and `UterusM_3` exhibit different relationships in the two trees.
To do: check how the mutation cutoffs influence the mutation-based trees.
Sample trees, which reflect overall genetic similarity, often suffer from tumor admixture in bulk samples. In contrast, clonal trees represent the clonal history among tumor cell lineages and can reflect the clonal dynamics from primary to metastatic tumors.
The first step is to infer the clonal structures. Both `sciClone` [4] and `PyClone` [5] can infer clonal structures.
Two prominent approaches in clonal evolution studies are:
- Using only diploid heterozygous variants (variants in regions without copy number alteration), hence excluding copy-altered variants. When only diploid heterozygous variants are used, VAFs can be estimated as the ratio of the variant read count to the total read count, and clustering can be performed with tools such as `sciClone`.
- Including copy-altered variants. When copy-altered variants are included, clustering should be performed with copy-number-aware tools such as `PyClone`, and copy-number-corrected VAFs can be obtained by dividing the CCFs estimated by such tools by two.
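The two ways of obtaining VAFs described above reduce to simple arithmetic. The sketch below uses made-up numbers and two hypothetical helper functions; VAF and CCF are expressed in percent here.

```r
# Approach 1: VAF (in %) from read counts at a diploid heterozygous site
vaf_from_counts <- function(var_reads, total_reads) 100 * var_reads / total_reads

# Approach 2: copy-number-corrected VAF (in %) from a CCF given in percent
vaf_from_ccf <- function(ccf_percent) ccf_percent / 2

vaf_a <- vaf_from_counts(var_reads = 30, total_reads = 120)  # 25
vaf_b <- vaf_from_ccf(ccf_percent = 80)                      # 40
c(vaf_a, vaf_b)
```

A fully clonal heterozygous mutation (CCF of 100%) therefore corresponds to a corrected VAF of 50%.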
`MPTevol` expects variants in the format of the bundled `variants` data set.
```r
# load data
data("variants", package = "MPTevol")
data("variants.ref", package = "MPTevol")
head(variants, 3)
```
```r
library(clonevol)

# Sample columns that hold the VAFs
vaf.col.names <- c(
  paste0("READ_", 1:5),
  paste0("OvaryLM_", 1:5),
  paste0("OvaryRM_", 1:6),
  paste0("UterusM_", c(1, 3))
)

# Group each sample by the tumor name before the underscore
sample.groups <- mapply(function(x) x[1], strsplit(vaf.col.names, "_"))
names(sample.groups) <- vaf.col.names

cluster.col.name <- "cluster"
clones.number <- 10
clone.colors <- set.colors(10)

# Check data.
pp <- plotVafCluster(
  variants = variants,
  cluster.col.name = "cluster",
  vaf.col.names = vaf.col.names[c(1, 6, 10, 11)],
  # highlight = "is.driver",
  # highlight.note.col.name = "gene_site",
  box = TRUE,
  violin = FALSE
)
pp

# Check cluster changes.
plot.cluster.flow(variants,
  cluster.col.name = cluster.col.name,
  vaf.col.names = vaf.col.names,
  sample.names = vaf.col.names,
  colors = set.colors(clones.number),
  y.title = "Variant Allele Frequency %"
) +
  theme(axis.text.x = element_text(angle = 90))
```
Inferring the clonal trees is the central step in clonal reconstruction, but users often find it difficult to build clonal trees. We should therefore check the cluster structure before building them. Here are some suggestions:
- Choose the optimal clustering method. Before clustering, remove low-quality mutations. It is recommended to delete mutations in LOH regions and INDELs. Mutations in CNV regions should be checked carefully.
- Choose the right founding cluster. When the mean VAF of the founding cluster is smaller than that of another cluster, tree construction fails.
- Ignore clusters that are likely artifacts. (1) Clusters with low cellular prevalence are probably clustering errors, especially clusters that have a low VAF in all samples. (2) Small clusters: remove clusters that contain too few mutations.
- Try different cutoffs. The two parameters `sum.p` and `alpha` determine whether a cluster is considered present in a given sample. Relaxed cutoffs (small values of the two parameters) allow more clusters to be considered present in a sample.
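Some of the checks suggested above can be done with a few lines of base R before any tree is built. The sketch below runs on a made-up table shaped like `variants` (a `cluster` column plus one VAF column per sample); the thresholds `min_size` and `min_vaf` are hypothetical and should be tuned to the data.

```r
# Toy table: one row per mutation, a cluster ID and VAF (%) per sample
toy_variants <- data.frame(
  cluster = c(1, 1, 1, 2, 2, 3),
  READ_1  = c(48, 50, 46, 30, 28, 2),
  Ovary_1 = c(47, 49, 51, 10, 12, 1)
)
sample_cols <- c("READ_1", "Ovary_1")

# Mean VAF of each cluster in each sample, and mutation count per cluster
cluster_means <- aggregate(toy_variants[sample_cols],
  by = list(cluster = toy_variants$cluster), FUN = mean
)
cluster_sizes <- table(toy_variants$cluster)

# The candidate founding cluster should have the highest mean VAF in every sample
founding <- 1
founding_ok <- all(sapply(sample_cols, function(s) {
  cluster_means[[s]][cluster_means$cluster == founding] >= max(cluster_means[[s]])
}))

# Flag clusters that are very small or have a low VAF in all samples
min_size <- 2 # hypothetical threshold (number of mutations)
min_vaf <- 5  # hypothetical threshold (VAF in %)
small <- names(cluster_sizes)[cluster_sizes < min_size]
low_vaf <- cluster_means$cluster[apply(cluster_means[sample_cols] < min_vaf, 1, all)]

list(founding_ok = founding_ok, small = small, low_vaf = low_vaf)
```

In this toy table cluster 1 qualifies as the founding cluster, and cluster 3 is flagged by both filters, making it a candidate for `ignore.clusters`.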
```r
sel <- 1:18
y <- inferClonalTrees(
  project.names = "Met",
  variants = variants.ref,
  ccf.col.names = vaf.col.names[sel],
  sample.groups = sample.groups[sel],
  cancer.initiation.model = "monoclonal",
  founding.cluster = 1,
  ignore.clusters = 4,
  cluster.col.name = "cluster",
  subclonal.test.model = "non-parametric",
  sum.p = 0.01,
  alpha = 0.05,
  weighted = FALSE,
  plot.pairwise.CCF = FALSE,
  highlight.note.col.name = NULL,
  highlight = "is.driver",
  highlight.CCF = TRUE
)

# pdf(file = sprintf("%s/%s.trees.pdf", output, output), width = 6, height = 6)
plot.all.trees.clone.as.branch(y,
  branch.width = 0.5,
  node.size = 2, node.label.size = 0.5,
  tree.rotation = 180,
  angle = 20
)
# dev.off()
```
For cancer evolution, most models assume that all cancer samples in a case originate from a monoclonal ancestor (a single primary tumor). However, metastatic tumors might arise from multiple primary tumors, so a polyclonal model (multiple founding clones) should also be considered.
The `merge.samples` function can merge MRS samples into a meta sample, which can be used to represent the clonal admixture of the whole tumor.
Note:
(1) This function requires the numbers of ref and var alleles for each sample. In the examples, we use the `variants.ref` data set.
(2) The `CCF` ranges between 0 and 100.
```r
# merge the READ samples
sel <- 1:5
y.merge <- merge.samples(y,
  samples = vaf.col.names[sel],
  new.sample = "READ", new.sample.group = "READ",
  ref.cols = str_c(vaf.col.names[sel], ".ref"),
  var.cols = str_c(vaf.col.names[sel], ".var")
)

# merge the OvaryLM samples
sel <- 6:10
y.merge <- merge.samples(y.merge,
  samples = vaf.col.names[sel],
  new.sample = "OvaryLM", new.sample.group = "OvaryLM",
  ref.cols = str_c(vaf.col.names[sel], ".ref"),
  var.cols = str_c(vaf.col.names[sel], ".var")
)

# merge the OvaryRM samples
sel <- c(11:13, 14, 15, 16)
y.merge <- merge.samples(y.merge,
  samples = vaf.col.names[sel],
  new.sample = "OvaryRM", new.sample.group = "OvaryRM",
  ref.cols = str_c(vaf.col.names[sel], ".ref"),
  var.cols = str_c(vaf.col.names[sel], ".var")
)

# merge the UterusM samples
sel <- c(17, 18)
y.merge <- merge.samples(y.merge,
  samples = vaf.col.names[sel],
  new.sample = "UterusM", new.sample.group = "UterusM",
  ref.cols = str_c(vaf.col.names[sel], ".ref"),
  var.cols = str_c(vaf.col.names[sel], ".var")
)
```
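One way to see why the ref and var counts are needed is to pool the reads of several regions into a single meta sample. The sketch below is an assumption about how such pooling could work, shown on made-up counts; it does not claim to reproduce the internals of `merge.samples`.

```r
# Toy read counts for one mutation in three regions of the same tumor
ref_reads <- c(READ_1 = 80, READ_2 = 60, READ_3 = 100)
var_reads <- c(READ_1 = 20, READ_2 = 40, READ_3 = 0)

# Per-region VAF (%)
region_vaf <- 100 * var_reads / (ref_reads + var_reads)

# VAF (%) of the pooled meta sample: all variant reads over all reads
pooled_vaf <- 100 * sum(var_reads) / sum(ref_reads + var_reads)

region_vaf
pooled_vaf
```

Pooling reads weights each region by its sequencing depth, which a plain average of per-region VAFs would not do when depths differ.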
`TimeScape` is an automated tool for navigating temporal clonal evolution data. The key attributes of this implementation involve the enumeration of clones, their evolutionary relationships and their shifting dynamics over time. `TimeScape` requires two inputs: (i) the clonal phylogeny and (ii) the clonal prevalences. The output is the TimeScape plot showing clonal prevalence vertically, time horizontally, and the plot height optionally encoding tumour volume during tumour-shrinking events. At each sampling time point (denoted by a faint white line), the height of each clone accurately reflects its proportionate prevalence. These prevalences form the anchors for bezier curves that visually represent the dynamic transitions between time points.
The `tree2timescape` function transforms the tree information into the format required by `TimeScape`. Users can then view the clonal dynamics along tumor locations or time.
```r
library(timescape)

samples <- names(y.merge$models)
times <- tree2timescape(results = y.merge, samples = names(y.merge$models))

# run timescape on the first model
i <- 1
timescape(
  clonal_prev = times$clonal_prev[[i]],
  tree_edges = times$tree_edges[[i]],
  clone_colours = times$clone_colours[[i]],
  genotype_position = "centre",
  xaxis_title = NULL
)

# save svg file plot
# rsvg::rsvg_pdf( svg = sprintf("%s/%s.model%s.svg", project.names, project.names, i),
#                 file = sprintf("%s/%s.model%s.pdf", project.names, project.names, i)
# )
```
Metastatic routes can be estimated by comparing the CCFs of mutations from the primary and metastatic tumors. The H index is calculated as
$H=\frac{L_m}{L_p+1}$
where $L_m$ and $L_p$ are the numbers of metastatic-private clonal and primary-private clonal mutations, respectively [7]. The H index is positively correlated with the time of dissemination, so a larger H index (H >= 20) is associated with later dissemination [7].
The Jaccard similarity index (JSI) between the primary and metastatic tumors is calculated as
$JSI=\frac{W_s}{L_p+L_m+W_s}$
where $W_s$ is the number of shared subclonal mutations [8]. The JSI distinguishes monoclonal (JSI <= 0.3) from polyclonal (JSI > 0.3) metastatic seeding [8].
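Both indices are direct ratios of mutation counts, as the following sketch on made-up counts shows. `h_index` and `jsi` are hypothetical helper names for illustration; `calRoutines` derives the counts from the Maf object itself.

```r
# H index: metastatic-private clonal mutations over (primary-private clonal + 1)
h_index <- function(L_m, L_p) L_m / (L_p + 1)

# JSI: shared subclonal mutations over the sum of shared subclonal,
# primary-private clonal and metastatic-private clonal mutations
jsi <- function(W_s, L_p, L_m) W_s / (L_p + L_m + W_s)

# Toy counts, for illustration only
L_p <- 4
L_m <- 100
W_s <- 10

h_toy <- h_index(L_m, L_p)              # 100 / 5 = 20
jsi_toy <- round(jsi(W_s, L_p, L_m), 3) # 10 / 114, rounded to 0.088
c(H = h_toy, JSI = jsi_toy)
```

With these toy counts, H reaches the threshold for later dissemination and the JSI falls in the monoclonal seeding range.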
```r
## add driver information
maf_driver <- data.frame(
  Mut_ID = c("5:112170777:CAGA:-", "1:147092680:-:C"),
  is.driver = c(TRUE, TRUE)
)

cal <- calRoutines(
  maf = maf1,
  patient.id = "Met1",
  PrimaryId = "READ",
  pairByTumor = TRUE,
  use.tumorSampleLabel = TRUE,
  subtitle = "both",
  maf_drivers = maf_driver
)

# wrap_plots() comes from patchwork, which is loaded earlier in this document
wrap_plots(plotlist = cal$Met1$plist, nrow = 1)
```
1. Liu M, Chen J, Wang X et al. MesKit: a tool kit for dissecting cancer evolution of multi-region tumor biopsies through somatic alterations, Gigascience 2021;10.
2. Favero F, Joshi T, Marquard AM et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data, Ann Oncol 2015;26:64-70.
3. Schwarz RF, Trinh A, Sipos B et al. Phylogenetic quantification of intra-tumour heterogeneity, PLoS Comput Biol 2014;10:e1003535.
4. Miller CA, White BS, Dees ND et al. SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution, PLoS Comput Biol 2014;10:e1003665.
5. Roth A, Khattra J, Yap D et al. PyClone: statistical inference of clonal population structure in cancer, Nat Methods 2014;11:396-398.
6. Dang HX, White BS, Foltz SM et al. ClonEvol: clonal ordering and visualization in cancer sequencing, Ann Oncol 2017;28:3076-3082.
7. Hu Z, Ding J, Ma Z et al. Quantitative evidence for early metastatic seeding in colorectal cancer, Nat Genet 2019;51:1113-1122.
8. Hu Z, Li Z, Ma Z et al. Multi-cancer analysis of clonality and the timing of systemic spread in paired primary tumors and metastases, Nat Genet 2020;52:701-708.