MutationTimeR

MutationTimeR is an R package to time somatic mutations relative to clonal and subclonal copy number states and calculate the relative timing of copy number gains. Time is measured as a fraction of point mutations; this is termed mutation time. Mutation time is proportional to real time if the number of mutations acquired per bp and per year is constant.

MutationTimeR has been used by the PCAWG consortium to calculate mutation times of 2,778 whole genome sequencing samples. Please see M. Gerstung, C. Jolly, I. Leshchiner, S. Dentro, S. Gonzalez et al., The Evolutionary History of 2,658 Cancers, Nature. 578, pages 122-128(2020).

Installation

MutationTimeR runs in most current R versions. You can install and load MutationTimeR using ```{R, eval=FALSE} devtools::install_github("mg14/mg14") devtools::install_github("gerstung-lab/MutationTimeR")

## Input data
MutationTimeR requires a `vcf` object containing point mutations (SNVs, MNVs, or indels) and a `GRanges` object with the copy number segments of the sample.

```{R}
library("MutationTimeR")
#vcf <- readVcf("myvcf.vcf") # Point mutations, needs `geno` entries `AD` and `DP` or `info` columns t_alt_count t_ref_count.
#bb <- GRanges(, major_cn= , minor_cn=, clonal_frequency=purity) # Copy number segments, needs columns  major_cn, minor_cn and clonal_frequency of each segment
#clusters <- data.frame(cluster= , n_ssms=, proportion=) # Optional data.frame with subclonal cluster locations (VAF proportion) and size (number of variants n_ssms)

Here we cut this short by

data(MutationTimeR)

The vcf object containing all point mutations is:

vcf

The GRanges with allele-specific copy number data

bb

Lastly, the prevalence and number of subclonal is

clusters

Running MutationTimeR

To run MutationTimeR simply use

mt <- mutationTime(vcf, bb, clusters=clusters, n.boot=10)

Annotation of point mutations

MutationTimer produces two main outputs. The first is the probability for each individual point mutation from the original vcf object to belong to different copy number levels:

head(mt$V)

These probabilities are the basis for the following simple clonal states (early clonal/late clonal/clonal/subclonal/NA)

table(mt$V$CLS)

To add this annotation to the vcf use

vcf <- addMutTime(vcf, mt$V)

Timings of copy number gains

The proportion of point mutations in different copy number states, in turn, defines the molecular timing of gains. These are stored in the following DataFrame:

head(mt$T)

The relevant columns are time with 95% confidence intervals time.lo and time.hi as well as the counterparts time2/time2.lo/time2.hi for the second gain in cases where one allele has more than one gained copy. The field time.star indicates a tiers: *** indicates gains +1. ** gains +2, which are found to be slightly less reliable and need certain assumptions about their temporal sequence. * are subclonal gains which are hit an miss.

The DataFrame can be added to the copy number GRanges object for convenience.

mcols(bb) <- cbind(mcols(bb),mt$T)

Plot output

Timing annotated VCF and copy number can be plotted using the following command:

{R, fig.width=6, fig.height=6, fig.width=6} plotSample(vcf,bb)

This shows the observed and expected variant allele frequencies of point mutations on the top. This is very useful to spot inconsistencies with purity and copy number configuration. As a rule of thumb the states (horizontal bars) should run right through the middle of the clouds of point mutations. Colours indicate the timing category: Blue = clonal [other], purple = clonal [late], green = clonal [early], red = subclonal.

The middle plot shows the copy number as stacked barplots. Subclonal CN is indicated by fractional bars. Dark grey is major, light grey minor allele.

The bottom plot shows the estimated mutation time of primary and secondary gains (shaded). Boxes denote 95% CIs. The histogram at the right shows the distribution of timing events. Blue = mono-allelic gains (N:1), pink = CN-LOH/gain+loss (N:0) and green = bi-allelic gains (N:2).



gerstung-lab/MutationTimeR documentation built on Dec. 12, 2020, 9:29 p.m.