knitr::opts_chunk$set( collapse=TRUE, comment="#>" )
library(tumorHeatmap)
The R package tumorHeatmap has been developed for the Scientific Programming project. The aim of the package is to decompose individual tumors by mutational signatures and display this decomposition as a heatmap. Furthermore, clustering is performed on the tumor genomes using their exposure vectors as features. All this is displayed in a single plot presenting the heatmap of the exposure vectors (each signature for each genome) and the deondgram resulting from the clustering. The source code can be found on Github
To install the package devtools is needed. It can be installed in the following way
install.packages("devtools")
Then the package tumorHeatmap can be installed from Github or from the source package.
To install the package from Github the following code could be used.
devtools::install_github("oirot/tumorHeatmap")
If you have already downloaded the source package you could install it by
devtools::install(path_to_package)
The package needs to type of data:
The somatic mutation data needs to be in an Mutation Position Format (MPF). The MPF is as a tab separated format in which the information on the columns are:
An example of such file is
| pts | chr | pos | ref | alt | | :---: | :---: | :---: | :---: | :---: | | p1 |chr2 | 4548 | C | G | | p2 | chr3 | 564 | A | T | | p2 | chr3 | 8887 | C | T |
For more information about this data format look here
Mutational signatures are a charachteristic combination of mutations that arise from a specific mutational process (eg. DNA duplication or DNA repair defects) More information about what a signature is can be found here
The signatures can be calculated using different models for the mutation and stored in different formats. The two used are Shiraishi-type signatures and Alexandrov-type signatures.
This type of mutational signature is described by Ludmil B. Alexandrov et al.
In the first instance, we extracted mutational signatures using base substitutions and additionally included information on the sequence context of each mutation. Since there are six classes of base substitution C>A, C>G, C>T, T>A, T>C, T>G (all substitutions are referred to by the pyrimidine of the mutated Watson-Crick base pair) and since we incorporated information on the bases immediately 5′ and 3′ to each mutated base, there are 96 possible mutations in this classification. This 96 substitution classification is particularly useful for distinguishing mutational signatures which cause the same substitutions but in different sequence contexts
_Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Aparicio, S. A., Behjati, S., Biankin, A. V., Bignell, G. R., Bolli, N., Borg, A., Børresen-Dale, A. L., Boyault, S., Burkhardt, B., Butler, A. P., Caldas, C., Davies, H. R., Desmedt, C., Eils, R., Eyfjörd, J. E., Foekens, J. A., Greaves, M., … Stratton, M. R. (2013). Signatures of mutational processes in human cancer. Nature, 500(7463), 415–421. https://doi.org/10.1038/nature12477 _
It considers the frequency of mutation of each one of the 96 possible substitution that are constitued by a nucleotide substitution and its flanking bases. The number of flanking basis can be higher but its computational cost its high.
An example of such signatures can be found in the package decompTumor2Sig
.
#loading the default Alexandrov signatures signatures <- decompTumor2Sig::readAlexandrovSignatures() head(signatures[1])
There are different format to save those kind of signatures, please use the
signatureTypes$alexandrov2
for Alexandrov V2 signatures and
signatureTypes$alexandrov32
for Alexandrov V3.2 signatures. A more detail
inquiry about the format can be found at the
COSMIC website
The Shiraishi-type signature is described in the following way
In brief, we first simplify the modelling of mutation signatures by decomposing them into separate “mutation features”. For example, the substitution type is one feature; flanking bases are each another feature. We then exploit this decomposition by using a probabilistic model for signatures that assumes independence across features. This approach substantially reduces the number of parameters associated with each signature, greatly facilitating the incorporation of additional relevant sequence context
_Shiraishi Y, Tremmel G, Miyano S, Stephens M (2015) A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures. PLoS Genet 11(12): e1005657. https://doi.org/10.1371/journal.pgen.1005657 _
In this case each feature is independent from one another and we have a lower cost of extracting the mutational profile from the genome.
An example of such signatures can be found in the package decompTumor2Sig
.
# take the example signature flat files provided with decompTumor2Sig sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"), package="decompTumor2Sig") # read the signature flat files signatures <- decompTumor2Sig::readShiraishiSignatures(files=sigfiles) head(signatures[1])
The workflow is quite simple and is based on a single function tumorHeatmap
.
A simple example of usage with the Shiraishi-type signatures provided by the
package decompTumor2Sig
and the tumoral genomes provided by the same package.
The tumoral genomes are from breast cancer genomes from Nik-Zainal et al
(PMID: 22608084). For more information about the data used check
out the vignette of decompTumor2Sig
#read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", package="decompTumor2Sig") # get the filenames with the shiraishi signatures sigfiles <- system.file("extdata", paste0("Nik-Zainal_PMID_22608084-pmsignature-sig", 1:4,".tsv"), package="decompTumor2Sig") # compute the exposure vectors and plot the heatmap exposures <- tumorHeatmap(gfile, sigfiles, signaturesType=signatureTypes$shiraishi)
A more advanced example using Alexandrov-type V3.2 signatures and the same
genomes is as follow. The signature are provided by COSMIC and can
be found at this link. They are
also provided with this package.
The paramters numBases
and trDir
are used to match the signature type
that uses 3 bases and has no information about the directionality of the strand.
#read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", package="decompTumor2Sig") sigfile <- system.file("extdata", "COSMIC_v3.2_SBS_GRCh37.txt", package="tumorHeatmap") # compute the exposure vectors and plot the heatmap exposures <- tumorHeatmap(gfile, sigfile, signaturesType=signatureTypes$alexandrov32, numBases=3, trDir=FALSE)
You could also pass a compatible signature object as described in the vignette
of the package decompTumor2Sig
instead of a path to the file.
If you do not want to use the human hg19 assembly as a refernce genome and the
UCSC.hg19.knownGene annotation you can change them using the parameters
refGenome
and transcriptAnno
, respectively.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.