knitr::opts_chunk$set(
  collapse=TRUE,
  comment="#>"
)
library(tumorHeatmap)

Introduction

The R package tumorHeatmap has been developed for the Scientific Programming project. The aim of the package is to decompose individual tumors by mutational signatures and display this decomposition as a heatmap. Furthermore, clustering is performed on the tumor genomes using their exposure vectors as features. All this is displayed in a single plot presenting the heatmap of the exposure vectors (each signature for each genome) and the deondgram resulting from the clustering. The source code can be found on Github

Installing and loading the package

Installation

To install the package devtools is needed. It can be installed in the following way

install.packages("devtools")

Then the package tumorHeatmap can be installed from Github or from the source package.

Installing from Github

To install the package from Github the following code could be used.

devtools::install_github("oirot/tumorHeatmap")

Installing from the source package

If you have already downloaded the source package you could install it by

devtools::install(path_to_package)

Input data

The package needs to type of data:

Somatic mutation data

The somatic mutation data needs to be in an Mutation Position Format (MPF). The MPF is as a tab separated format in which the information on the columns are:

An example of such file is

| pts | chr | pos | ref | alt | | :---: | :---: | :---: | :---: | :---: | | p1 |chr2 | 4548 | C | G | | p2 | chr3 | 564 | A | T | | p2 | chr3 | 8887 | C | T |

For more information about this data format look here

Signatures

Mutational signatures are a charachteristic combination of mutations that arise from a specific mutational process (eg. DNA duplication or DNA repair defects) More information about what a signature is can be found here

The signatures can be calculated using different models for the mutation and stored in different formats. The two used are Shiraishi-type signatures and Alexandrov-type signatures.

Alexandrov-type signatures

This type of mutational signature is described by Ludmil B. Alexandrov et al.

In the first instance, we extracted mutational signatures using base substitutions and additionally included information on the sequence context of each mutation. Since there are six classes of base substitution C>A, C>G, C>T, T>A, T>C, T>G (all substitutions are referred to by the pyrimidine of the mutated Watson-Crick base pair) and since we incorporated information on the bases immediately 5′ and 3′ to each mutated base, there are 96 possible mutations in this classification. This 96 substitution classification is particularly useful for distinguishing mutational signatures which cause the same substitutions but in different sequence contexts

_Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Aparicio, S. A., Behjati, S., Biankin, A. V., Bignell, G. R., Bolli, N., Borg, A., Børresen-Dale, A. L., Boyault, S., Burkhardt, B., Butler, A. P., Caldas, C., Davies, H. R., Desmedt, C., Eils, R., Eyfjörd, J. E., Foekens, J. A., Greaves, M., … Stratton, M. R. (2013). Signatures of mutational processes in human cancer. Nature, 500(7463), 415–421. https://doi.org/10.1038/nature12477 _

It considers the frequency of mutation of each one of the 96 possible substitution that are constitued by a nucleotide substitution and its flanking bases. The number of flanking basis can be higher but its computational cost its high.

An example of such signatures can be found in the package decompTumor2Sig.

#loading the default Alexandrov signatures
signatures <- decompTumor2Sig::readAlexandrovSignatures()

head(signatures[1])

There are different format to save those kind of signatures, please use the signatureTypes$alexandrov2 for Alexandrov V2 signatures and signatureTypes$alexandrov32 for Alexandrov V3.2 signatures. A more detail inquiry about the format can be found at the COSMIC website

Shiraishi-type signatures

The Shiraishi-type signature is described in the following way

In brief, we first simplify the modelling of mutation signatures by decomposing them into separate “mutation features”. For example, the substitution type is one feature; flanking bases are each another feature. We then exploit this decomposition by using a probabilistic model for signatures that assumes independence across features. This approach substantially reduces the number of parameters associated with each signature, greatly facilitating the incorporation of additional relevant sequence context

_Shiraishi Y, Tremmel G, Miyano S, Stephens M (2015) A Simple Model-Based Approach to Inferring and Visualizing Cancer Mutation Signatures. PLoS Genet 11(12): e1005657. https://doi.org/10.1371/journal.pgen.1005657 _

In this case each feature is independent from one another and we have a lower cost of extracting the mutational profile from the genome.

An example of such signatures can be found in the package decompTumor2Sig.

# take the example signature flat files provided with decompTumor2Sig
sigfiles <- system.file("extdata",
                 paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",1:4,".tsv"),
                 package="decompTumor2Sig")

# read the signature flat files
signatures <- decompTumor2Sig::readShiraishiSignatures(files=sigfiles)
head(signatures[1])

Workflow

The workflow is quite simple and is based on a single function tumorHeatmap. A simple example of usage with the Shiraishi-type signatures provided by the package decompTumor2Sig and the tumoral genomes provided by the same package. The tumoral genomes are from breast cancer genomes from Nik-Zainal et al (PMID: 22608084). For more information about the data used check out the vignette of decompTumor2Sig

  #read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) 
  gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", 
                       package="decompTumor2Sig")

  # get the filenames with the shiraishi signatures 
  sigfiles <- system.file("extdata",
                          paste0("Nik-Zainal_PMID_22608084-pmsignature-sig",
                                 1:4,".tsv"),
                          package="decompTumor2Sig")

  # compute the exposure vectors and plot the heatmap
  exposures <- tumorHeatmap(gfile, sigfiles, 
                            signaturesType=signatureTypes$shiraishi)

A more advanced example using Alexandrov-type V3.2 signatures and the same genomes is as follow. The signature are provided by COSMIC and can be found at this link. They are also provided with this package. The paramters numBases and trDir are used to match the signature type that uses 3 bases and has no information about the directionality of the strand.

  #read breast cancer genomes from Nik-Zainal et al (PMID: 22608084) 
  gfile <- system.file("extdata", "Nik-Zainal_PMID_22608084-MPF.txt.gz", 
                       package="decompTumor2Sig")

  sigfile <- system.file("extdata", "COSMIC_v3.2_SBS_GRCh37.txt", 
                       package="tumorHeatmap")

  # compute the exposure vectors and plot the heatmap
  exposures <- tumorHeatmap(gfile, sigfile, 
                            signaturesType=signatureTypes$alexandrov32,
                            numBases=3,
                            trDir=FALSE)

You could also pass a compatible signature object as described in the vignette of the package decompTumor2Sig instead of a path to the file.

If you do not want to use the human hg19 assembly as a refernce genome and the UCSC.hg19.knownGene annotation you can change them using the parameters refGenome and transcriptAnno, respectively.



oirot/tumorHeatmap documentation built on Dec. 22, 2021, 4:17 a.m.