adjustSignaturesForRegionSet: Adjust (normalize) signatures for a set of genomic regions.
In rmpiro/decompTumor2Sig: Decomposition of individual tumors into mutational signatures by signature refitting

adjustSignaturesForRegionSet

R Documentation

Description

`adjustSignaturesForRegionSet()` takes a set of signatures that were originally defined with respect to the nucleotide frequencies within a specific reference genome or region (e.g., by deriving them from whole genome mutation data) and adjusts or normalizes them to the often different nucleotide frequencies of another specific subset of genomic regions.

Usage

adjustSignaturesForRegionSet(signatures,
    regionsTarget, regionsOriginal=NULL,
    refGenome=BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19)

Arguments

`signatures`	(Mandatory) Signatures to be adjusted to the nucleotide frequencies of the genomic regions defined by the parameter `regionsTarget`.
`regionsTarget`	(Mandatory) `GRanges` object defining a subset of the genome (i.e., a set of genomic regions) for which the signatures need to be adjusted (can be set to `NULL` for the whole genome).
`regionsOriginal`	(Optional) `GRanges` object defining the subset of the genome (i.e., set of genomic regions) from which the signatures were originally derived. Default: `NULL` (whole genome).
`refGenome`	(Optional) Reference genome sequence from which to compute the nucleotide frequencies. Default: `BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19`.

Details

This may be useful, for example, to perform signature refitting (using `decomposeTumorGenomes()`) for mutation data from targeted sequencing (e.g., only a subset of genes), whole exome sequencing (only exonic regions), or other limited subsets of the genome with particular nucleotide frequencies.

For Alexandrov-type signatures, the important frequencies are those of the whole sequence patterns (e.g., trinucleotides) whose central base can be mutated. Therefore, adjustment factors for individual mutation types (e.g., A[C>T]G) are computed by comparing the corresponding sequence pattern frequencies (e.g., ACG) between the original reference regions (e.g., whole genome) and the target regions (e.g., target regions of whole exome sequencing).
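The following is a minimal base-R sketch of this idea, not the package's actual implementation. It assumes that the adjustment factor of a mutation type is the ratio of its pattern frequency in the target regions to that in the original regions; the frequencies, the three-entry toy signature, and the helper `pattern_of()` are hypothetical and only serve as illustration.

```r
# Hypothetical trinucleotide frequencies (toy numbers, not real genome counts)
freq_original <- c(ACG = 0.10, TCA = 0.50, ACA = 0.40)  # e.g., whole genome
freq_target   <- c(ACG = 0.30, TCA = 0.40, ACA = 0.30)  # e.g., exome targets

# Toy Alexandrov-type signature: probabilities of three mutation types
signature <- c("A[C>T]G" = 0.5, "T[C>T]A" = 0.3, "A[C>A]A" = 0.2)

# Hypothetical helper: map a mutation type to its unmutated sequence
# pattern, e.g., "A[C>T]G" -> "ACG"
pattern_of <- function(mut_type) {
  sub("\\[([ACGT])>[ACGT]\\]", "\\1", mut_type)
}

patterns <- pattern_of(names(signature))

# Assumed adjustment factor: target frequency relative to original frequency
factors <- freq_target[patterns] / freq_original[patterns]

# Apply the factors, then re-normalize so the probabilities sum to 1
adjusted <- signature * unname(factors)
adjusted <- adjusted / sum(adjusted)
print(round(adjusted, 3))
```

In this toy case the pattern ACG is three times more frequent in the target regions than in the original regions, so A[C>T]G gains weight relative to the other two mutation types.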

In the Shiraishi-type signature model, individual bases of the sequence patterns are considered as independent features. Thus, to compute nucleotide frequencies for such signatures, the frequencies of the sequence patterns (computed as for Alexandrov-type signatures) are broken down to single-nucleotide frequencies for the individual positions of the patterns.
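As an illustration only, breaking pattern frequencies down to single-nucleotide frequencies can be sketched in base R as follows. The trinucleotide frequencies and the helper `position_freq()` are hypothetical and are not taken from the package's code.

```r
# Hypothetical trinucleotide frequencies (toy numbers that sum to 1)
tri_freq <- c(ACG = 0.2, ACA = 0.3, TCG = 0.1, TCA = 0.4)

# Hypothetical helper: frequency of each base at one position of the
# pattern, obtained by summing the frequencies of all patterns that
# carry that base at that position
position_freq <- function(pattern_freq, pos) {
  bases <- substr(names(pattern_freq), pos, pos)
  f <- vapply(split(pattern_freq, bases), sum, numeric(1))
  f / sum(f)
}

left_flank  <- position_freq(tri_freq, 1)  # base 5' of the mutated base
right_flank <- position_freq(tri_freq, 3)  # base 3' of the mutated base
print(left_flank)
print(right_flank)
```

Each position gets its own frequency vector, which matches the model's treatment of the individual bases as independent features.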

In both cases, after the appropriate adjustment of individual features, signatures are re-normalized such that overall probabilities sum up to 1.

Value

A set of adjusted mutational signatures in the same format as those specified for `signatures`.

Author(s)

Rosario M. Piro
Politecnico di Milano
Maintainer: Rosario M. Piro
E-Mail: <rmpiro@gmail.com> or <rosariomichael.piro@polimi.it>

References

http://rmpiro.net/decompTumor2Sig/
Krueger, Piro (2019) decompTumor2Sig: Identification of mutational signatures active in individual tumors. BMC Bioinformatics 20(Suppl 4):152.

Examples


### load the package and get Alexandrov signatures from COSMIC
library(decompTumor2Sig)
signatures <- readAlexandrovSignatures()

### get gene annotation for the default reference genome (hg19)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

### get a GRanges object for gene promoters (-2000 to +200 bases from TSS)
### [taking only the first 1000 for testing purposes]
library(GenomicRanges)
regionsTarget <- promoters(txdb, upstream=2000, downstream=200)[seq(1000)]

### assume these signatures were derived only from mutation data from
### exons on chromosome X [not true; just for illustrative purposes]
filter <- list(tx_chrom = c("chrX"))
regionsOriginal <- exons(txdb, filter=filter)

### adjust signatures according to nucleotide frequencies in the target
### subset of the genome
sign_adj <- adjustSignaturesForRegionSet(signatures, regionsTarget,
                                         regionsOriginal)

rmpiro/decompTumor2Sig documentation built on May 15, 2022, 3:27 a.m.