calculate_lesion_segregation: Calculate the amount of lesion segregation for a GRangesList...


calculate_lesion_segregation    R Documentation

Calculate the amount of lesion segregation for a GRangesList or GRanges object.

Description

This function calculates lesion segregation for a GRangesList or GRanges object. Lesion segregation is a large-scale Watson versus Crick strand asymmetry caused by many DNA lesions occurring during a single cell cycle. It was first described in Aitken et al., 2020, Nature. See their paper for a more in-depth discussion of this phenomenon.

This function can perform three different types of test to calculate lesion segregation. The first method is unique to this package, while the other two were also used by Aitken et al., 2020. The 'binomial' test is based on how often consecutive mutations are on different strands. The 'wald-wolfowitz' test checks whether the strands are randomly distributed. It is not known which method is superior. The 'rl20' test looks at run sizes (the number of consecutive mutations on the same strand). It is less susceptible to local strand asymmetries and kataegis, but does not generate a p-value.
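The idea behind the 'binomial' test can be illustrated with a few lines of base R. This is only a minimal sketch, not the package's implementation: the strand labels are made-up toy data, and the one-sided alternative (fewer switches than expected by chance) is an assumption made for illustration.

```r
# Toy strand calls for mutations ordered by genomic position (made-up data).
strands <- c("+", "+", "+", "+", "-", "-", "-", "-", "-", "+", "+", "+")

# A "switch" is a pair of consecutive mutations that lie on different strands.
n_pairs <- length(strands) - 1
n_switches <- sum(head(strands, -1) != tail(strands, -1))

# If strands were assigned at random, a switch would occur about half the time.
# Assumption: far fewer switches than expected suggests lesion segregation,
# so a one-sided test is used here.
binom_res <- binom.test(n_switches, n_pairs, p = 0.5, alternative = "less")
binom_res$p.value
```

Long stretches of mutations on the same strand produce few switches and therefore a small p-value in this sketch.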

Usage

calculate_lesion_segregation(
  vcf_list,
  sample_names,
  test = c("binomial", "wald-wolfowitz", "rl20"),
  split_by_type = FALSE,
  ref_genome = NA,
  chromosomes = NA
)

Arguments

vcf_list

GRangesList or GRanges object

sample_names

The names of the samples, one per GRanges object in vcf_list

test

The statistical test that should be used. Possible values:

* 'binomial' Binomial test based on the number of strand switches. (Default)
* 'wald-wolfowitz' Statistical test that checks if the strands are randomly distributed.
* 'rl20' Calculates the rl20 value and the genomic span of the associated set of runs.
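For intuition, a Wald-Wolfowitz runs test can be written in base R using the usual normal approximation. The helper below, wald_wolfowitz_sketch, is a hypothetical name introduced for illustration only; it is not part of the package and may differ from what the package does internally.

```r
# Hypothetical helper: Wald-Wolfowitz runs test with a normal approximation.
wald_wolfowitz_sketch <- function(strands) {
  n1 <- sum(strands == "+")
  n2 <- sum(strands == "-")
  n <- n1 + n2

  # Number of runs: maximal stretches of consecutive identical strand labels.
  n_runs <- length(rle(strands)$lengths)

  # Expected number of runs and its variance under random ordering.
  expected <- 2 * n1 * n2 / n + 1
  variance <- 2 * n1 * n2 * (2 * n1 * n2 - n) / (n^2 * (n - 1))

  # Two-sided p-value from the standard normal distribution.
  z <- (n_runs - expected) / sqrt(variance)
  list(runs = n_runs, expected = expected, z = z, p_value = 2 * pnorm(-abs(z)))
}

# Made-up toy data: three long runs in twelve mutations.
toy_strands <- c("+", "+", "+", "+", "-", "-", "-", "-", "-", "+", "+", "+")
ww_res <- wald_wolfowitz_sketch(toy_strands)
ww_res$p_value
```

Fewer runs than expected gives a negative z-score, which is the direction associated with lesion segregation.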

split_by_type

Boolean describing whether the lesion segregation should be calculated for all SNVs together or per 96 substitution context. (Default: FALSE)

ref_genome

BSgenome reference genome object. Only needed when split_by_type is TRUE with the binomial test or when using the rl20 test.

chromosomes

The chromosomes that are used. Only needed when using the rl20 test.

Details

The amount of lesion segregation is calculated per GRanges object. The results are then combined in a table.

When using the binomial test, it's possible to calculate the lesion segregation separately per 96 substitution context. The per-context results are then automatically combined. This can increase sensitivity when a mutational process causes multiple types of base substitutions that aren’t considered to be on the same strand.
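The effect of splitting by type can be sketched in base R. This is one plausible reading of "combined" (summing switch counts and pair counts over the types before a single binomial test) and is an assumption, not a description of the package internals. The data frame, its column names and the two substitution types are made-up toy values.

```r
# Made-up toy data: two substitution types whose strands interleave.
muts <- data.frame(
  type   = c("C>A", "C>T", "C>A", "C>T", "C>A", "C>T", "C>A", "C>T"),
  strand = c("+",   "-",   "+",   "-",   "+",   "-",   "+",   "+")
)

# Count strand switches and consecutive pairs within one vector of strands.
count_switches <- function(s) {
  c(switches = sum(head(s, -1) != tail(s, -1)), pairs = length(s) - 1)
}

# Pooled over all mutations, the interleaving looks like constant switching.
pooled <- count_switches(muts$strand)

# Per type, the strands are nearly constant. Assumption: the counts are
# summed over the types before testing.
per_type <- vapply(split(muts$strand, muts$type), count_switches, numeric(2))
totals <- rowSums(per_type)

split_res <- binom.test(totals[["switches"]], totals[["pairs"]],
  p = 0.5, alternative = "less"
)
```

In this toy case the pooled data show 6 switches in 7 pairs, while the per-type counts add up to 1 switch in 6 pairs, which is the kind of signal that splitting is meant to recover.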

When using the rl20 test, this function first calculates the strand runs per chromosome and combines them. It then calculates the smallest set of runs that together encompass at least 20 percent of the mutations. (This set thus contains the largest runs.) The size of the smallest run in this set is the rl20. The genomic span of the runs in this set is also calculated.
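The rl20 steps described above can be sketched in base R as follows. The function name rl20_sketch and the toy strand vectors are hypothetical, and the genomic span calculation is left out because it needs mutation coordinates and chromosome information.

```r
# Hypothetical helper: rl20 from a list of per-chromosome strand vectors,
# each ordered by genomic position.
rl20_sketch <- function(strands_by_chrom) {
  # Run lengths are calculated per chromosome and then pooled.
  run_lengths <- unlist(
    lapply(strands_by_chrom, function(s) rle(s)$lengths),
    use.names = FALSE
  )

  # Take the largest runs first until they cover at least 20% of mutations.
  sorted <- sort(run_lengths, decreasing = TRUE)
  k <- which(cumsum(sorted) >= 0.2 * sum(sorted))[1]

  # The smallest run in that set is the rl20.
  sorted[k]
}

# Made-up toy data: 30 mutations on three chromosomes.
toy <- list(
  chr1 = rep(c("+", "-", "+"), times = c(5, 3, 2)),
  chr2 = rep(c("-", "+", "-", "+"), times = c(4, 1, 3, 2)),
  chr3 = rep(c("+", "-", "+", "-", "+", "-"), times = c(2, 2, 1, 1, 2, 2))
)
rl20_value <- rl20_sketch(toy)
```

Here the two largest runs (5 and 4 mutations) are needed to reach 20 percent of the 30 mutations, so the sketch returns 4.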

Value

A tibble containing the amount of lesion segregation per sample

See Also

plot_lesion_segregation

Other Lesion_segregation: plot_lesion_segregation()

Examples


## See the 'read_vcfs_as_granges()' example for how we obtained the
## following data:
grl <- readRDS(system.file("states/read_vcfs_as_granges_output.rds",
  package = "MutationalPatterns"
))

## To reduce the runtime we take only the first two samples
grl <- grl[1:2]
## Set the sample names
sample_names <- c("colon1", "colon2")

## Load the corresponding reference genome.
ref_genome <- "BSgenome.Hsapiens.UCSC.hg19"
library(ref_genome, character.only = TRUE)

## Calculate lesion segregation
lesion_segregation <- calculate_lesion_segregation(grl, sample_names)

## Calculate lesion segregation per 96 base type
lesion_segregation_by_type <- calculate_lesion_segregation(grl, sample_names,
  split_by_type = TRUE, ref_genome = ref_genome
)

## Calculate lesion segregation using the wald-wolfowitz test.
lesion_segregation_wald <- calculate_lesion_segregation(grl,
  sample_names,
  test = "wald-wolfowitz"
)

## Calculate lesion segregation using the rl20.
chromosomes <- paste0("chr", c(1:22, "X"))
lesion_segregation_rl20 <- calculate_lesion_segregation(grl,
  sample_names,
  test = "rl20",
  ref_genome = ref_genome,
  chromosomes = chromosomes
)

UMCUGenetics/MutationalPatterns documentation built on Nov. 24, 2022, 4:31 a.m.