In ICAMS: In-Depth Characterization and Analysis of Mutational Signatures ('ICAMS')

SplitOneVCF

R Documentation

Split a VCF into SBS, DBS, and ID VCFs, plus a list of other mutations

Description

Split a VCF into SBS, DBS, and ID VCFs, plus a list of other mutations

Usage

SplitOneVCF(
  vcf.df,
  max.vaf.diff = 0.02,
  name.of.VCF = NULL,
  always.merge.SBS = FALSE,
  chr.names.to.process = NULL
)

Arguments

`vcf.df`	An in-memory data.frame representing a VCF, including VAFs, which are added by `ReadVCF`.
`max.vaf.diff`	The maximum allowed difference in VAF; the default is 0.02. If the absolute difference between the VAFs of adjacent SBSs is greater than `max.vaf.diff`, the adjacent SBSs are likely to be "merely" asynchronous single base mutations, as opposed to a simultaneous doublet mutation or a variant involving more than two consecutive bases. Use a negative value (e.g. -1) to suppress merging adjacent SBSs into DBSs.
`name.of.VCF`	Name of the VCF file.
`always.merge.SBS`	If `TRUE`, merge adjacent SBSs into DBSs regardless of their VAFs and regardless of the value of `max.vaf.diff`.
`chr.names.to.process`	A character vector specifying the chromosome names in the VCF whose variants will be kept and processed; variants on other chromosomes are discarded. If `NULL` (default), all variants are kept except those on chromosomes whose names contain any of the strings "GL", "KI", "random", "Hs", "M", "JH", "fix", or "alt".
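
The rules described for `max.vaf.diff`, `always.merge.SBS`, and `chr.names.to.process` can be sketched in base R as follows. This is a minimal illustration of the documented behaviour, not the ICAMS implementation: the helper names `would.merge` and `keep.chr` are hypothetical, and case-sensitive substring matching on chromosome names is an assumption.

```r
# Illustrative helpers only; these are NOT part of ICAMS.

# Would two adjacent SBSs be candidates for merging into a DBS?
# A negative max.vaf.diff can never be satisfied, so merging is suppressed.
would.merge <- function(vaf1, vaf2, max.vaf.diff = 0.02,
                        always.merge.SBS = FALSE) {
  if (always.merge.SBS) return(TRUE)
  abs(vaf1 - vaf2) <= max.vaf.diff
}

# Substrings that exclude a chromosome when chr.names.to.process is NULL.
default.excluded <- c("GL", "KI", "random", "Hs", "M", "JH", "fix", "alt")

# Return TRUE for each chromosome name whose variants would be kept.
# Assumption: matching is case-sensitive substring matching.
keep.chr <- function(chr.names, chr.names.to.process = NULL) {
  if (!is.null(chr.names.to.process)) {
    return(chr.names %in% chr.names.to.process)
  }
  pattern <- paste(default.excluded, collapse = "|")
  !grepl(pattern, chr.names)
}

would.merge(0.31, 0.30)                          # TRUE
would.merge(0.45, 0.30)                          # FALSE
would.merge(0.30, 0.30, max.vaf.diff = -1)       # FALSE
would.merge(0.45, 0.30, always.merge.SBS = TRUE) # TRUE

keep.chr(c("1", "chrX", "MT", "GL000207.1"))     # TRUE TRUE FALSE FALSE
keep.chr(c("1", "3"), chr.names.to.process = c("1", "2")) # TRUE FALSE
```

Note that under the default filter any name containing "M" is excluded, so both "MT" and "chrM" are dropped unless `chr.names.to.process` lists them explicitly.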

Value

A list of three in-memory VCFs, plus any discarded variants that were not incorporated into those three VCFs:

* SBS: VCF with only single base substitutions.

* DBS: VCF with only doublet base substitutions.

* ID: VCF with only small insertions and deletions.

* discarded.variants: Non-NULL only if some variants were excluded from the analysis. See the extra column `discarded.reason` for details.
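
Because `discarded.variants` may be `NULL`, code that consumes the returned list should check for that case before counting rows. The sketch below uses a hand-built stand-in for the return value; the column names and rows are made up for illustration, and `summarize.split` is a hypothetical helper, not an ICAMS function.

```r
# Mock stand-in for the list described above; all contents are invented.
split.result <- list(
  SBS = data.frame(CHROM = "1", POS = 100L, REF = "C",  ALT = "T"),
  DBS = data.frame(CHROM = "1", POS = 200L, REF = "CC", ALT = "TT"),
  ID  = data.frame(CHROM = "2", POS = 300L, REF = "CA", ALT = "C"),
  discarded.variants = NULL
)

# Count variants per element, treating a NULL discarded.variants as zero rows.
summarize.split <- function(x) {
  counts <- vapply(x[c("SBS", "DBS", "ID")], nrow, integer(1))
  n.discarded <- if (is.null(x$discarded.variants)) {
    0L
  } else {
    nrow(x$discarded.variants)
  }
  c(counts, discarded = n.discarded)
}

summarize.split(split.result)
```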

ICAMS documentation built on June 15, 2025, 1:08 a.m.