annotate_VCF: annotate_VCF

View source: R/annotate_VCF_functions.R

annotate_VCF    R Documentation

annotate_VCF

Description

Adds strand, gene and COSMIC mutation category annotations to a VCF. Run time increases with the number of mutations and with the number of annotation options selected, so this function can take a long time on large inputs. Please be patient!

Usage

annotate_VCF(
  vcf = vcf,
  add_strand_and_SBS_cats = T,
  add_DBS_cats = T,
  add_ID_cats = F,
  ref_fasta = NULL,
  ref_genome = BSgenome.Hsapiens.UCSC.hg19,
  palimpdir = NA,
  GRCh37_fasta = FALSE
)

Arguments

vcf

Input VCF to annotate.

add_strand_and_SBS_cats

Logical indicating whether or not strand, gene and SBS category annotations are to be added (defaults to TRUE).

add_DBS_cats

Logical indicating whether or not DBS category annotations are to be added (defaults to TRUE).

add_ID_cats

Logical indicating whether or not indel category annotations are to be added (defaults to FALSE). Indel mutation categories cannot be added to the VCF on Windows, because this R function calls a Python script. Please run this step in a Unix environment (Mac/Linux etc.).

ref_fasta

File path to a FASTA file compatible with the positions and chromosomes in the input VCF. Only required when add_ID_cats = TRUE. The latest reference genomes in FASTA format can be downloaded here.

ref_genome

Name of the reference genome object. For hg19 data we use the BSgenome.Hsapiens.UCSC.hg19 object, which is loaded into the local environment by library(BSgenome.Hsapiens.UCSC.hg19). For hg38 data, use library(BSgenome.Hsapiens.UCSC.hg38) and the BSgenome.Hsapiens.UCSC.hg38 object as appropriate.

palimpdir

If you received a filepath error when adding indel categories, set this parameter to the file path of the Palimpsest package directory that you downloaded from our GitHub.

GRCh37_fasta

If you received a VCF/FASTA error when adding indel categories, and you are working with GRCh37 data (or your hg19 FASTA has no 'chr' prefixes), set this to TRUE.
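
Before requesting indel categories, it may help to check the two conditions described under add_ID_cats and GRCh37_fasta. The following is a minimal sketch in base R, not part of Palimpsest: the helper names can_add_ID_cats and fasta_lacks_chr_prefix are hypothetical, and the second helper assumes that inspecting only the first FASTA header line is enough to tell whether 'chr' prefixes are used.

```r
# Hypothetical helpers for illustration; neither is part of Palimpsest.

# TRUE on Unix-like systems (Linux, macOS), where indel categories can be added.
can_add_ID_cats <- function(os_type = .Platform$OS.type) {
  identical(os_type, "unix")
}

# TRUE when the first FASTA header has no "chr" prefix (e.g. ">1" rather than
# ">chr1"), which suggests GRCh37_fasta = TRUE may be needed.
fasta_lacks_chr_prefix <- function(fasta_path) {
  first_line <- readLines(fasta_path, n = 1)
  stopifnot(length(first_line) == 1, startsWith(first_line, ">"))
  !startsWith(first_line, ">chr")
}

# Demonstration on two tiny temporary FASTA files.
ucsc_style <- tempfile(fileext = ".fa")
ensembl_style <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGT"), ucsc_style)
writeLines(c(">1", "ACGT"), ensembl_style)

print(can_add_ID_cats())
print(fasta_lacks_chr_prefix(ucsc_style))
print(fasta_lacks_chr_prefix(ensembl_style))
```

The results could then be passed on, for example as add_ID_cats = can_add_ID_cats() and GRCh37_fasta = fasta_lacks_chr_prefix(ref_fasta), although the documentation above only asks for GRCh37_fasta = TRUE after a VCF/FASTA error has occurred.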

Value

vcf: the input VCF with the selected annotations added.

Examples

vcf <- annotate_VCF(vcf = vcf, ref_genome = BSgenome.Hsapiens.UCSC.hg19, ref_fasta = "~/Documents/Data/Genomes/hg19.fa")
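
A further example, requesting indel categories as well, might look like the sketch below. It assumes that the Palimpsest and BSgenome.Hsapiens.UCSC.hg19 packages are installed, that an input object named vcf already exists, and that the FASTA path (a placeholder reused from the example above) points to a real file. The can_run guard is an addition for illustration only, so that the snippet does nothing when those assumptions do not hold.

```r
# Placeholder path taken from the example above; point it at your own FASTA.
fasta <- "~/Documents/Data/Genomes/hg19.fa"

# Only run when both packages, an input `vcf` object and the FASTA are
# available, and the platform is Unix-like (required for indel categories).
can_run <- requireNamespace("Palimpsest", quietly = TRUE) &&
  requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE) &&
  exists("vcf") &&
  file.exists(path.expand(fasta)) &&
  .Platform$OS.type == "unix"

if (can_run) {
  library(Palimpsest)
  library(BSgenome.Hsapiens.UCSC.hg19)
  vcf <- annotate_VCF(
    vcf = vcf,
    add_ID_cats = TRUE,                       # also add indel categories
    ref_genome = BSgenome.Hsapiens.UCSC.hg19, # hg19 reference genome object
    ref_fasta = fasta                         # needed when add_ID_cats = TRUE
  )
}
```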

FunGeST/Palimpsest documentation built on June 2, 2024, 4:21 a.m.