AnnotateWithVEP: Annotate variants with VEP data.

View source: R/AnnotateWithVEP.R

AnnotateWithVEP R Documentation

Annotate variants with VEP data.

Description

Adds Variant Effect Predictor (VEP) annotations to a genotype object, and renames the variants with a short, human-readable description of each variant.

Usage

AnnotateWithVEP(
  genotype,
  vep,
  min_impact = "MODERATE",
  avoid_underscores = TRUE
)

Arguments

genotype

A genotype object, typically from ReadVcf.

vep

A data frame or tibble, typically from ReadVEP, containing the fields "Uploaded_variation", "Location", "Allele", "Gene", "SYMBOL", "IMPACT", "Consequence", "Protein_position", "Amino_acids", and "CLIN_SIG".

min_impact

character(1). Minimum impact level included in the short descriptions of variants, among: "-" < "MODIFIER" < "LOW" < "MODERATE" < "HIGH". Default: "MODERATE".

avoid_underscores

logical. Should underscores be replaced by dashes as the field separator? Needed when the variant data will be used with Seurat, which replaces underscores in feature names with dashes. Default: TRUE.

Details

If available in the VEP data frame, adds the fields symbol, gene_ensembl_id, impact, protein_position, aminoacids, clinicalsign, summary, and short_description to the genotype object metadata. Also renames the variants (typically named chr-pos-ref-alt after reading a VCF file) by adding a short description prefix of the form symbol-protein_position-aminoacids-impact. Fields for which no data is available, and impact values lower than min_impact, are omitted from the short description.
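The way the short description prefix is assembled can be illustrated with a minimal base R sketch. This is not the burgertools implementation: the helper name short_description, its arguments, the treatment of "-" and empty strings as missing data, and the use of "_" as the separator when avoid_underscores is FALSE are all assumptions made for illustration.

```r
# Illustrative sketch only; NOT the burgertools implementation.
# Impact levels in increasing order, as listed for min_impact.
impact_levels <- c("-", "MODIFIER", "LOW", "MODERATE", "HIGH")

# Hypothetical helper: build a short description from one annotation.
short_description <- function(symbol, protein_position, aminoacids, impact,
                              min_impact = "MODERATE",
                              avoid_underscores = TRUE) {
  # Assumption: "_" is the separator when underscores are not avoided.
  sep <- if (avoid_underscores) "-" else "_"

  # Drop the impact field when it is unknown or ranks below min_impact.
  rank <- match(impact, impact_levels)
  if (is.na(rank) || rank < match(min_impact, impact_levels)) {
    impact <- NA_character_
  }

  fields <- c(symbol, protein_position, aminoacids, impact)
  # Skip fields with no data (NA, empty string, or a "-" placeholder).
  fields <- fields[!is.na(fields) & !(fields %in% c("", "-"))]
  paste(fields, collapse = sep)
}

# "GENE1" is a made-up symbol used only for this example.
full_label    <- short_description("GENE1", "42", "A/T", "MODERATE")
reduced_label <- short_description("GENE1", NA, NA, "MODIFIER")
print(full_label)     # "GENE1-42-A/T-MODERATE"
print(reduced_label)  # "GENE1"
```

In the second call the impact "MODIFIER" ranks below the default min_impact of "MODERATE", so only the symbol remains in the description.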

VEP typically assigns several annotations to each variant in a VCF file. The one selected for annotating the genotype object is the one with the highest predicted impact. If several have the same impact, the ones with a symbol available are prioritized, and among those the ones with a protein position / amino acid change prediction. If there is still a tie between several annotations, the first one listed in the original VEP data is used.
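The prioritization described above can be sketched in base R as a multi-key sort followed by keeping the first row per variant. This is a hedged illustration, not the package's actual code: the function name pick_annotation, the treatment of "-" and empty strings as missing values, and the choice to count either a protein position or an amino acid change as a prediction are assumptions. The toy data uses made-up variant identifiers and gene symbols.

```r
# Illustrative sketch only; NOT the burgertools implementation.
impact_levels <- c("-", "MODIFIER", "LOW", "MODERATE", "HIGH")

# Hypothetical helper: keep one annotation per variant.
pick_annotation <- function(vep) {
  # Assumption: NA, "" and "-" all mean "no data".
  has_value <- function(x) !is.na(x) & !(x %in% c("", "-"))

  impact_rank <- match(vep$IMPACT, impact_levels)
  impact_rank[is.na(impact_rank)] <- 0L  # unknown impacts rank lowest

  has_protein <- has_value(vep$Protein_position) | has_value(vep$Amino_acids)

  ord <- order(
    vep$Uploaded_variation,   # group rows by variant
    -impact_rank,             # highest impact first
    !has_value(vep$SYMBOL),   # rows with a symbol first (FALSE sorts first)
    !has_protein,             # then rows with a protein-level prediction
    seq_len(nrow(vep))        # remaining ties: original order
  )
  sorted <- vep[ord, , drop = FALSE]
  # The first row of each variant is now its preferred annotation.
  sorted[!duplicated(sorted$Uploaded_variation), , drop = FALSE]
}

vep <- data.frame(
  Uploaded_variation = c("v1", "v1", "v1", "v2", "v2"),
  SYMBOL             = c("-", "GENE1", "GENE1", "GENE2", "GENE2"),
  IMPACT             = c("HIGH", "HIGH", "HIGH", "LOW", "MODERATE"),
  Protein_position   = c("-", "-", "42", "-", "7"),
  Amino_acids        = c("-", "-", "A/T", "-", "G/S"),
  stringsAsFactors   = FALSE
)

best <- pick_annotation(vep)
print(best)
```

For v1 all three annotations are HIGH, so the row with both a symbol and a protein position (position 42) is kept; for v2 the MODERATE annotation is kept over the LOW one. Note that this sketch returns variants sorted by identifier rather than in their original order.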

The symbol field refers to HGNC gene symbols.

Value

A genotype object with additional metadata fields (symbol, gene_ensembl_id, impact, protein_position, aminoacids, clinicalsign, consequence, summary, short_description) and annotated variant names (symbol-protein_position-aminoacids-impact-chr_pos-ref-alt).

Examples

MyGenotype <- AnnotateWithVEP(MyGenotype, MyVepDataFrame, min_impact = "HIGH", avoid_underscores = TRUE)

nbroguiere/burgertools documentation built on Jan. 30, 2024, 3:48 a.m.