predict_variant_effect: Add variant effect prediction


Description

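Adds predicted variant effects for a set of variants, given coding sequence coordinates and a reference genome. This function is experimental; use with caution.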

Usage

predict_variant_effect(cds, vcf, genome)

Arguments

cds

Coding sequence coordinates

vcf

Data frame of variants from a variant call format file

genome

A genome reference compatible with get_genomic_sequence

Details

cds, vcf, and genome should have matching chromosome naming conventions.

cds must contain the following columns (rows represent exon coordinates): tx, exon, chr, strand, start, and end.

vcf must contain the following standard VCF columns (rows represent variants): CHROM, POS, REF, ALT.
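The column and naming requirements above can be checked before calling predict_variant_effect. The following is a minimal sketch in base R; check_required_columns and shares_chromosome_names are hypothetical helpers written for illustration and are not part of the mutagenesis package, and the one-row cds and vcf data frames are made-up toy values.

```r
# Hypothetical helper (not part of mutagenesis): stop with an informative
# message if a data frame lacks any of the required columns.
check_required_columns <- function(df, required, name) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop(
      name, " is missing required column(s): ",
      paste(missing_cols, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(TRUE)
}

# Hypothetical helper: TRUE if at least one variant chromosome name also
# appears in cds. No overlap at all usually indicates mismatched naming
# conventions (e.g. "chr1" versus "1").
shares_chromosome_names <- function(cds, vcf) {
  any(as.character(vcf$CHROM) %in% as.character(cds$chr))
}

# Column names taken from the Details section above
cds_columns <- c("tx", "exon", "chr", "strand", "start", "end")
vcf_columns <- c("CHROM", "POS", "REF", "ALT")

# Toy inputs for illustration only
cds <- data.frame(
  tx = "tx1", exon = 1L, chr = "chr1", strand = "+",
  start = 100L, end = 200L, stringsAsFactors = FALSE
)
vcf <- data.frame(
  CHROM = "chr1", POS = 150L, REF = "A", ALT = "G",
  stringsAsFactors = FALSE
)

check_required_columns(cds, cds_columns, "cds")
check_required_columns(vcf, vcf_columns, "vcf")

if (!shares_chromosome_names(cds, vcf)) {
  warning("cds and vcf share no chromosome names; check naming conventions")
}
```

This sketch does not check genome, because how chromosome names are retrieved depends on the type of reference object supplied.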

Examples

# Run only if the hg38 reference genome package is installed
if (requireNamespace('BSgenome.Hsapiens.UCSC.hg38', quietly = TRUE)) {
  library(tidyverse)
  library(mutagenesis)

  # Example files provided in this package
  cds_file <- system.file('extdata/CDS.csv', package = 'mutagenesis')
  vcf_file <- system.file('extdata/VCF.vcf', package = 'mutagenesis')

  # Coding sequence coordinates, variants, and a reference genome
  # are required to predict variant effects
  cds    <- read_csv(cds_file)
  vcf    <- read_vcf(vcf_file)
  genome <- BSgenome.Hsapiens.UCSC.hg38::Hsapiens
  vep    <- predict_variant_effect(cds, vcf, genome)

  # An example summary of effects
  vep %>%
    select(
      gene, ID:INFO, ref_cds, alt_cds, ref_aa, alt_aa, mutation_type,
      exon_boundary_dist, vcf_start, vcf_end, exon_start, exon_end
    ) %>%
    distinct() %>%
    count(ref_aa, mutation_type) %>%
    arrange(mutation_type, -n)
}

EricEdwardBryant/mutagenesis documentation built on May 14, 2019, 6:13 p.m.