Query Ensembl Variant Effect Predictor

Description

Retrieve variant annotation data from the Ensembl Variant Effect Predictor (VEP).

Usage

## S4 method for signature 'character'
ensemblVEP(file, param=VEPParam(), ...)

Arguments

file

A character string giving the full path to the input file, including the file name.

Valid input file types are described on the Ensembl VEP web page: http://www.ensembl.org/info/docs/variation/vep/vep_script.html#running

param

An instance of VEPParam specifying runtime options.

...

Additional arguments passed to methods.

Details

The Ensembl VEP tool is described in detail on its home page (link in the References section). The ensemblVEP function wraps the Perl API and requires a local installation of the Ensembl VEP that is available on the user's path. The VEPParam class provides a way to specify runtime options. Results are returned as GRanges (default) or VCF objects. Alternatively, results can be written directly to a file.
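
Because the function depends on a VEP installation being visible on the path, it can help to check for one before calling ensemblVEP. The following is a minimal base R sketch, not part of the package. The helper name vep_on_path is hypothetical, and the two executable names are assumptions about how a local install might be named; adjust them to match your installation.

```r
## Hypothetical helper (not part of ensemblVEP): look for a VEP
## executable on the PATH. The candidate names are assumptions;
## change them to match the local installation.
vep_on_path <- function(candidates = c("vep", "variant_effect_predictor.pl")) {
  found <- Sys.which(candidates)
  ## Sys.which() typically returns "" for names it cannot find.
  found[!is.na(found) & nzchar(found)]
}

hits <- vep_on_path()
if (length(hits) == 0L) {
  message("No VEP executable found on the PATH; ensemblVEP() is likely to fail.")
} else {
  print(hits)
}
```

An empty result only indicates that none of the assumed names were found; it does not prove that VEP is absent.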

Value

Default behavior returns a GRanges object. Options can be set to return a VCF object or write a file to disk.

Author(s)

Valerie Obenchain

References

Ensembl VEP Home: http://www.ensembl.org/info/docs/tools/vep/index.html

Human Genome Variation Society (HGVS): http://www.hgvs.org/mutnomen/

See Also

VEPParam-class

Examples

  ## -----------------------------------------------------------------------
  ## Results returned as GRanges or VCF objects
  ## -----------------------------------------------------------------------
  ## The default behavior returns a GRanges with the consequence
  ## data as metadata columns.
  file <- system.file("extdata", "ex2.vcf", package="VariantAnnotation") 
  ## Not run: 
  gr <- ensemblVEP(file)
  gr[1:3]
  
## End(Not run)
  ## When the 'vcf' option is TRUE, a VCF object is returned.
  myparam <- VEPParam(dataformat=c(vcf=TRUE))
  vcf <- ensemblVEP(file, param=myparam)
  vcf
 
  ## The consequence data are returned as the 'CSQ' column in info.
  info(vcf)$CSQ
 
  ## To parse this column use parseCSQToGRanges().
  csq <- parseCSQToGRanges(vcf)
  head(csq, 4)
 
  ## The columns returned are controlled by the 'fields' option. 
  ## By default all fields are returned. See ?VEPParam for details.
 
  ## When comparing ensemblVEP() results to the data in the
  ## input vcf we see variant 20:1230237 was not returned.
  vcf_input <- readVcf(file, "hg19")
  rowRanges(vcf_input)
  rowRanges(vcf)
 
  ## This variant has no alternate allele and is called a
  ## monomorphic reference. The Ensembl VEP automatically
  ## drops these variants. 
  rowRanges(vcf)[,c("REF", "ALT")]
 
  ## -----------------------------------------------------------------------
  ## Results written to disk
  ## -----------------------------------------------------------------------
  ## Write a file to disk by providing a path and file name as 'output_file'.
  ## The output file format is selected with the 'dataformat' runtime
  ## options. Pass the resulting VEPParam to ensemblVEP() as 'param'.
 
  ## Write a vcf file to myfile.vcf:
  myparam <- VEPParam(dataformat=c(vcf=TRUE), 
                      input=c(output_file="/path/myfile.vcf"))
  ## Write a gvf file to myfile.gvf:
  myparam <- VEPParam(dataformat=c(gvf=TRUE), 
                      input=c(output_file="/path/myfile.gvf"))
 
  ## -----------------------------------------------------------------------
  ## Runtime options
  ## -----------------------------------------------------------------------
  ## All runtime options are controlled by specifying a VEPParam.
  ## See ?VEPParam for complete details.
  param <- VEPParam()
 
  ## Logical options are turned on/off with TRUE/FALSE. By
  ## default, 'quiet' is FALSE.
  basic(param)$quiet
 
  ## Setting 'quiet' to TRUE suppresses all status messages and warnings.
  basic(param)$quiet <- TRUE
 
  ## Character options are turned on/off by specifying a character 
  ## value or an empty character (i.e., character()). By default no 
  ## 'sift' results are returned.
  output(param)$sift
 
  ## Setting 'sift' to 'b' will return both predictions and scores.
  output(param)$sift <- 'b'
 
  ## Return 'sift' to the original state of no results returned.
  output(param)$sift <- character()
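
The 'CSQ' column shown in the examples above holds pipe-delimited annotation strings. As a rough illustration of that layout, the base R sketch below splits such strings into a data.frame. It is not how parseCSQToGRanges() is implemented and does not replace it. The function name split_csq, the field names, and the annotation strings are all hypothetical and invented for illustration; real field names and their order come from the header of the VCF written by VEP.

```r
## Hypothetical field names and annotation strings, for illustration only.
csq_fields  <- c("Allele", "Gene", "Feature", "Consequence")
csq_entries <- c("T|GENE1|TRANSCRIPT1|missense_variant",
                 "T|GENE1|TRANSCRIPT2|intron_variant")

## Split pipe-delimited strings into a data.frame, one row per string.
split_csq <- function(entries, fields) {
  rows <- lapply(entries, function(entry) {
    ## strsplit() drops a trailing empty field, so append one extra
    ## delimiter to keep the field count stable.
    parts <- strsplit(paste0(entry, "|"), "|", fixed = TRUE)[[1]]
    stopifnot(length(parts) == length(fields))
    setNames(parts, fields)
  })
  as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
}

csq_df <- split_csq(csq_entries, csq_fields)
print(csq_df)
```

For real results, parseCSQToGRanges() remains the appropriate tool, since it also ties each annotation back to its genomic range.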