parseCSQToGRanges: Parse the CSQ column of a VCF object into a GRanges object

parseCSQToGRanges R Documentation

Description

Parse the CSQ column in a VCF object returned from the Ensembl Variant Effect Predictor (VEP).

**This method was rescued following the deprecation of the package ensemblVEP in the Bioconductor release 3.20.**

Usage

## S4 method for signature 'VCF'
parseCSQToGRanges(x, VCFRowID=character(),
    ..., info.key = "CSQ")

Arguments

x

A VCF object.

VCFRowID

A character vector of rownames from the original VCF. When provided, the result includes a metadata column named ‘VCFRowID’ which maps the result back to the row (variant) in the original VCF.

When VCFRowID is not provided no ‘VCFRowID’ column is included.

info.key

The name of the INFO key to which VEP writes the consequences in its output (default is CSQ). Set this only if something other than CSQ was passed to the --vcf_info_field flag in the VEP output options.

...

Arguments passed to other methods. Currently not used.

Details

When ensemblVEP returns a VCF object, the consequence data are returned unparsed in the 'CSQ' INFO column. parseCSQToGRanges parses these data into a GRanges object that is expanded to match the dimension of the 'CSQ' data. Because each variant can have multiple matches, the ranges in the GRanges are repeated.

If rownames from the original VCF are provided as VCFRowID, a metadata column is included in the result that maps back to the row (variant) in the original VCF. This option applies only when the info.key field contains data (is not empty).

If no info.key column is found the function returns the data in rowRanges().
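The expansion described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the header description, variant names, and annotation strings below are hypothetical toy data, and the result is a plain data.frame rather than a GRanges. It only shows how pipe-delimited annotations might be split into columns and how repeated row identifiers map each annotation back to its variant.

```r
# Toy inputs (hypothetical): a VEP-style header description and, for each
# variant, a character vector holding one string per annotation.
description <- "Consequence annotations from Ensembl VEP. Format: Allele|Consequence|SYMBOL"
csq <- list(
  rs1 = c("A|missense_variant|GENE1", "A|intron_variant|GENE2"),
  rs2 = "T|synonymous_variant|GENE3"
)

# Field names are taken from the text after "Format: " in the description.
fields <- strsplit(sub(".*Format: ", "", description), "|", fixed = TRUE)[[1]]

# Flatten the annotations and record which variant each one came from,
# so variants with several annotations are repeated.
flat <- unlist(csq, use.names = FALSE)
row_id <- rep(names(csq), lengths(csq))

# Split each annotation on "|"; pad with NA because strsplit() drops
# trailing empty fields.
parts <- strsplit(flat, "|", fixed = TRUE)
mat <- t(vapply(parts,
                function(p) { length(p) <- length(fields); p },
                character(length(fields))))

# One row per annotation, with a column mapping back to the variant.
parsed <- data.frame(VCFRowID = row_id, mat, stringsAsFactors = FALSE)
names(parsed)[-1] <- fields
print(parsed)
```

In this sketch the variant rs1 appears twice because it carries two annotations, which mirrors how ranges are repeated in the returned GRanges.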

Value

Returns a GRanges object with consequence data as the metadata columns. If the info.key column ('CSQ' by default) is not found, the GRanges from rowRanges() is returned.

Author(s)

Valerie Obenchain, Kevin Rue-Albrecht

References

Ensembl VEP Home: http://uswest.ensembl.org/info/docs/tools/vep/index.html

Examples

  library(VariantAnnotation)
  library(TVTB)  # package documenting parseCSQToGRanges
  file <- system.file("extdata", "moderate.vcf", package = "TVTB")
  vep <- readVcf(file)

  ## The returned 'CSQ' data are unparsed.
  info(vep)$CSQ

  ## Parse into a GRanges and include the 'VCFRowID' column.
  vcf <- readVcf(file, "hg19")
  csq <- parseCSQToGRanges(vep, VCFRowID=rownames(vcf))
  csq[1:4]

kevinrue/TVTB documentation built on July 9, 2024, 11:42 p.m.