initUTRAnnotation: Query transcript regions and sequences from the Ensembl database

View source: R/run_annotation.R

Description

initUTRAnnotation queries transcript regions, UTRs, and coding sequences from the Ensembl database. runUTRAnnotation then uses this information to perform UTR annotation.

Usage

initUTRAnnotation(
  variantFile,
  species,
  ensemblVersion,
  getTranscript = TRUE,
  format = "csv",
  dataDir = NULL,
  verbose = FALSE
)

Arguments

variantFile

A CSV file with the columns Chr, Pos, Ref, and Alt

species

Either "human" or "mouse"

ensemblVersion

(optional) A number specifying which version of the Ensembl annotation to use. By default, the latest version is used.

getTranscript

(optional) Whether to get the IDs of the transcripts that overlap the variants. If the number of variants is very large (for example, more than 100,000), set it to FALSE and do this step in runUTRAnnotation on each partition in parallel.

format

(optional) Format of the variant file, either "csv" or "vcf". The default is "csv".

dataDir

(optional) Path to the directory in which to store the database information. If not specified, a folder named after the input variant file with a "db_" prefix is created.

verbose

(optional) Whether to print diagnostic messages. The default is FALSE.
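
As an illustration of the variantFile argument, the following minimal sketch writes a small CSV file with the four columns named above, using only base R. The coordinates, alleles, chromosome naming ("chr1" style), and file name are placeholders chosen for illustration; they are assumptions, not values taken from the package.

```r
# Illustrative variants only: positions, alleles, and the "chr" prefix
# are placeholders, not values confirmed by the package documentation.
variants <- data.frame(
  Chr = c("chr1", "chr2"),
  Pos = c(1000L, 2000L),
  Ref = c("A", "G"),
  Alt = c("T", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical output location in the session's temporary directory.
variant_file <- file.path(tempdir(), "my_variants.csv")
write.csv(variants, variant_file, row.names = FALSE)

# Read the header back to confirm the column names.
written_columns <- colnames(read.csv(variant_file, stringsAsFactors = FALSE))
print(written_columns)
```

The resulting path could then be supplied as variantFile with format = "csv".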

Value

A variant table with Transcript column which contains the ids of the transcripts that overlap with the variants


Examples

test_variant_file <- system.file("extdata", "variants_sample.csv", package = "utr.annotation")
initUTRAnnotation(variantFile = test_variant_file,
                  species = "human",
                  ensemblVersion = 93,
                  dataDir = "test_db")
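
For very large inputs, the getTranscript argument description suggests setting it to FALSE and handling the variants in partitions. The sketch below shows one possible way to split a variant table into fixed-size partitions and write each to its own CSV file with base R. The table contents, partition size, and file names are hypothetical, and how the partitions are then passed to runUTRAnnotation is not shown here; consult that function's documentation.

```r
# Hypothetical variant table standing in for a large input.
variants <- data.frame(
  Chr = rep("chr1", 10),
  Pos = seq(1000L, by = 100L, length.out = 10),
  Ref = rep("A", 10),
  Alt = rep("T", 10),
  stringsAsFactors = FALSE
)

# Hypothetical partition size; choose one that suits the available resources.
partition_size <- 4L

# Assign consecutive rows to partitions: rows 1-4, rows 5-8, rows 9-10.
partition_id <- ceiling(seq_len(nrow(variants)) / partition_size)
partitions <- split(variants, partition_id)

# Write each partition to its own CSV file in a temporary directory.
out_dir <- file.path(tempdir(), "variant_partitions")
dir.create(out_dir, showWarnings = FALSE)
partition_files <- vapply(names(partitions), function(id) {
  path <- file.path(out_dir, paste0("variants_part_", id, ".csv"))
  write.csv(partitions[[id]], path, row.names = FALSE)
  path
}, character(1))
```

Splitting by row index keeps every variant in exactly one partition, so the per-partition results can be combined afterwards without duplicates.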

utr.annotation documentation built on Aug. 23, 2021, 9:06 a.m.