initUTRAnnotation: Query transcripts regions and sequences from Ensembl database
In utr.annotation: Annotate Variants in the Untranslated Regions


initUTRAnnotation queries transcript regions, UTRs, and coding sequences from the Ensembl database. runUTRAnnotation then uses this information to perform UTR annotation.

initUTRAnnotation(
  variantFile,
  species,
  ensemblVersion,
  getTranscript = TRUE,
  format = "csv",
  dataDir = NULL,
  verbose = FALSE
)

`variantFile`	Path to the variant file, a CSV file with the columns Chr, Pos, Ref, and Alt (see `format` for VCF input).
`species`	Either "human" or "mouse".
`ensemblVersion`	(optional) A number specifying which version of the Ensembl annotation to use. By default, the latest version is used.
`getTranscript`	(optional) Whether to get the IDs of the transcripts that overlap with the variants. If the number of variants is very large (for example, more than 100,000), set it to FALSE and do this step in runUTRAnnotation on each partition in parallel.
`format`	(optional) Format of the variant file, either "csv" or "vcf". The default is "csv".
`dataDir`	(optional) Path to the folder in which to store the database information. If not specified, a folder named after the input variant file with a "db_" prefix is created.
`verbose`	Whether to print diagnostic messages. The default is FALSE.
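
The CSV layout expected by `variantFile` can be sketched in base R as follows. Only the column names Chr, Pos, Ref, and Alt come from the argument description above. The variant values are hypothetical placeholders, and the chromosome naming style (for example, whether a "chr" prefix is expected) is an assumption that should be checked against the sample file shipped with the package.

```r
# Hypothetical example variants; the values are placeholders, not real data.
# The "chr" prefix is an assumption, not something this page specifies.
variants <- data.frame(
  Chr = c("chr1", "chr2"),
  Pos = c(1000L, 2000L),
  Ref = c("A", "G"),
  Alt = c("T", "C"),
  stringsAsFactors = FALSE
)

# Write the table to a temporary CSV file without row names,
# so the header is exactly Chr, Pos, Ref, Alt.
variant_csv <- tempfile(fileext = ".csv")
write.csv(variants, variant_csv, row.names = FALSE)

# Read the file back to confirm the header matches the expected columns.
check <- read.csv(variant_csv, stringsAsFactors = FALSE)
stopifnot(identical(colnames(check), c("Chr", "Pos", "Ref", "Alt")))
```

The resulting path in `variant_csv` could then be passed as `variantFile` with `format = "csv"`.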

A variant table with a Transcript column that contains the IDs of the transcripts that overlap with the variants.

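# Locate the sample variant file shipped with the package
test_variant_file <- system.file("extdata", "variants_sample.csv", package = "utr.annotation")
# Query Ensembl version 93 for human and store the database information in "test_db"
initUTRAnnotation(variantFile = test_variant_file,
                  species = "human",
                  ensemblVersion = 93,
                  dataDir = "test_db")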