makeGRangesFromEnsembl: Make GRanges from Ensembl

View source: R/makeGRangesFromEnsembl.R

Description

Quickly obtain gene and transcript annotations from Ensembl using AnnotationHub and ensembldb.

Usage

makeGRangesFromEnsembl(
  organism,
  level = c("genes", "transcripts"),
  genomeBuild = NULL,
  release = NULL,
  ignoreTxVersion = TRUE
)

annotable(
  organism,
  level = c("genes", "transcripts"),
  genomeBuild = NULL,
  release = NULL,
  ignoreTxVersion = TRUE
)

Arguments

organism

character(1). Full Latin organism name (e.g. "Homo sapiens").

level

character(1). Return annotations at the gene ("genes") or transcript ("transcripts") level.

genomeBuild

character(1). Ensembl genome build assembly name (e.g. "GRCh38"). If NULL, defaults to the most recent build available. Note: don't pass in UCSC build IDs (e.g. "hg38").

release

integer(1). Ensembl release version (e.g. 90). If NULL, defaults to the most recent release available.

ignoreTxVersion

logical(1). Don't include the transcript version in the identifier. Only applies when level = "transcripts". This simplifies identifier matching when generating a tx2gene file.
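
To illustrate what dropping the transcript version means, here is a minimal base R sketch. The identifiers and the regular expression are illustrative assumptions, not necessarily how the package strips versions internally.

```r
# Example versioned Ensembl transcript identifiers (illustrative input only).
txVersioned <- c(
    "ENST00000456328.2",
    "ENST00000450305.2",
    "ENST00000488147.1"
)

# Remove the trailing ".<digits>" version suffix.
# This regex is an assumption for illustration; the package may differ.
txUnversioned <- sub(
    pattern = "\\.[0-9]+$",
    replacement = "",
    x = txVersioned
)

print(txUnversioned)
```

Unversioned identifiers match regardless of which annotation release produced them, which is why they are convenient as tx2gene keys.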

Details

Simply specify the desired organism, using the full Latin name. For example, we can obtain human annotations with "Homo sapiens". Optionally, specific Ensembl genome builds (e.g. GRCh38) and release versions (e.g. 87) are supported.

Under the hood, this function fetches annotations from AnnotationHub using the ensembldb package. AnnotationHub supports versioned Ensembl releases, back to version 87.

Genome build: use "GRCh38" instead of "hg38" for the genome build, since we're querying Ensembl and not UCSC.
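
As a sketch of pinning both the genome build and the release, the call below uses only the arguments documented above. Loading the function from a package named freerange is an assumption, and the call needs network access to AnnotationHub.

```r
# Assumption: the function is exported by the freerange package.
library(freerange)

# Request gene-level annotations for a specific Ensembl build and release.
# Use the Ensembl assembly name ("GRCh38"), not the UCSC ID ("hg38").
x <- makeGRangesFromEnsembl(
    organism = "Homo sapiens",
    level = "genes",
    genomeBuild = "GRCh38",
    release = 87L
)

summary(x)
```

Pinning the release keeps results reproducible, since leaving it NULL resolves to whichever release is most recent at run time.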

Value

GRanges.

Functions

Broad class definitions

For gene and transcript annotations, a broadClass column is added, which generalizes the gene types into a smaller number of semantically meaningful groups.

GRCh37 (hg19) legacy annotations

makeGRangesFromEnsembl() supports the legacy Homo sapiens GRCh37 (release 75) build by internally querying the EnsDb.Hsapiens.v75 package. Alternatively, the corresponding GTF/GFF file can be loaded directly from GENCODE or Ensembl.
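
For reference, the EnsDb.Hsapiens.v75 package can also be queried directly with ensembldb. This is a minimal sketch that assumes both packages are installed from Bioconductor; it shows the general approach, not the function's exact internal code.

```r
library(ensembldb)
library(EnsDb.Hsapiens.v75)

# The package exposes an EnsDb object with the same name as the package.
edb <- EnsDb.Hsapiens.v75

# Confirm the organism and Ensembl release of the database.
organism(edb)
ensemblVersion(edb)

# Extract gene-level annotations as a GRanges object.
geneRanges <- genes(edb)
summary(geneRanges)
```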

AnnotationHub queries

Here's how to perform manual, customized AnnotationHub queries.

library(AnnotationHub)
library(ensembldb)
ah <- AnnotationHub()

# Human ensembldb (EnsDb) records.
ahs <- query(
    x = ah,
    pattern = c(
        "Homo sapiens",
        "GRCh38",
        "Ensembl",
        "EnsDb"
    )
)
mcols(ahs)
print(ahs)
# EnsDb (Ensembl GRCh38 94; 2018-10-11)
ah[["AH64923"]]

# Human UCSC TxDb records.
ahs <- query(
    x = ah,
    pattern = c(
        "Homo sapiens",
        "UCSC",
        "TxDb",
        "knownGene"
    )
)
mcols(ahs)
print(ahs)
# TxDb (UCSC hg38 GENCODE 24; 2016-12-22)
ah[["AH52260"]]

Note

Updated 2019-08-21.

See Also

Examples

## Genes
x <- makeGRangesFromEnsembl("Homo sapiens", level = "genes")
summary(x)

## Transcripts
x <- makeGRangesFromEnsembl("Homo sapiens", level = "transcripts")
summary(x)

acidgenomics/freerange documentation built on Jan. 8, 2020, 3:45 a.m.