EnsDb-seqlevels: Support for other than Ensembl seqlevel style

seqlevelsStyleR Documentation

Support for other than Ensembl seqlevel style

Description

The methods and functions on this help page allow to integrate EnsDb objects and the annotations they provide with other Bioconductor annotation packages that base on chromosome names (seqlevels) that are different from those defined by Ensembl.

Usage


## S4 method for signature 'EnsDb'
seqlevelsStyle(x)

## S4 replacement method for signature 'EnsDb'
seqlevelsStyle(x) <- value

## S4 method for signature 'EnsDb'
supportedSeqlevelsStyles(x)

Arguments

(In alphabetic order)

value

For seqlevelsStyle<-: a character string specifying the seqlevels style that should be set. Use the supportedSeqlevelsStyle to list all available and supported seqlevel styles. As an alternative, it is also possible to submit a data.frame with custom mapping. This needs to have two columns, one of them being called "Ensembl" with the original chromosome names and another column with the new names.

x

An EnsDb instance.

Value

For seqlevelsStyle: see method description above.

For supportedSeqlevelsStyles: see method description above.

Methods and Functions

seqlevelsStyle

Get the style of the seqlevels in which results returned from the EnsDb object are encoded. By default, and internally, seqnames as provided by Ensembl are used.

The method returns a character string specifying the currently used seqlevelstyle.

seqlevelsStyle<-

Change the style of the seqlevels in which results returned from the EnsDb object are encoded. Changing the seqlevels helps integrating annotations from EnsDb objects e.g. with annotations from packages that base on UCSC annotations. The function also supports using/defining custom mappings by submitting a mapping data.frame (see examples below).

supportedSeqlevelsStyles

Lists all seqlevel styles for which mappings between seqlevel styles are available in the GenomeInfoDb package.

The method returns a character vector with supported seqlevel styles for the organism of the EnsDb object.

Note

The mapping between different seqname styles is performed based on data provided by the GenomeInfoDb package. Note that in most instances no mapping is provided for seqnames other than for primary chromosomes. By default functions from the ensembldb package return the original seqname is in such cases. This behaviour can be changed with the ensembldb.seqnameNotFound global option. For the special keyword "ORIGINAL" (the default), the original seqnames are returned, for "MISSING" an error is thrown if a seqname can not be mapped. In all other cases, the value of the option is returned as seqname if no mapping is available (e.g. setting options(ensembldb.seqnameNotFound=NA) returns an NA if the seqname is not mappable).

Author(s)

Johannes Rainer

See Also

EnsDb transcripts

Examples


library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Get the internal, default seqlevel style.
seqlevelsStyle(edb)

## Get the seqlevels from the database.
seqlevels(edb)

## Get all supported mappings for the organism of the EnsDb.
supportedSeqlevelsStyles(edb)

## Change the seqlevels to UCSC style.
seqlevelsStyle(edb) <- "UCSC"
seqlevels(edb)

## Change the option ensembldb.seqnameNotFound to return NA in case
## the seqname can not be mapped form Ensembl to UCSC.
options(ensembldb.seqnameNotFound = NA)

seqlevels(edb)

## Defining custom mapping for chromosome names. The `data.frame` should have
## one column named `"Ensembl"` with the original name and an additional column
## with the new names
mymap <- data.frame(Ensembl = c(4, 7, 9, 10),
                    myway = c("a", "b", "c", "d"))
seqlevelsStyle(edb) <- mymap
seqlevels(edb)

## This allows us also to rename individual chromosomes but keeping all
## original names for the others.
options(ensembldb.seqnameNotFound = "ORIGINAL")
seqlevels(edb)


jotsetung/ensembldb documentation built on Aug. 21, 2024, 11:23 a.m.