ensemblGenome-class: Class "ensemblGenome"


Description

ensemblGenome represents Ensembl genomic annotation data.

Objects from the Class

Objects can be created by calls of the form ensemblGenome(dbfile), where 'dbfile' names an SQLite database file.

Slots

basedir:

Object of class "character". Directory where the SQLite database is written.

ev:

Object of class "environment". Environment that contains the data structures, optionally including the gtf and attr data.frames.

Methods

show

signature(object = "refGenome"): Creates a sensible printout.

getGtf

signature(object = "refGenome"): Returns content of gtf table.

setGtf

signature(object = "refGenome"): Writes content of gtf table.

getAttr

signature(object = "refGenome"): Returns content of attribute table.

getGeneTable

signature(object = "refGenome"): Returns content of genes table when table exists. Otherwise NULL is returned.

setAttr

signature(object = "refGenome"): Writes content of attribute table.

read.gtf

signature(object, filename="transcripts.gtf", sep = "\t", useBasedir=TRUE, comment.char = "#", progress=100000L, ...): Imports content of gtf file. This is the basic mechanism for data import. It works the same way for ucscGenome and for ensemblGenome.

extractPaGenes

signature(object="ensemblGenome"): Extracts all annotations on primary assembly. The function returns a data.frame. Used as shortcut to directly extract a table from gtf files.

extractFeature

signature(object="ensemblGenome"): Extracts annotated positions whose feature type matches the given 'feature' argument. Returns an 'ensemblGenome' object.

extractByGeneName

signature(object="ensemblGenome", geneNames="character"): Extracts ensemblGenome object which contains table subsets. When none of the geneNames matches, the function returns NULL.

extractTranscript

signature(object="ensemblGenome", transcripts="character"): Extracts ensemblGenome object which contains table subsets.

getGenePositions

signature(object="ucscGenome", force="logical"): Extracts a table with position data for whole genes (smallest exon start position and largest exon end position). A copy of the table is placed inside the internal environment. On subsequent calls only a copy of the stored table is returned unless force=TRUE is given. With force=TRUE, new gene positions are calculated regardless of existing tables.

getGeneTable

signature(object="ucscGenome"): Returns data.frame containing gene-specific data.

tableTranscript.name

signature(object="ensemblGenome"): Extracts table object which contains tabled 'transcript_name' column of gtf table.

tableTranscript.id

signature(object="ensemblGenome"): Extracts table object which contains tabled 'transcript_id' column of gtf table.

writeDB

signature(object = "refGenome"): Copies content of gtf, attr and xref table to database.
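
The caching behaviour described for getGenePositions can be pictured with a small base R sketch. This is not the package's implementation: the toy table, its column names (gene_name, seqid, start, end) and the helper gene_positions are assumptions made only for illustration. The sketch takes the smallest exon start and largest exon end per gene, stores the result in an environment, and recomputes it only when force=TRUE.

```r
# Toy stand-in for exon rows of a gtf table (column names are assumptions).
gtf <- data.frame(
  gene_name = c("A", "A", "B", "B", "B"),
  seqid     = c("1", "1", "2", "2", "2"),
  start     = c(100L, 300L, 50L, 500L, 900L),
  end       = c(200L, 450L, 80L, 650L, 990L),
  stringsAsFactors = FALSE
)

# Environment playing the role of the 'ev' slot.
ev <- new.env()
assign("gtf", gtf, envir = ev)

# Hypothetical helper: gene range = smallest exon start, largest exon end.
gene_positions <- function(ev, force = FALSE) {
  # Return the stored table unless a recalculation is forced.
  if (!force && exists("genes", envir = ev, inherits = FALSE)) {
    return(get("genes", envir = ev))
  }
  gtf    <- get("gtf", envir = ev)
  starts <- aggregate(start ~ gene_name + seqid, data = gtf, FUN = min)
  ends   <- aggregate(end ~ gene_name + seqid, data = gtf, FUN = max)
  genes  <- merge(starts, ends, by = c("gene_name", "seqid"))
  assign("genes", genes, envir = ev)   # keep a copy inside the environment
  genes
}

gp        <- gene_positions(ev)   # calculated and stored in ev
gp_cached <- gene_positions(ev)   # served from the stored table

# Change the underlying data: the stored table is reused until force = TRUE.
ev$gtf$end[5] <- 1200L
stale <- gene_positions(ev)
fresh <- gene_positions(ev, force = TRUE)
```

In this sketch, stale still reports the old end position for gene "B", while fresh reflects the modified table. That is the situation in which force=TRUE is needed.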

Author(s)

Wolfgang Kaisers

References

http://www.ensembl.org/info/data/ftp/index.html
http://mblab.wustl.edu/GTF22.html#fields

Examples

##-------------------------------------##
## Create an instance from scratch
## Real data:
## ftp://ftp.ensembl.org/pub/release-80/gtf/homo_sapiens/Homo_sapiens.GRCh38.80.gtf.gz
##-------------------------------------##
ens <- ensemblGenome()
basedir(ens) <- system.file("extdata",package="refGenome")
ens_gtf <- "hs.ensembl.62.small.gtf"
read.gtf(ens,ens_gtf)
# Load a previously saved genome:
ensfile <- system.file("extdata", "hs.ensembl.62.small.RData", package="refGenome")
ens <- loadGenome(ensfile)

##-------------------------------------##
## Saving and loading
## Save as R-image (fast loading)
##-------------------------------------##

basedir(ens) <- getwd()
saveGenome(ens, "hs.ensembl.62.small.RData", useBasedir=FALSE)
enr <- loadGenome("hs.ensembl.62.small.RData")


##-------------------------------------##
## Save as SQLite database
## Commented out because RSQLite
## seems to produce memory leaks
##-------------------------------------##

# writeDB(ens, filename="ens62.db3", useBasedir=FALSE)
# edb <- loadGenomeDb(filename="ens62.db3")


##-------------------------------------##
## Extract data for Primary Assembly seqids
##-------------------------------------##
enpa <- extractSeqids(ens,ensPrimAssembly())
# Tabulate all features in the 'gtf' table
tableFeatures(enpa)
# Extract exon features for the primary assembly
enpafeat <- extractFeature(enpa, "exon")
# Shortcut. Returns a data.frame
engen <- extractPaGenes(ens)

##-------------------------------------##
## Extract data for individual Genes
##-------------------------------------##
ddx <- extractByGeneName(ens, "DDX11L1")
ddx
tableTranscript.id(ddx)
tableTranscript.name(ddx)
fam <- extractTranscript(ens, "ENST00000417324")
fam
# Extract range limits of entire Genes
gp <- getGenePositions(ens)
gp
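
The gene-name subsetting and transcript tabling used in the examples can also be pictured on a plain data.frame. The following base R sketch does not use the package. The helper subset_by_gene, the column names and the transcript identifiers "T1" to "T3" are made up for illustration. It mirrors two documented behaviours: a NULL result when no gene name matches, and a table of the 'transcript_id' column.

```r
# Toy gtf-like table; transcript ids are placeholders, not real Ensembl ids.
gtf <- data.frame(
  gene_name     = c("DDX11L1", "DDX11L1", "DDX11L1", "WASH7P"),
  transcript_id = c("T1", "T1", "T2", "T3"),
  feature       = c("exon", "exon", "exon", "exon"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows for the requested genes, NULL if none match.
subset_by_gene <- function(gtf, geneNames) {
  hit <- gtf$gene_name %in% geneNames
  if (!any(hit)) {
    return(NULL)
  }
  gtf[hit, , drop = FALSE]
}

ddx    <- subset_by_gene(gtf, "DDX11L1")
counts <- table(ddx$transcript_id)         # rows per transcript id
none   <- subset_by_gene(gtf, "NOT_A_GENE")  # no match, so NULL
```

Checking the result with is.null() before further processing avoids errors when a gene name is misspelled or absent from the annotation.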

refGenome documentation built on May 23, 2019, 1:03 a.m.