ensemblGenome represents Ensembl genomic annotation data.
Objects can be created by calls of the form ensemblGenome(dbfile). 'dbfile' represents an SQLite database file.
basedir: Object of class "character". Directory where the SQLite database is written.
ev: Object of class "environment". Environment that contains data structures. Optionally, there are gtf and attr data.frames.
signature(object = "refGenome"): Creates a sensible printout.
signature(object = "refGenome"): Returns content of gtf table.
signature(object = "refGenome"): Writes content of gtf table.
signature(object = "refGenome"): Returns content of attribute table.
signature(object = "refGenome"): Returns content of genes table when the table exists. Otherwise NULL is returned.
signature(object = "refGenome"): Writes content of attribute table.
read.gtf signature(object, filename="transcripts.gtf", sep = "\t", useBasedir=TRUE, comment.char = "#", progress=100000L, ...): Imports content of a gtf file. This is the basic mechanism for data import. It works the same way for ucscGenome and for ensemblGenome.
extractPaGenes signature(object="ensemblGenome"): Extracts all annotations on the primary assembly. The function returns a data.frame. Used as a shortcut to directly extract a table from gtf files.
extractFeature signature(object="ensemblGenome"): Extracts annotated positions which are classified as the given 'feature' argument. Returns an 'ensemblGenome' object.
extractByGeneName signature(object="ensemblGenome", geneNames="character"): Extracts an ensemblGenome object which contains table subsets. When none of the geneNames matches, the function returns NULL.
extractTranscript signature(object="ensemblGenome", transcripts="character"): Extracts an ensemblGenome object which contains table subsets.
getGenePositions signature(object="ucscGenome", force="logical"): Extracts a table with position data for whole genes (smallest exon start position and largest exon end position). A copy of the table is placed inside the internal environment. On subsequent calls, only a copy of the stored table is returned unless force=TRUE is given. With force=TRUE, gene positions are recalculated regardless of existing tables.
signature(object="ucscGenome"): Returns a data.frame containing gene-specific data.
tableTranscript.name signature(object="ensemblGenome"): Extracts a table object which contains the tabled 'transcript_name' column of the gtf table.
tableTranscript.id signature(object="ensemblGenome"): Extracts a table object which contains the tabled 'transcript_id' column of the gtf table.
writeDB signature(object = "refGenome"): Copies content of gtf, attr and xref tables to the database.
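The import and gene-position mechanisms described in the method list can be illustrated with a small base R sketch. It does not use refGenome and is not the package's implementation: the toy GTF records, the helper names get_attr and gene_positions, and the cached variable name genes are all hypothetical. It only assumes the standard nine tab-separated GTF columns and an environment used as a cache.

```r
## Toy GTF records (hypothetical data, nine tab-separated columns)
gtf_lines <- c(
  "# toy example, not real annotation",
  "1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id \"G1\"; transcript_id \"T1\";",
  "1\ttoy\texon\t300\t450\t.\t+\t.\tgene_id \"G1\"; transcript_id \"T1\";",
  "1\ttoy\texon\t150\t500\t.\t+\t.\tgene_id \"G1\"; transcript_id \"T2\";",
  "2\ttoy\texon\t900\t950\t.\t-\t.\tgene_id \"G2\"; transcript_id \"T3\";"
)

## Import: tab separator and '#' comment lines, as in the read.gtf signature.
## quote = "" keeps the double quotes inside the attributes column intact.
gtf <- read.table(text = gtf_lines, sep = "\t", comment.char = "#",
                  quote = "", stringsAsFactors = FALSE,
                  col.names = c("seqid", "source", "feature", "start", "end",
                                "score", "strand", "frame", "attributes"))

## Hypothetical helper: pull one quoted value out of the attributes column.
## If the key is absent, sub() returns the input string unchanged.
get_attr <- function(attributes, key) {
  pattern <- paste0('.*', key, ' "([^"]*)".*')
  sub(pattern, "\\1", attributes)
}
gtf$gene_id       <- get_attr(gtf$attributes, "gene_id")
gtf$transcript_id <- get_attr(gtf$attributes, "transcript_id")

## Hypothetical helper: gene range = smallest exon start, largest exon end.
## The result is cached in the environment and only recomputed on force = TRUE.
gene_positions <- function(ev, force = FALSE) {
  if (force || !exists("genes", envir = ev, inherits = FALSE)) {
    exons  <- ev$gtf[ev$gtf$feature == "exon", ]
    starts <- aggregate(start ~ gene_id, data = exons, FUN = min)
    ends   <- aggregate(end ~ gene_id, data = exons, FUN = max)
    assign("genes", merge(starts, ends, by = "gene_id"), envir = ev)
  }
  get("genes", envir = ev, inherits = FALSE)
}

ev <- new.env()
ev$gtf <- gtf
gp <- gene_positions(ev)

## Tabulate transcript ids, similar in spirit to tableTranscript.id
tid <- table(gtf$transcript_id)
```

Calling gene_positions(ev) a second time returns the stored table without recomputing it; gene_positions(ev, force = TRUE) rebuilds it from the current content of ev$gtf.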
Wolfgang Kaisers
http://www.ensembl.org/info/data/ftp/index.html
http://mblab.wustl.edu/GTF22.html#fields
##-------------------------------------##
## Create an instance from scratch
## Real data:
## ftp://ftp.ensembl.org/pub/release-80/gtf/homo_sapiens/Homo_sapiens.GRCh38.80.gtf.gz
##-------------------------------------##
ens <- ensemblGenome()
basedir(ens) <- system.file("extdata", package = "refGenome")
ens_gtf <- "hs.ensembl.62.small.gtf"
read.gtf(ens, ens_gtf)
# Load a previously saved genome:
ensfile <- system.file("extdata", "hs.ensembl.62.small.RData", package="refGenome")
ens <- loadGenome(ensfile)
##-------------------------------------##
## Saving and loading
## Save as R-image (fast loading)
##-------------------------------------##
basedir(ens) <- getwd()
saveGenome(ens, "hs.ensembl.62.small.RData", useBasedir=FALSE)
enr <- loadGenome("hs.ensembl.62.small.RData")
##-------------------------------------##
## Save as SQLite database
##-------------------------------------##
## Commented out because RSQlite
## seems to produce memory leaks
##-------------------------------------##
writeDB(ens, filename="ens62.db3", useBasedir=FALSE)
edb <- loadGenomeDb(filename="ens62.db3")
##-------------------------------------##
## Extract data for Primary Assembly seqids
##-------------------------------------##
enpa <- extractSeqids(ens, ensPrimAssembly())
# Tabulate all features in the 'gtf' table
tableFeatures(enpa)
# Extract exon features for the Primary Assembly
enpafeat <- extractFeature(enpa, "exon")
# Shortcut. Returns a data.frame
engen <- extractPaGenes(ens)
##-------------------------------------##
## Extract data for individual Genes
##-------------------------------------##
ddx <- extractByGeneName(ens, "DDX11L1")
ddx
tableTranscript.id(ddx)
tableTranscript.name(ddx)
fam <- extractTranscript(ens, "ENST00000417324")
fam
# Extract range limits of entire Genes
gp <- getGenePositions(ens)
gp