ensemblGenome represents Ensembl genomic annotation data.
Objects can be created by calls of the form ensemblGenome(dbfile).
'dbfile' is the name of an SQLite database file.
basedir: Object of class "character".
Directory where the SQLite database is written.
ev: Object of class "environment".
Environment that contains data structures. Optionally, there are gtf and attr
data.frames.
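The role of the ev slot can be illustrated with a minimal sketch. The class name toyGenome, its tables, and their columns are hypothetical and not part of refGenome; the sketch only shows, under that assumption, why an environment slot lets tables be stored and updated in place.

```r
library(methods)

# Hypothetical toy class (NOT part of refGenome): one character slot and
# one environment slot, mirroring the slot layout described above.
setClass("toyGenome",
         representation(basedir = "character", ev = "environment"))

toy <- new("toyGenome", basedir = tempdir(), ev = new.env())

# Store a small gtf-like table inside the environment.
assign("gtf", data.frame(id = 1:2, seqid = c("1", "1")), envir = toy@ev)

# Environments have reference semantics: a copy of the object shares
# the same environment, so a table added via the copy is visible in both.
toy_copy <- toy
assign("attr", data.frame(id = 1:2, gene_name = c("A", "B")),
       envir = toy_copy@ev)

has_attr <- exists("attr", envir = toy@ev, inherits = FALSE)
```

Because the tables live in an environment rather than in ordinary slots, functions can add or replace them without returning a modified copy of the object.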
signature(object = "refGenome"):
Creates a sensible printout.
signature(object = "refGenome"):
Returns content of gtf table.
signature(object = "refGenome"):
Writes content of gtf table.
signature(object = "refGenome"):
Returns content of attribute table.
signature(object = "refGenome"):
Returns content of genes table when table exists.
Otherwise NULL is returned.
signature(object = "refGenome"):
Writes content of attribute table.
signature(object, filename="transcripts.gtf",
sep = "\t", useBasedir=TRUE, comment.char = "#",
progress=100000L, ...):
Imports content of gtf file.
This is the basic mechanism for data import. It works the
same way for ucscGenome and for ensemblGenome.
signature(object="ensemblGenome"):
Extracts all annotations on primary assembly. The function returns
a data.frame. Used as shortcut to directly extract a table from gtf
files.
signature(object="ensemblGenome"):
Extracts annotated positions whose feature type matches the given 'feature'
argument. Returns an 'ensemblGenome' object.
signature(object="ensemblGenome",
geneNames="character"):
Extracts ensemblGenome object which contains table subsets.
When none of the geneNames matches,
the function returns NULL.
signature(object="ensemblGenome",
transcripts="character"):
Extracts ensemblGenome object which contains table subsets.
signature(object="ucscGenome",
force="logical"):
Extracts table with position data for whole genes
(smallest exon start position and largest exon end position). A copy
of the table is placed inside the internal environment. On
subsequent calls only a copy of the stored table is returned
unless force=TRUE is given. With force=TRUE, new gene
positions are calculated regardless of existing tables.
signature(object="ucscGenome"):
Returns data.frame containing gene-specific data.
signature(object="ensemblGenome"):
Extracts a table object containing the tabulated 'transcript_name' column
of the gtf table.
signature(object="ensemblGenome"):
Extracts a table object containing the tabulated 'transcript_id' column
of the gtf table.
signature(object = "refGenome"):
Copies content of gtf, attr and xref table to database.
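The gene-position method described above (smallest exon start, largest exon end, cached in the internal environment unless force=TRUE) can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper gene_positions, the toy exon table, and its column names are assumptions made for the example.

```r
# Toy exon table; the column names are assumptions chosen for illustration.
exons <- data.frame(
  gene_name = c("A", "A", "B", "B", "B"),
  start     = c(100L, 300L, 50L, 500L, 900L),
  end       = c(200L, 400L, 150L, 600L, 950L),
  stringsAsFactors = FALSE
)

# Environment that plays the role of the internal data store.
ev <- new.env()
assign("gtf", exons, envir = ev)

# Hypothetical helper: compute per-gene ranges and cache them in 'ev'.
gene_positions <- function(ev, force = FALSE) {
  # Return the cached table unless recalculation is forced.
  if (!force && exists("genes", envir = ev, inherits = FALSE)) {
    return(get("genes", envir = ev))
  }
  gtf    <- get("gtf", envir = ev)
  starts <- aggregate(start ~ gene_name, data = gtf, FUN = min)
  ends   <- aggregate(end ~ gene_name, data = gtf, FUN = max)
  genes  <- merge(starts, ends, by = "gene_name")
  assign("genes", genes, envir = ev)
  genes
}

gp       <- gene_positions(ev)                # computes and caches
gp_again <- gene_positions(ev)                # served from the cache
gp_new   <- gene_positions(ev, force = TRUE)  # recalculated
```

Caching trades memory for speed: the aggregation runs once, and later calls only copy the stored table unless force=TRUE asks for a fresh calculation.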
Wolfgang Kaisers
http://www.ensembl.org/info/data/ftp/index.html
http://mblab.wustl.edu/GTF22.html#fields
##-------------------------------------##
## Create an instance from scratch
## Real data:
## ftp://ftp.ensembl.org/pub/release-80/gtf/homo_sapiens/Homo_sapiens.GRCh38.80.gtf.gz
##-------------------------------------##
ens <- ensemblGenome()
basedir(ens) <- system.file("extdata",package="refGenome")
ens_gtf <- "hs.ensembl.62.small.gtf"
read.gtf(ens,ens_gtf)
# Load a previously saved genome:
ensfile <- system.file("extdata", "hs.ensembl.62.small.RData", package="refGenome")
ens <- loadGenome(ensfile)
##-------------------------------------##
## Saving and loading
## Save as R-image (fast loading)
##-------------------------------------##
basedir(ens) <- getwd()
saveGenome(ens, "hs.ensembl.62.small.RData", useBasedir=FALSE)
enr <- loadGenome("hs.ensembl.62.small.RData")
## Save as SQLite database
##-------------------------------------##
## Commented out because RSQLite
## seems to produce memory leaks
##-------------------------------------##
# writeDB(ens, filename="ens62.db3", useBasedir=FALSE)
# edb <- loadGenomeDb(filename="ens62.db3")
##-------------------------------------##
## Extract data for Primary Assembly seqids
##-------------------------------------##
enpa <- extractSeqids(ens,ensPrimAssembly())
# Tabulate all features in the 'gtf' table
tableFeatures(enpa)
# Extract exon features for the primary assembly
enpafeat <- extractFeature(enpa, "exon")
# Shortcut. Returns a data.frame
engen <- extractPaGenes(ens)
##-------------------------------------##
## Extract data for individual Genes
##-------------------------------------##
ddx <- extractByGeneName(ens, "DDX11L1")
ddx
tableTranscript.id(ddx)
tableTranscript.name(ddx)
fam <- extractTranscript(ens, "ENST00000417324")
fam
# Extract range limits of entire Genes
gp <- getGenePositions(ens)
gp