Description Usage Arguments Details Value Note Author(s) See Also Examples
ensembldb
supports most of the filters from the
AnnotationFilter
package to retrieve specific content from
EnsDb
databases. These filters can be passed to
the methods such as genes
with the filter
parameter
or can be added as a global filter to an EnsDb
object
(see addFilter
for more details). Use the
supportedFilters
to list all filters supported for
EnsDb
object.
supportedFilters
returns a data.frame
with the
names of all filters and the corresponding field supported by the
EnsDb
object.
seqnames
: accessor for the sequence names of the
GRanges
object within a GRangesFilter
seqnames
: accessor for the seqlevels
of the
GRanges
object within a GRangesFilter
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | OnlyCodingTxFilter()
ProtDomIdFilter(value, condition = "==")
UniprotDbFilter(value, condition = "==")
UniprotMappingTypeFilter(value, condition = "==")
TxSupportLevelFilter(value, condition = "==")
## S4 method for signature 'EnsDb'
supportedFilters(object, ...)
## S4 method for signature 'GRangesFilter'
seqnames(x)
## S4 method for signature 'GRangesFilter'
seqlevels(x)
|
value |
The value(s) for the filter. For
|
condition |
|
object |
For |
... |
For |
x |
For |
ensembldb
supports the following filters from the
AnnotationFilter
package:
filter based on the Ensembl gene ID.
filter based on the name of the gene as provided by Ensembl. In most cases this will correspond to the official gene symbol.
filter based on the gene names. EnsDb
objects don't
have a dedicated symbol column, the filtering is hence based on the
gene names.
filter based on the biotype of genes (e.g. "protein_coding"
).
filter based on the genomic start coordinate of genes.
filter based on the genomic end coordinate of genes.
filter based on the genes' NCBI Entrezgene ID.
filter based on the Ensembld transcript ID.
filter based on the Ensembld transcript ID; no transcript names are
provided in EnsDb
databases.
filter based on the transcripts' biotype.
filter based on the genomic start coordinate of the transcripts.
filter based on the genonic end coordinates of the transcripts.
filter based on Ensembl exon IDs.
filter based on the index/rank of the exon within the transcrips.
filter based on the genomic start coordinates of the exons.
filter based on the genomic end coordinates of the exons.
Allows to fetch features within or overlapping specified genomic region(s)/
range(s). This filter takes a GRanges
object
as input and, if type = "any"
(the default) will restrict
results to features (genes, transcripts or exons) that are partially
overlapping the region. Alternatively, by specifying
condition = "within"
it will return features located within the
range. In addition, the GRangesFilter
supports condition = "start"
, condition = "end"
and
condition = "equal"
filtering for features with the same start or
end coordinate or that are equal to the GRanges
.
Note that the type of feature on which the filter is applied depends on
the method that is called, i.e. genes
will filter on the
genomic coordinates of genes, transcripts
on those of
transcripts and exons
on exon coordinates.
Calls to the methods exonsBy
, cdsBy
and
transcriptsBy
use the start and end coordinates of the
feature type specified with argument by
(i.e. "gene"
,
"transcript"
or "exon"
) for the filtering.
If the specified GRanges
object defines multiple regions, all
features within (or overlapping) any of these regions are returned.
Chromosome names/seqnames can be provided in UCSC format (e.g.
"chrX"
) or Ensembl format (e.g. "X"
); see
seqlevelsStyle
for more information.
filter based on chromosome names.
filter based on the chromosome strand. The strand can be specified with
value = "+"
, value = "-"
, value = -1
or
value = 1
.
filter based on Ensembl protein IDs. This filter is only supported if the
EnsDb
provides protein annotations; use the
hasProteinData
method to evaluate.
filter based on Uniprot IDs. This filter is only supported if the
EnsDb
provides protein annotations; use the
hasProteinData
method to evaluate.
In addition, the following filters are defined by ensembldb
:
allows to filter results using the provided transcript support level.
Support levels for transcripts are defined by Ensembl based on the
available evidences for a transcript with 1 being the highest evidence
grade and 5 the lowest level. This filter is only supported on
EnsDb
databases with a db schema version higher 2.1.
allows to filter results based on the specified Uniprot database name(s).
allows to filter results based on the mapping method/type that was used to assign Uniprot IDs to Ensembl protein IDs.
allows to retrieve entries from the database matching the provided filter criteria based on their protein domain ID (protein_domain_id).
allows to retrieve entries only for protein coding transcripts, i.e. transcripts with a CDS. This filter does not take any input arguments.
For ProtDomIdFilter
: A ProtDomIdFilter
object.
For UniprotDbFilter
: A UniprotDbFilter
object.
For UniprotMappingTypeFilter
: A
UniprotMappingTypeFilter
object.
For TxSupportLevel
: A
TxSupportLevel
object.
For supportedFilters
: a data.frame
with the names and
the corresponding field of the supported filter classes.
For users of ensembldb
version < 2.0: in the
GRangesFilter
from the
AnnotationFilter
package the condition
parameter was
renamed to type
(to be consistent with the IRanges
package)
. In addition, the condition = "overlapping"
is no longer
recognized. To retrieve all features overlapping the range
type = "any"
has to be used.
Protein annotation based filters can only be used if the
EnsDb
database contains protein annotations, i.e.
if hasProteinData
is TRUE
. Also, only protein coding
transcripts will have protein annotations available, thus, non-coding
transcripts/genes will not be returned by the queries using protein
annotation filters.
Johannes Rainer
supportedFilters
to list all filters supported for EnsDb
objects.
listUniprotDbs
and listUniprotMappingTypes
to
list all Uniprot database names respectively mapping method types from
the database.
GeneIdFilter
for more details on the
filter objects.
genes
, transcripts
, exons
,
listGenebiotypes
, listTxbiotypes
.
addFilter
for globally adding filters to an EnsDb
object.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 | ## Create a filter that could be used to retrieve all informations for
## the respective gene.
gif <- GeneIdFilter("ENSG00000012817")
gif
## Create a filter for a chromosomal end position of a gene
sef <- GeneEndFilter(10000, condition = ">")
sef
## For additional examples see the help page of "genes".
## Example for GRangesFilter:
## retrieve all genes overlapping the specified region
grf <- GRangesFilter(GRanges("11", ranges = IRanges(114000000, 114000050),
strand = "+"), type = "any")
library(EnsDb.Hsapiens.v75)
edb <- EnsDb.Hsapiens.v75
genes(edb, filter = grf)
## Get also all transcripts overlapping that region.
transcripts(edb, filter = grf)
## Retrieve all transcripts for the above gene
gn <- genes(edb, filter = grf)
txs <- transcripts(edb, filter = GenenameFilter(gn$gene_name))
## Next we simply plot their start and end coordinates.
plot(3, 3, pch=NA, xlim=c(start(gn), end(gn)), ylim=c(0, length(txs)),
yaxt="n", ylab="")
## Highlight the GRangesFilter region
rect(xleft=start(grf), xright=end(grf), ybottom=0, ytop=length(txs),
col="red", border="red")
for(i in 1:length(txs)){
current <- txs[i]
rect(xleft=start(current), xright=end(current), ybottom=i-0.975, ytop=i-0.125, border="grey")
text(start(current), y=i-0.5,pos=4, cex=0.75, labels=current$tx_id)
}
## Thus, we can see that only 4 transcripts of that gene are indeed
## overlapping the region.
## No exon is overlapping that region, thus we're not getting anything
exons(edb, filter = grf)
## Example for ExonRankFilter
## Extract all exons 1 and (if present) 2 for all genes encoded on the
## Y chromosome
exons(edb, columns = c("tx_id", "exon_idx"),
filter=list(SeqNameFilter("Y"),
ExonRankFilter(3, condition = "<")))
## Get all transcripts for the gene SKA2
transcripts(edb, filter = GenenameFilter("SKA2"))
## Which is the same as using a SymbolFilter
transcripts(edb, filter = SymbolFilter("SKA2"))
## Create a ProteinIdFilter:
pf <- ProteinIdFilter("ENSP00000362111")
pf
## Using this filter would retrieve all database entries that are associated
## with a protein with the ID "ENSP00000362111"
if (hasProteinData(edb)) {
res <- genes(edb, filter = pf)
res
}
## UniprotFilter:
uf <- UniprotFilter("O60762")
## Get the transcripts encoding that protein:
if (hasProteinData(edb)) {
transcripts(edb, filter = uf)
## The mapping Ensembl protein ID to Uniprot ID can however be 1:n:
transcripts(edb, filter = TxIdFilter("ENST00000371588"),
columns = c("protein_id", "uniprot_id"))
}
## ProtDomIdFilter:
pdf <- ProtDomIdFilter("PF00335")
## Also here we could get all transcripts related to that protein domain
if (hasProteinData(edb)) {
transcripts(edb, filter = pdf, columns = "protein_id")
}
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.