These functions allow you to download a specific miRBase release, define host genes for the miRNAs of a species, generate an SQLite database containing that information and ultimately build the corresponding annotation package in R.
defineMirhostgenes(gff, database=c("core"), host="ensembldb.ensembl.org",
                   user="anonymous", pass, ensemblapi, verbose=FALSE)

downloadMirbase(version, path=".", force.download=FALSE)

fetchAdditionalInformation(mirbase.path=".", path=".", verbose=FALSE)

getArrayFeaturesForTx(species, arrays=c("HG-U133_Plus_2", "PrimeView"),
                      prop.probes=0.8, max.mm=0, min.probe.algn=24,
                      host="ensembldb.ensembl.org", user="anonymous",
                      pass, ensemblapi, verbose=FALSE)

makeHostgeneSQLiteFromTables(path=".")

makeMirhostgenesPackage(db, version, maintainer, author, destDir=".",
                        license="Artistic-2.0")
arrays |
Character vector specifying the types of microarrays for which probe sets should be searched. Note that these have to correspond to the names of the microarrays as available in the Ensembl databases. |
author |
The author of the package. |
database |
A character vector with the database name(s) that should be
queried. Allowed are |
db |
For |
destDir |
Where the package should be saved to. |
gff |
The gff file name containing the genomic alignments for the
pre-miRNAs. Such gff files (one per species) are located in the
genomes folder of the downloaded miRBase resource (e.g. downloaded
by the |
ensemblapi |
The path to the Ensembl Perl API installed locally on the system. The version of the Ensembl Perl API determines which Ensembl database version is queried. |
force.download |
Force the download of the miRBase release even if the same version is already available locally. |
host |
The hostname to access the Ensembl database. |
license |
The license of the package. |
maintainer |
The maintainer of the package. |
max.mm |
Maximum number of mismatches of a probe with the target genes. |
min.probe.algn |
Minimal length of the probe alignment within the exons of a transcript. The default value of 24 means that all nucleotides of a 25nt long probe have to map within the exons of a transcript. |
mirbase.path |
For |
pass |
The password for the Ensembl database. |
path |
|
prop.probes |
Proportion of probes of a probe set that have to map within the exons of a transcript to be considered. The default value of 0.8 means that at least 80 percent of the probes of a probe set have to target the transcript. |
species |
For |
user |
The username for the Ensembl database. |
verbose |
Whether progress messages should be printed. |
version |
For For |
The downloadMirbase and defineMirhostgenes functions internally call the Perl scripts get-mirbase.pl and define_mirna_host_genes.pl, respectively. The define_mirna_host_genes.pl script needs the Ensembl Perl API and BioPerl to be present in the PERL5LIB environment variable.
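As a minimal sketch of how PERL5LIB could be prepared from within R before the Perl scripts are run: the helper name prependPerl5lib and the two directory paths are hypothetical and only serve as illustration. Sys.getenv, Sys.setenv and .Platform are base R; whether the package expects any further setup is not covered here.

```r
## Hypothetical helper (not part of the package): prepend directories to
## PERL5LIB for the current R session. Processes started from this session
## inherit the modified environment.
prependPerl5lib <- function(dirs) {
    current <- Sys.getenv("PERL5LIB", unset = "")
    ## Keep the existing value, if any, after the new directories.
    parts <- c(dirs, if (nzchar(current)) current)
    Sys.setenv(PERL5LIB = paste(parts, collapse = .Platform$path.sep))
    invisible(Sys.getenv("PERL5LIB"))
}

## Assumed install locations; replace with the paths on your system.
prependPerl5lib(c("/path/to/ensembl/modules", "/path/to/bioperl"))
Sys.getenv("PERL5LIB")
```

Setting the variable in the shell profile instead would make it available to every session; the approach above only affects the running R process and its child processes.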
The fetchAdditionalInformation function extracts additional information, such as confidence information, read counts, pre-miRNA sequences and miRNA family definitions, from the downloaded miRBase database tables and inserts it into tables for the MirhostDb database. This function should be called after defineMirhostgenes and before makeHostgeneSQLiteFromTables.
The getArrayFeaturesForTx function again uses the Ensembl Perl API to fetch, for the defined host transcripts, the microarray features (probe sets) that potentially detect these transcripts.
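To illustrate how the prop.probes and min.probe.algn thresholds described in the arguments interact, the following sketch applies them to invented toy data. It is not the implementation used by getArrayFeaturesForTx (which works through the Ensembl Perl API); the data frame, its column names and the filtering logic are assumptions made for this example only.

```r
## Toy data (invented): one row per probe, with the number of probe
## nucleotides aligned within the exons of a single transcript.
probes <- data.frame(
    probeset = c(rep("ps1", 6), rep("ps2", 5)),
    exon_algn = c(25, 25, 25, 24, 25, 10,  25, 25, 12, 8, 25)
)
prop.probes <- 0.8
min.probe.algn <- 24

## A probe counts as mapping if enough of its nucleotides align within exons.
probes$maps <- probes$exon_algn >= min.probe.algn
## Proportion of mapping probes per probe set.
prop <- tapply(probes$maps, probes$probeset, mean)
## Probe sets that reach the required proportion of mapping probes.
passing <- names(prop)[prop >= prop.probes]
passing
```

In this toy example five of the six probes of ps1 map (about 83 percent), while only three of the five probes of ps2 do (60 percent), so only ps1 would be kept.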
The makeHostgeneSQLiteFromTables function reads all the txt files generated by the defineMirhostgenes function and builds an SQLite database. If the additional files "pre_mirna_sequence.txt" and/or "mirna_fam.txt", created by the functions createPremirnaSequenceTable and createMirfamTable, are also present, the information contained in these files is added to the database too.
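Before building the database it can be useful to see which of the optional files are present. The following is a small sketch using base R only; it assumes the files are looked up in the directory given by path, and the variable names are chosen for illustration.

```r
## Directory holding the generated txt files (assumed: current directory).
path <- "."
## Optional table files named in the text above.
optional <- c("pre_mirna_sequence.txt", "mirna_fam.txt")
## Named logical vector: TRUE for each optional file that exists.
present <- file.exists(file.path(path, optional))
names(present) <- optional
present
```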
The makeMirhostgenesPackage function finally creates an annotation package based on the SQLite file generated above.
A local installation of the Ensembl Perl API is required for defineMirhostgenes. See http://www.ensembl.org/info/docs/api/api_installation.html for installation instructions.
Johannes Rainer
## Not run:
library(mirhostgenes)

## Download miRBase version 20 (matching genome release 37).
downloadMirbase(version=20)

## Define miRNA host genes using the Ensembl core database.
## We're using the gff file for human miRNAs of the miRBase version we
## just downloaded.
## We set verbose=TRUE to get some feedback about the progress.
defineMirhostgenes(gff="20/genomes/hsa.gff3", verbose=TRUE)

## Fetch additional information from the downloaded miRBase files:
## o pre-miRNA sequence data.
## o miRNA family information.
## o pre- and mature miRNA confidence data.
## o pre- and mature miRNA read count data.
fetchAdditionalInformation(mirbase.path="20/")

## Add probe features for Affymetrix microarrays. It is crucial that
## the species matches!
## We also specify from which microarrays we want to fetch the probes/
## probe sets.
getArrayFeaturesForTx(species="human", arrays=c("HG-U133_Plus_2", "PrimeView"))

## Build the SQLite database from the generated txt files.
DBNAME <- makeHostgeneSQLiteFromTables()

## Build an R package providing the annotation database.
makeMirhostgenesPackage(DBNAME,
                        version="0.0.1",
                        maintainer="Johannes Rainer <johannes.rainer@eurac.edu>",
                        author="J Rainer"
                        )

## End(Not run)