Function buildGraphFromKEGGREST makes use of the KEGG REST API (requires internet connection) to build and return the curated KEGG graph.
Function buildDataFromGraph takes as input the KEGG graph generated by buildGraphFromKEGGREST and writes the KEGG knowledge model in the desired permanent directory.
Function loadKEGGdata loads the internal files containing the KEGG knowledge model into a FELLA.DATA object.
In general, buildGraphFromKEGGREST and buildDataFromGraph are one-time executions for a given organism and knowledge model, run in this precise order.
On the other hand, the user needs to run loadKEGGdata in every new R session to load such a model into a FELLA.DATA object.
buildGraphFromKEGGREST(organism = "hsa", filter.path = NULL)

buildDataFromGraph(keggdata.graph = NULL, databaseDir = NULL,
  internalDir = TRUE, matrices = c("hypergeom", "diffusion",
  "pagerank"), normality = c("diffusion", "pagerank"),
  dampingFactor = 0.85, niter = 100)

loadKEGGdata(databaseDir = tail(listInternalDatabases(), 1),
  internalDir = TRUE, loadMatrix = NULL)
organism: Character, KEGG code for the organism of interest.
filter.path: Character vector, pathways to filter. This is a pattern matched using regexp. E.g:
keggdata.graph: An igraph object generated by the function
databaseDir: Character containing the directory to save KEGG files. It is a relative directory inside the library location if
internalDir: Logical, should the directory be internal in the package directory?
matrices: A character vector, containing any of these:
normality: A character vector, containing any of these:
dampingFactor: Numeric value between 0 and 1 (both exclusive), damping factor
niter: Numeric value, number of iterations to estimate the p-values for the CC size. Between 10 and 1e3.
loadMatrix: Character vector to choose if heavy matrices should be loaded. Can contain:
In function buildGraphFromKEGGREST, the user specifies (i) an organism, and (ii) patterns matching
pathways that should not be included as nodes.
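As a rough illustration of how a regular-expression filter of this kind behaves, the base R sketch below matches patterns against a made-up vector of pathway identifiers. The identifiers and the helper drop_matching are hypothetical and only mimic the idea; this is not the internal code of buildGraphFromKEGGREST.

```r
## Hypothetical pathway identifiers (for illustration only)
pathways <- c("mmu00010", "mmu00020", "mmu01100", "mmu01200")

## Hypothetical helper: drop every id that matches any of the patterns
drop_matching <- function(ids, patterns) {
    if (is.null(patterns)) return(ids)
    ## One logical vector per pattern, combined with a logical OR
    hit <- Reduce(`|`, lapply(patterns, grepl, x = ids), rep(FALSE, length(ids)))
    ids[!hit]
}

## A single pattern, several patterns, and no filter at all
kept.one <- drop_matching(pathways, "mmu01100")
kept.many <- drop_matching(pathways, c("01100", "^mmu012"))
kept.all <- drop_matching(pathways, NULL)
```

Since the patterns are regular expressions, a short pattern such as "01100" matches any identifier that contains it, so anchors like "^" or "$" may be needed for exact matches.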
A graph object, as described in [Picart-Armada, 2017],
is built from the comprehensive
KEGG database [Kanehisa, 2017].
As described in the main vignette, accessible through browseVignettes("FELLA"), this graph has five levels that
represent categories of KEGG nodes.
From top to bottom: pathways, modules, enzymes, reactions and compounds.
This knowledge representation resembles the one formerly
used by MetScape [Karnovsky, 2011], in which enzymes connect
to genes instead of modules and pathways.
The necessary KEGG annotations
are retrieved through KEGGREST R package [Tenenbaum, 2013].
Connections between pathways/modules and enzymes are inferred through
organism-specific genes, i.e. an edge is added if a gene
connects both entries.
However, in order to enrich metabolomics data, the user has to
pass the graph object to buildDataFromGraph to build the internal files,
which loadKEGGdata then loads into a FELLA.DATA object.
All the networks are handled with the igraph R package [Csardi, 2006].
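The gene-based inference of edges between pathways and enzymes mentioned above can be sketched with two toy annotation tables in base R. All identifiers below are made up, and the join is only an assumption about how such an inference could be expressed, not the code used by buildGraphFromKEGGREST.

```r
## Toy annotations (hypothetical identifiers, for illustration only)
path2gene <- data.frame(
    pathway = c("path:A", "path:A", "path:B"),
    gene = c("g1", "g2", "g3"),
    stringsAsFactors = FALSE)
enzyme2gene <- data.frame(
    enzyme = c("ec:1.1.1.1", "ec:2.7.1.1", "ec:2.7.1.1"),
    gene = c("g1", "g2", "g3"),
    stringsAsFactors = FALSE)

## Join on the shared gene, then keep one edge per pathway-enzyme pair
joined <- merge(path2gene, enzyme2gene, by = "gene")
edges <- unique(joined[, c("pathway", "enzyme")])
edges <- edges[order(edges$pathway, edges$enzyme), ]
rownames(edges) <- NULL
edges
```

A pathway and an enzyme end up connected as soon as at least one gene is annotated to both, no matter how many genes they share.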
Using buildDataFromGraph is the second step to use the FELLA package.
The knowledge graph is used to compute other internal variables that are
required to run any enrichment.
The main point behind the enrichment is to provide a small
part of the knowledge graph relevant to the supplied metabolites.
This is accomplished through diffusion processes and random walks,
followed by a statistical normalisation,
as described in [Picart-Armada, 2017].
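To give an intuition of the random walk part and of the role of a damping factor, the sketch below runs a personalised PageRank by power iteration on a toy graph in base R. The graph, the helper pagerank_scores and the choice of start nodes are assumptions made for illustration; the sketch does not reproduce FELLA's implementation and leaves out the statistical normalisation.

```r
## Toy undirected graph with 5 nodes, stored as an adjacency matrix
A <- matrix(0, 5, 5, dimnames = list(letters[1:5], letters[1:5]))
edges <- rbind(c("a", "b"), c("a", "c"), c("b", "d"), c("c", "d"), c("d", "e"))
A[edges] <- 1
A[edges[, 2:1]] <- 1

## Column-stochastic transition matrix: node j spreads its score
## evenly over its neighbours
P <- sweep(A, 2, colSums(A), "/")

## Hypothetical helper: personalised PageRank by power iteration
pagerank_scores <- function(P, start, d = 0.85, tol = 1e-10, max.iter = 1000) {
    v <- start / sum(start)
    p <- v
    for (i in seq_len(max.iter)) {
        ## Follow an edge with probability d, restart with probability 1 - d
        p.new <- as.vector(d * (P %*% p) + (1 - d) * v)
        if (sum(abs(p.new - p)) < tol) break
        p <- p.new
    }
    setNames(p.new, rownames(P))
}

## Random walks restart at the "input" nodes a and b
start <- c(a = 1, b = 1, c = 0, d = 0, e = 0)
scores <- pagerank_scores(P, start, d = 0.85)
round(scores, 3)
```

Values of d closer to 1 let the walk travel further from the start nodes before restarting, while smaller values keep the scores concentrated around them.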
When building the internal files,
the user can choose whether to store (i) matrices for each
provided method, and (ii) vectors derived from such matrices
to use the parametric approaches.
These are optional but enable (i) faster permutations and custom
metabolite backgrounds, and (ii) parametric approaches.
WARNING: diffusion and PageRank matrices in (i)
can allocate up to 250MB each.
On the other hand, the niter parameter
controls the number of trials used to approximate the
distribution of the connected component size under
uniform node sampling.
For further info, see the option thresholdConnectedComponent
in the details from ?generateResultsGraph.
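The idea of approximating the distribution of the connected component size under uniform node sampling can be sketched in base R as follows. The toy ring graph, the helper largest_cc_size, the number of sampled nodes and the empirical p-value formula are assumptions chosen for illustration; they are not taken from the FELLA code.

```r
## Hypothetical helper: size of the largest connected component of the
## subgraph induced by `nodes`, found by breadth-first search
largest_cc_size <- function(A, nodes) {
    unvisited <- nodes
    best <- 0
    while (length(unvisited) > 0) {
        queue <- unvisited[1]
        unvisited <- unvisited[-1]
        size <- 0
        while (length(queue) > 0) {
            current <- queue[1]
            queue <- queue[-1]
            size <- size + 1
            ## Sampled neighbours that have not been reached yet
            nb <- unvisited[A[current, unvisited] > 0]
            queue <- c(queue, nb)
            unvisited <- setdiff(unvisited, nb)
        }
        best <- max(best, size)
    }
    best
}

## Toy graph: a ring with 30 nodes (adjacency matrix)
n <- 30
A <- matrix(0, n, n)
idx <- cbind(seq_len(n), c(2:n, 1))
A[idx] <- 1
A[idx[, 2:1]] <- 1

## Null distribution of the largest component size when k nodes are
## drawn uniformly at random, approximated with niter trials
set.seed(1)
niter <- 100
k <- 10
null.sizes <- replicate(niter, largest_cc_size(A, sample(n, k)))

## Empirical p-value for an observed component of 6 nodes
observed <- 6
p.value <- (sum(null.sizes >= observed) + 1) / (niter + 1)
```

A larger number of trials gives a smoother estimate of the null distribution at the cost of a longer computation, which is the trade-off that a parameter like niter controls.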
Regarding the destination, the user can specify
the name of the directory.
Otherwise a name containing the creation date, the organism
and the KEGG release will be used.
The database can be stored within the library path or in a
custom location.
Function loadKEGGdata returns a FELLA.DATA object from any of the
databases generated by buildDataFromGraph.
This object is the starting point of any enrichment using FELLA.
In case the user built the matrices for "diffusion" and "pagerank",
they can choose to load them.
Further detail on the methods can be found in [Picart-Armada, 2017].
The matrices allow a faster computation and the definition
of a custom background, but use up to 250MB of memory each.
buildGraphFromKEGGREST returns the curated KEGG graph (class igraph).
buildDataFromGraph returns invisible(TRUE) if successful.
As a side effect, the database directory (see databaseDir) is created, containing the internal data.
loadKEGGdata returns the FELLA.DATA object that contains the KEGG knowledge representation.
Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y., & Morishima, K. (2017). KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic acids research, 45(D1), D353-D361.
Karnovsky, A., Weymouth, T., Hull, T., Tarcea, V. G., Scardoni, G., Laudanna, C., ... & Athey, B. (2011). Metscape 2 bioinformatics tool for the analysis and visualization of metabolomics and gene expression data. Bioinformatics, 28(3), 373-380.
Tenenbaum, D. (2013). KEGGREST: Client-side REST access to KEGG. R package version, 1(1).
Chang, W., Cheng, J., Allaire, JJ., Xie, Y., & McPherson, J. (2017). shiny: Web Application Framework for R. R package version 1.0.5. https://CRAN.R-project.org/package=shiny
Picart-Armada, S., Fernandez-Albert, F., Vinaixa, M., Rodriguez, M. A., Aivio, S., Stracker, T. H., Yanes, O., & Perera-Lluna, A. (2017). Null diffusion-based enrichment for metabolomics data. PLOS ONE, 12(12), e0189012.
class FELLA.DATA
## Toy example
## In this case, the graph is not built from current KEGG.
## It is loaded from sample data in FELLA
data("FELLA.sample")
## Graph to build the database (this example is a bit hacky)
g.sample <- FELLA:::getGraph(FELLA.sample)
dir.tmp <- paste0(tempdir(), "/", paste(sample(letters), collapse = ""))
## Build internal files in a temporary directory
buildDataFromGraph(
    keggdata.graph = g.sample,
    databaseDir = dir.tmp,
    internalDir = FALSE,
    matrices = NULL,
    normality = NULL,
    dampingFactor = 0.85,
    niter = 10)
## Load database
myFELLA.DATA <- loadKEGGdata(
    dir.tmp,
    internalDir = FALSE)
myFELLA.DATA
######################
## Not run:
## Full example
## First step: graph for Mus musculus discarding the mmu01100 pathway
## (an analog example can be built from human using organism = "hsa")
g.mmu <- buildGraphFromKEGGREST(
    organism = "mmu",
    filter.path = "mmu01100")
summary(g.mmu)
cat(comment(g.mmu))
## Second step: build internal files for this graph
## (consumes some time and memory, especially if we compute
## "diffusion" and "pagerank" matrices)
buildDataFromGraph(
    keggdata.graph = g.mmu,
    databaseDir = "example_db_mmu",
    internalDir = TRUE,
    matrices = c("hypergeom", "diffusion", "pagerank"),
    normality = c("diffusion", "pagerank"),
    dampingFactor = 0.85,
    niter = 1e3)
## Third step: load the internal files into a FELLA.DATA object
FELLA.DATA.mmu <- loadKEGGdata(
    "example_db_mmu",
    internalDir = TRUE,
    loadMatrix = c("diffusion", "pagerank"))
FELLA.DATA.mmu
## End(Not run)