This method applies a Hierarchical Dirichlet Process (HDP) algorithm to the collection of protein networks to infer the set of chromatin loop-maintainer proteins. HDPs are non-parametric Bayesian models widely used in document classification because they can model datasets as mixtures of classes. In our case, we suppose that different kinds of networks are involved in maintaining the different loops. Thus, to make an analogy with topic modeling, we treat each DNA-interaction maintaining protein network as a document and each edge in this network as a word. The task is then to determine which word (edge) belongs to which topic (chromatin-maintainer family). The implementation is based on the C++ code of Chong Wang and David Blei, adapted to Rcpp and with the dependency on the GNU Scientific Library removed.
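The document/word analogy above can be made concrete with a small base-R sketch. It is only an illustration under stated assumptions: the toy networks, the protein names, and the helper `edge_words` are hypothetical and are not part of the package, whose internal representation may differ.

```r
# Hypothetical toy data: each element is one network (a "document"),
# given as a two-column edge list of protein names.
networks <- list(
  net1 = data.frame(from = c("CTCF", "RAD21"), to = c("RAD21", "SMC3"),
                    stringsAsFactors = FALSE),
  net2 = data.frame(from = c("CTCF", "EP300"), to = c("RAD21", "MYC"),
                    stringsAsFactors = FALSE),
  net3 = data.frame(from = "EP300", to = "MYC", stringsAsFactors = FALSE)
)

# Hypothetical helper: turn each edge into an order-independent "word",
# e.g. the edge RAD21 -- CTCF becomes "CTCF_RAD21".
edge_words <- function(net) {
  paste(pmin(net$from, net$to), pmax(net$from, net$to), sep = "_")
}

docs  <- lapply(networks, edge_words)   # one bag of words per network
vocab <- sort(unique(unlist(docs)))     # all distinct edges

# Document-by-word count matrix: rows are networks, columns are edges.
counts <- t(sapply(docs, function(w) table(factor(w, levels = vocab))))
counts
```

A topic model such as HDP would then take a matrix like `counts` as input and assign each edge occurrence to a topic.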
## S4 method for signature 'NetworkCollection'
InferNetworks(object, thr = 0.5, max_iter = 500L, max_time = 3600L, ...)
object |
a |
thr |
Used to select the top protein interactions in each inferred chromatin-maintainer family.
In HDP, each topic (chromatin-maintainer family) is considered a distribution over words (edges),
thus, for each topic we consider the words that capture |
max_iter |
maximum number of iterations (default 500). |
max_time |
maximum running time in seconds (default 3600). |
... |
Additional parameters that control the behaviour of the HDP model. The possible parameters are eta, alpha and gamma. eta controls how edges are assigned to chromatin maintainer networks (CMNs) at the global level: smaller eta values lead to sparse edge-to-CMN assignments, while eta > 1 leads to more uniform assignments. gamma controls the number of CMNs: smaller gamma values produce a small number of CMNs, while gamma > 1 favors the generation of more. alpha controls the sparsity at the local PPI level: smaller alpha values force edges to be controlled by a small number of CMNs, while larger values lead to a more uniform distribution. By default eta = 0.01, gamma = 1 and alpha = 1. |
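The effect of a small versus a large concentration value, as described for eta above, can be illustrated with a base-R sketch. It samples from a plain symmetric Dirichlet distribution and does not call the package. The helper `rdirichlet_sym`, the vocabulary size and the sample count are assumptions made only for this illustration.

```r
set.seed(1)  # reproducibility of this illustration only

# Hypothetical helper: draw n samples from a symmetric Dirichlet with
# concentration `conc` over k categories, by normalising Gamma variates.
rdirichlet_sym <- function(n, k, conc) {
  g <- matrix(rgamma(n * k, shape = conc), nrow = n, ncol = k)
  g / rowSums(g)
}

k <- 20  # assumed number of distinct edges
sparse  <- rdirichlet_sym(200, k, conc = 0.01)  # same value as the default eta
uniform <- rdirichlet_sym(200, k, conc = 10)    # a value well above 1

# Average share of probability mass held by the single most probable edge.
mean_top_sparse  <- mean(apply(sparse, 1, max))
mean_top_uniform <- mean(apply(uniform, 1, max))
c(sparse = mean_top_sparse, uniform = mean_top_uniform)
```

With a small concentration most of the mass falls on a few edges, whereas a large concentration spreads it almost evenly. This matches the sparse versus uniform behaviour described for eta.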
Returns a ChromMaintainers object that contains the list of inferred networks and the probability of each edge in each network.
Mohamed Nadhir Djekidel (nde12@mails.tsinghua.edu.cn)
https://www.cs.princeton.edu/~blei/topicmodeling.html (C. Wang's hdp code)
Chong Wang, John Paisley and David M. Blei. Online variational inference for the hierarchical Dirichlet process. In AISTATS, 2011.
Mohamed Nadhir D, Yang C et al 3CPET: Finding Co-factor Complexes in Chia-PET experiment using a Hierarchical Dirichlet Process, ....
NetworkCollection, ChromMaintainers
## get the paths of the example datasets
petFile <- file.path(system.file("example", package = "R3CPET"), "HepG2_interactions.txt")
tfbsFile <- file.path(system.file("example", package = "R3CPET"), "HepG2_TF.txt.gz")
## Not run:
x <- ChiapetExperimentData(pet = petFile, tfbs = tfbsFile, IsBed = FALSE, ppiType = "HPRD", filter = TRUE)
## build the different indexes
x <- createIndexes(x)
## build the networks connecting each pair of interacting regions
nets <- buildNetworks(x)
## infer the networks
hlda <- InferNetworks(nets)
hlda
## End(Not run)