This function takes the eigengenes of all modules as input and learns a Bayesian network using the bnlearn package. It builds several individual networks from random starting networks by optimizing their scores. Then, it infers a consensus network from the individual networks with relatively "higher" scores. The default hyper-parameters and arguments should be fine for most applications.
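The general idea of learning several networks from random starts and combining the best-scoring ones can be sketched directly with bnlearn. This is only an illustration under stated assumptions, not the internals of learn.bn: it uses the learning.test data set shipped with bnlearn instead of eigengenes, keeps the top third of the networks (echoing the default consensusRatio = 1/3, although how learn.bn applies that ratio is assumed here), and uses an arbitrary strength threshold of 0.5.

```r
library(bnlearn)

data(learning.test)            # small discrete data set shipped with bnlearn
set.seed(1)
nodes <- names(learning.test)

# Ten random starting DAGs over the same nodes as the data.
starts <- random.graph(nodes = nodes, num = 10)

# Hill climbing from each start, optimizing the BDe score.
nets <- lapply(starts, function(s) hc(learning.test, start = s, score = "bde"))
scores <- vapply(nets,
                 function(b) score(b, data = learning.test, type = "bde"),
                 numeric(1))

# Keep the best-scoring third of the networks (an assumption of this sketch).
keep <- order(scores, decreasing = TRUE)[seq_len(ceiling(length(nets) / 3))]

# Arc frequencies across the kept networks, then a thresholded consensus.
strength <- custom.strength(nets[keep], nodes = nodes, cpdag = FALSE)
consensus <- averaged.network(strength, threshold = 0.5)
arcs(consensus)
```

Starting from different random networks helps hill climbing escape local optima, and averaging over the better-scoring results keeps only arcs that appear consistently.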
learn.bn(pigengene=NULL, Data=NULL, Labels=NULL, bnPath = "bn", bnNum = 100,
consensusRatio = 1/3, consensusThresh = "Auto", doME0 = FALSE,
selectedFeatures = NULL, trainingCases = "All", algo = "hc", scoring = "bde",
restart = 0, pertFrac = 0.1, doShuffle = TRUE, use.Hartemink = TRUE,
bnStartFile = "None", use.Disease = TRUE, use.Effect = FALSE, dummies = NULL,
tasks = "All", onCluster = !(which.cluster()$cluster == "local"),
inds = 1:ceiling(bnNum/perJob), perJob = 2, maxSeconds = 5 * 60,
timeJob = "00:10:00", bnCalculationJob = NULL, seed = NULL, verbose = 0,
naTolerance=0.05)
pigengene |
An object from |
Data |
A matrix or data frame containing the training data with eigengenes corresponding to columns and rows corresponding to samples. Rows and columns must be named. |
Labels |
A (preferably named) vector containing the Labels (condition types) for
the training data. Names must agree with rows of |
bnPath |
The path to save the results |
bnNum |
The total number of individual networks. In practice, the
number of learnt networks can be less than |
consensusRatio |
A numeric in the range |
consensusThresh |
A vector of thresholds in the range |
doME0 |
If |
selectedFeatures |
A character vector. If not |
trainingCases |
A character vector that determines which cases (samples) should be considered for learning the network. |
algo |
The algorithm that bnlearn uses for optimizing the score. The
default is "hc" (hill climbing).
See |
scoring |
A character determining the scoring criteria. Use 'bde' and 'bic' for
the Bayesian Dirichlet equivalent and Bayesian Information Criterion scores,
respectively. See |
restart |
The number of random restarts. For technical use only.
See |
pertFrac |
A numeric in the range |
doShuffle |
The ordering of the features (eigengenes) is important in
making the initial network. If |
use.Hartemink |
If |
bnStartFile |
Optionally, learning can start from a Bayesian network instead of a random network.
|
use.Disease |
If |
use.Effect |
If |
dummies |
A vector of numeric values in the range |
tasks |
A character vector and a subset of |
onCluster |
A Boolean variable that is |
inds |
The indices of the jobs that are included in the analysis. |
perJob |
The number of individual networks that are learnt by one job. |
maxSeconds |
An integer limiting computation time for each training job that runs locally,
i.e., when |
timeJob |
The time in |
bnCalculationJob |
A script used to submit jobs to the cluster. Set to |
seed |
The random seed that can be set to an integer to reproduce the same results. |
verbose |
Integer level of verbosity. 0 means silent and higher values produce more details of computation. |
naTolerance |
Upper threshold on the fraction of entries per gene that
can be missing. Genes with a larger fraction of missing
entries are ignored. For genes with smaller fraction of NA
entries, the missing values are imputed from their average
expression in the other samples.
See |
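The naTolerance rule described above can be illustrated in base R. This is a minimal sketch, not the package's own code: the toy matrix expr, with samples in rows and genes in columns, and the variable names are hypothetical.

```r
na_tolerance <- 0.05

set.seed(1)
# Hypothetical toy data: 40 samples (rows) by 3 genes (columns).
expr <- matrix(rnorm(40 * 3), nrow = 40,
               dimnames = list(paste0("s", 1:40), c("g1", "g2", "g3")))
expr[1, "g1"] <- NA      # 1/40 = 2.5% missing: kept and imputed
expr[1:4, "g2"] <- NA    # 4/40 = 10% missing: ignored

# Fraction of missing entries per gene.
na_frac <- colMeans(is.na(expr))

# Drop genes whose missing fraction exceeds the tolerance.
kept <- expr[, na_frac <= na_tolerance, drop = FALSE]

# Impute the remaining NAs with the gene's mean over the other samples.
for (g in colnames(kept)) {
  miss <- is.na(kept[, g])
  kept[miss, g] <- mean(kept[, g], na.rm = TRUE)
}

colnames(kept)
```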
For learning a Bayesian network with tens of nodes (eigengenes), bnNum=1000 or higher is recommended. Increasing consensusThresh generally results in a network with fewer arcs. Nagarajan et al. proposed a fundamental approach that determines this hyper-parameter based on the background noise. They use non-parametric bootstrapping, which is not yet implemented in the current package.
The default values for the rest of the hyper-parameters should be fine for most applications.
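The effect of the threshold on the number of arcs can be explored outside Pigengene with bnlearn's own bootstrap tools. The sketch below is an assumption-laden illustration: it uses bnlearn's boot.strength() on the learning.test data set, the thresholds 0.3 and 0.9 are arbitrary, and it is not claimed to reproduce the procedure of Nagarajan et al.

```r
library(bnlearn)

data(learning.test)   # small discrete data set shipped with bnlearn
set.seed(1)

# Arc strengths estimated from 50 non-parametric bootstrap samples,
# each learned by hill climbing with the BDe score.
st <- boot.strength(learning.test, R = 50, algorithm = "hc",
                    algorithm.args = list(score = "bde"))

# Consensus networks at a permissive and a strict threshold.
n_low  <- narcs(averaged.network(st, threshold = 0.3))
n_high <- narcs(averaged.network(st, threshold = 0.9))
c(low = n_low, high = n_high)

# bnlearn also stores a data-driven threshold estimate with the strengths.
attr(st, "threshold")
```

A stricter threshold keeps only arcs supported by most bootstrap replicates, so the resulting network cannot have more arcs than the permissive one unless arcs are dropped to avoid cycles.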
A list of:
consensusThresh |
The vector of thresholds as described in the arguments. |
indvPath |
The path where the individual networks were saved. |
moduleFile |
The file containing data in appropriate format for bnlearn package and the blacklist arcs. |
scoreFile |
The file containing the record of the successful jobs and the scores of the corresponding individual networks. |
consensusFile |
The file containing the consensus network and its BDe and BIC scores. |
bnModuleRes |
The result of |
runs |
A list containing the record of successful jobs. |
scores |
The list saved in |
consensusThreshRes |
The full output of |
consensus1 |
The consensus Bayesian network corresponding to the first threshold.
It is the output of |
scorePlot |
The output of |
graphs |
The output of |
timeTaken |
An object of |
use.Disease, use.Effect, use.Hartemink |
Some of the input arguments. |
Running the jobs on a cluster requires a bnCalculationJob
script, which is not yet included in the package.
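When planning how the work is split, the number of jobs follows from the default inds = 1:ceiling(bnNum/perJob) shown in the usage. The short base R calculation below only reproduces that arithmetic; the second case, with an odd bnNum, is a hypothetical value chosen to show the rounding.

```r
# Defaults from the usage: 100 networks, 2 networks per job.
bnNum <- 100
perJob <- 2
inds <- 1:ceiling(bnNum / perJob)
length(inds)   # 50 jobs

# Hypothetical odd bnNum: the last job covers the single leftover network.
inds_odd <- 1:ceiling(101 / perJob)
length(inds_odd)   # 51 jobs
```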
Amir Foroushani, Habil Zare, and Rupesh Agrahari
Hartemink A (2001). Principled Computational Methods for the Validation and Discovery of Genetic Regulatory Networks. Ph.D. thesis, School of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
Nagarajan R, et al. (2010). Functional relationships between genes associated with differentiation potential of aged myogenic progenitors. Frontiers in Physiology, 1.
bnlearn-package, Pigengene-package, compute.pigengene
library(Pigengene)
data(eigengenes33)
ms <- 10:20 ## A subset of modules for quick demonstration
amlE <- eigengenes33$aml[,ms]
mdsE <- eigengenes33$mds[,ms]
eigengenes <- rbind(amlE,mdsE)
Labels <- c(rep("AML",nrow(amlE)),rep("MDS",nrow(mdsE)))
names(Labels) <- rownames(eigengenes)
learnt <- learn.bn(Data=eigengenes, Labels=Labels,
bnPath="bnExample", bnNum=10, seed=1)
bn <- learnt$consensus1$BN
## Visualize:
d1 <- draw.bn(BN=bn,nodeFontSize=14)
## What are the children of the Disease node?
childrenD <- bnlearn::children(x=bn, node="Disease")
print(childrenD)
## Fit the parameters of the Bayesian network:
fit <- bnlearn::bn.fit(x=bn, data=learnt$consensus1$Data, method="bayes",iss=10)
## The conditional probability table for a child of the Disease node:
fit[[childrenD[1]]]
## The fitted Bayesian network can be used for predicting the labels
## (i.e., values of the Disease node).
l2 <- predict(object=fit, node="Disease", data=learnt$consensus1$Data, method="bayes-lw")
table(Labels, l2)
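A confusion table like the one printed above can be summarized as an accuracy. The sketch below uses hypothetical toy vectors truth and pred rather than the output of learn.bn, so that it runs on its own.

```r
# Hypothetical true and predicted labels for five samples.
truth <- c("AML", "AML", "AML", "MDS", "MDS")
pred <- factor(c("AML", "AML", "MDS", "MDS", "MDS"), levels = c("AML", "MDS"))

# Rows are true labels, columns are predictions.
confusion <- table(truth, pred)

# Fraction of samples on the diagonal, i.e. correctly predicted.
accuracy <- sum(diag(confusion)) / sum(confusion)
accuracy   # 4 of 5 correct: 0.8
```

Note that the example above predicts on the same data used for learning, so an accuracy computed that way is likely optimistic compared with performance on held-out samples.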