Main function for DDEPN modelling. Takes a data matrix containing longitudinal measurements as argument and infers a network structure underlying the data using either a genetic algorithm or MCMC sampling.
ddepn(dat, phiorig=NULL, phi=NULL, th=0.8, inference="netga",
      outfile=NULL, multicores=FALSE, maxiterations=1000,
      p=500, q=0.3, m=0.8, P=NULL,
      usebics=TRUE, cores=1, priortype="laplaceinhib",
      lambda=NULL, B=NULL, samplelambda=NULL,
      hmmiterations=100, fanin=4,
      gam=NULL, it=NULL, K=NULL, quantL=.5, quantBIC=.5,
      debug=0, burnin=500, thin=FALSE, plotresults=TRUE,
      always_sample_sf=FALSE, scale_lik=FALSE, allow.stim.off=FALSE,
      implementation="C")
resume_ddepn(ret, maxiterations=10000, outfile=NULL, th=0.8, plotresults=TRUE,
             debug=0, cores=NULL, implementation="C", thin=FALSE)
dat |
Matrix of double values. The data matrix to be used. Contains antibody measurements in the rows and experiments (T timepoints in each R replicates) in the columns. Each experiment is labeled by the respective perturbation in the column name. See section Details for an example. |
phiorig |
Adjacency matrix. Reference network used for comparison to the inferred net. Entries can be either 0, 1 or 2, for no edge, activation or inhibition, respectively. NULL if no reference network is given. |
phi |
Adjacency matrix. Seed network to start the search. Entries can be either 0, 1 or 2, for no edge, activation or inhibition, respectively. NULL if no start network should be given, but initialised automatically. |
th |
Threshold for inclusion of an edge in the final network
(for |
inference |
String. Giving the type of network search. |
outfile |
String. Output path for plotting. NULL if plotting should be done to the display. |
multicores |
Boolean. TRUE for using multiple cores and
parallelise the network reconstruction. In case of |
maxiterations |
Integer, Maximum number of generations in
|
p |
Integer, number of individuals in the population in
|
q |
Double in [0, 1], selection (1-q) and crossover (q)
rate in |
m |
Double in [0, 1], mutation rate in |
P |
List containing an initial population of networks
for |
usebics |
Use BIC statistic for model selection (only for
|
cores |
Number of cores to use in case of |
hmmiterations |
Integer. Maximum number of iterations in the HMM search. |
lambda |
NULL, Numeric or NA. The prior influence hyperparameter for the Laplace prior. If
numeric, used as fixed prior strength or as starting value for prior strength sampling
(when |
B |
The Prior information matrix. See |
fanin |
Integer: maximal indegree for each node. |
gam |
Prior influence strength for scalefree prior. Also used as exponent
in |
it |
Number of iterations to generate the background distribution for scalefree prior. |
K |
Proportionality factor for scalefree prior. |
quantL |
Quantile of Population Likelihood/Posterior, used as
selection threshold in |
quantBIC |
Quantile of Population BIC, used as selection
threshold in |
samplelambda |
Numeric or NULL. If NULL, the Laplace hyperparameter |
debug |
Numeric. If 0, a status bar indicates the progress of the algorithm. If 1 or 2,
extra information is printed to the console (for |
burnin |
Integer. Specifies the number of iterations used as
burnin phase for |
priortype |
Character. One of |
thin |
Boolean. If TRUE, makes sure that the MCMC return objects are shortened to at most 10000 iterations. Defaults to FALSE. |
plotresults |
Boolean. If TRUE, the resulting network(s) and in case of MCMC sampling, the score traces are plotted. |
always_sample_sf |
Boolean. Update scaling factor in inhibMCMC sampling through the whole sampling if TRUE. Keep scaling factor fixed after burn-in if FALSE. |
scale_lik |
Boolean. Perform scaling of the likelihood according to how many data points were used to calculate the overall likelihood. |
allow.stim.off |
Boolean. If TRUE, the stimulus can become passive at some time. This will generate additional reachable system states, in particular all states from the normal state matrix, generated by the propagation, but with the stimulus node set to 0. |
ret |
List. The output generated during an |
implementation |
String. One of |
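The 0/1/2 encoding used by phi and phiorig can be illustrated with a small hand-built matrix. This is only a sketch: the node names are made up, and the orientation (rows as source nodes, columns as target nodes) is an assumption that should be checked against the vignette.

```r
# Hypothetical three-node seed network; node names are illustrative only.
nodes <- c("EGF", "MEK", "ERK")
phi <- matrix(0, nrow = 3, ncol = 3, dimnames = list(nodes, nodes))

# Assumption: rows are source nodes and columns are target nodes.
phi["EGF", "MEK"] <- 1  # EGF activates MEK
phi["MEK", "ERK"] <- 1  # MEK activates ERK
phi["ERK", "MEK"] <- 2  # ERK inhibits MEK

# Only the documented codes 0 (no edge), 1 (activation), 2 (inhibition) occur.
stopifnot(all(phi %in% c(0, 1, 2)))

# In-degree per node under the row-to-column assumption, compared to fanin.
fanin <- 4
indegree <- colSums(phi != 0)
stopifnot(all(indegree <= fanin))
```

A matrix built this way could be passed as the phi seed network or as the phiorig reference network.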
Data matrix. Rows correspond to measured proteins, genes, etc. Columns contain all experiments, i.e. separate perturbations. Each experiment i consists of T_i time points, and each time point is assumed to be measured in R_i replicates. The time is indicated as a numeric value, separated by an underscore in the column name.
Example:
 | EGF_1 | EGF_1 | EGF_2 | EGF_2 | EGF&X_1 | EGF&X_2 | EGF&X_2 | EGF&X_2 |
EGF | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
X | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
AKT | 1.45 | 1.8 | 0.99 | 1.6 | 1.78 | 1.8 | 1.56 | 1.58 |
ERK | 1.33 | 1.7 | 1.57 | 1.3 | 0.68 | 0.34 | 0.62 | 0.47 |
MEK | 0.45 | 0.8 | 0.99 | 0.6 | 0.78 | 0.8 | 0.56 | 0.58 |
For example, EGF_1 means EGF treatment at time 1, EGF&X_2 means simultaneous treatment with EGF and X at time 2, etc. The function addstimuli can be used to add the rows for the treatments to the data matrix automatically if they are not present. Unequal numbers of time points and replicates are allowed for each experiment. See the vignette for more details on the format of the data matrix.
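As a rough illustration of the column naming scheme described above, the example matrix can be built in base R and its labels split into perturbation and time point. The variable names are chosen for this sketch only and are not part of the package.

```r
# Column labels follow "<perturbation>_<time>"; replicates repeat the label.
labels <- c("EGF_1", "EGF_1", "EGF_2", "EGF_2",
            "EGF&X_1", "EGF&X_2", "EGF&X_2", "EGF&X_2")
dat <- rbind(
  EGF = rep(0, 8),
  X   = rep(0, 8),
  AKT = c(1.45, 1.8, 0.99, 1.6, 1.78, 1.8, 1.56, 1.58),
  ERK = c(1.33, 1.7, 1.57, 1.3, 0.68, 0.34, 0.62, 0.47),
  MEK = c(0.45, 0.8, 0.99, 0.6, 0.78, 0.8, 0.56, 0.58)
)
colnames(dat) <- labels

# Split each label at the last underscore into perturbation and time.
perturbation <- sub("_[^_]*$", "", colnames(dat))
timepoint <- as.numeric(sub("^.*_", "", colnames(dat)))

# Number of replicates per experiment and time point.
replicates <- table(perturbation, timepoint)
print(replicates)
```

The resulting table makes unequal replicate counts visible at a glance before the matrix is passed to ddepn.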
Several implementations are provided, differing in the way the Gaussian parameters are estimated in the HMM. The "R" and "C" implementations derive a separate optimal state matrix for each experiment; the state matrices are then concatenated to estimate the Gaussians. An alternative, experimental implementation, "R_globalest", derives a single state matrix for all experiments in the HMM. With separate derivation, the Gaussians for each experiment can differ considerably, leading to inhomogeneous parameter estimates with large variances. Using a single HMM for all experiments overcomes this problem, since the states are chosen with respect to all experiments. However, deriving the combined state matrix increases the number of possible system states to be considered in the Viterbi algorithm, which slows down the HMM. The default, "C", offers a reasonable trade-off between quality and speed.
For netga, a list containing the following elements:
dat |
Double matrix. The data matrix. |
phi.activation.count |
Integer. Counts how often an edge is an activation in the population. |
phi.inhibition.count |
Integer. Counts how often an edge is an inhibition in the population. |
phi.orig |
Adjacency matrix. The reference network, if it was provided. |
phi |
Adjacency matrix. The inferred network |
weights |
Matrix. Each entry is the maximum of the corresponding conf.act and conf.inh entries, i.e. it describes the support for an edge in the final network. |
weights.tc |
Matrix. Similar to weights, but calculated ignoring the types of the edges. |
stats |
Matrix. Contains result statistics for each network in the
population: TP, FP, TN, FN, Sensitivity(SN), Specificity(SP), precision, F1.
Only present if a reference network |
conf.act |
Matrix. Calculated as phi.activation.count/p |
conf.inh |
Matrix. Calculated as phi.inhibition.count/p |
stimuli |
List. The list of the input stimuli in format |
P |
List. The population of networks that was inferred, i.e. the
return list of |
scorestats |
Matrix. Contains traces of the scores during the genetic
algorithm. See |
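The relationship between the count matrices, conf.act, conf.inh and weights listed above can be sketched with made-up numbers. The counts and node names are hypothetical, and the final thresholding step is an assumption about how th might be applied, not the package's actual rule.

```r
# Hypothetical edge counts over a population of p = 10 networks.
p <- 10
nodes <- c("A", "B", "C")
phi.activation.count <- matrix(c(0, 9, 1,
                                 0, 0, 3,
                                 0, 0, 0),
                               nrow = 3, byrow = TRUE,
                               dimnames = list(nodes, nodes))
phi.inhibition.count <- matrix(c(0, 0, 2,
                                 0, 0, 6,
                                 9, 0, 0),
                               nrow = 3, byrow = TRUE,
                               dimnames = list(nodes, nodes))

# Confidence values as documented: counts divided by the population size.
conf.act <- phi.activation.count / p
conf.inh <- phi.inhibition.count / p

# weights: element-wise maximum of the two confidence matrices.
weights <- pmax(conf.act, conf.inh)

# Assumption (not taken from the package source): keep an edge when its
# confidence exceeds th, typed by whichever confidence is larger.
th <- 0.8
phi <- matrix(0, 3, 3, dimnames = list(nodes, nodes))
phi[conf.act > th & conf.act >= conf.inh] <- 1
phi[conf.inh > th & conf.inh > conf.act] <- 2
```

With these numbers only the two edges supported by nine of the ten networks survive the threshold.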
For mcmc, a list containing two elements:
samplings |
List. Contains all sampling runs. Each sampling run itself
is a list as obtained via |
ltraces |
Matrix. Contains the posterior traces, each trace stored in one column of the matrix. |
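A column-per-run trace matrix like ltraces can be summarised with base R. The sketch below uses simulated values as a stand-in for a real return object, and it assumes that the traces still include the burn-in iterations, which should be verified for an actual run.

```r
set.seed(1)
# Hypothetical stand-in for ret$ltraces: 300 iterations, two sampling runs.
ltraces <- cbind(run1 = cumsum(rnorm(300)) - 500,
                 run2 = cumsum(rnorm(300)) - 500)

# Assumption: the first `burnin` rows belong to the burn-in phase.
burnin <- 100
post <- ltraces[-seq_len(burnin), , drop = FALSE]

# Mean posterior score per run after discarding the burn-in.
trace_means <- colMeans(post)
```

Comparing such per-run summaries is one simple way to spot runs that have not settled in the same region.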
TODO
Christian Bender
DDEPN
Bender et al. 2010: Dynamic deterministic effects propagation networks: learning
signalling pathways from longitudinal protein array data; Bioinformatics,
Vol. 26(18), pp. i596-i602
Laplace prior
Bender, C. 2011: Systematic analysis of time resolved high-throughput data using stochastic network inference methods;
PhD Thesis, University of Heidelberg, Combined Faculties for the Natural Sciences and for Mathematics, 2011
Froehlich et al. 2007: Large scale statistical inference of signaling pathways from RNAi
and microarray data; BMC Bioinformatics, Vol. 8(11), pp. 386ff
Scale free prior
Kamimura and Shimodaira, A Scale-free Prior over Graph Structures for Bayesian
Inference of Gene Networks
TODO
## Not run:
## load package
library(ddepn)
## sample a network
n <- 6
signet <- signalnetwork(n=n, nstim=2, cstim=0, prop.inh=0.2)
phit <- signet$phi
stimuli <- signet$stimuli
## sample data
dataset <- makedata(phit, stimuli, mu.bg=1200, sd.bg=400,
mu.signal.a=2000, sd.signal.a=1000)
## use original network as prior matrix
## reset all entries for inhibiting edges
## to -1
B <- phit
B[B==2] <- -1
## Genetic algorithm, no prior
ret1 <- ddepn(dataset$datx, phiorig=phit, inference="netga",
maxiterations=30, p=15, q=0.3, m=0.8,
usebics=TRUE)
x11()
plotdetailed(ret1$phi,stimuli=ret1$stimuli)
## mcmc, laplaceinhib prior
ret2 <- ddepn(dataset$datx,phiorig=phit, inference="mcmc",
maxiterations=300, burnin=100,
usebics=FALSE, lambda=0.01, B=B, gam=1,
priortype="laplaceinhib")
x11()
plotdetailed(ret2$samplings[[1]]$phi,stimuli=ret2$samplings[[1]]$stimuli)
## use mcmc with multiple cores, i.e. perform two independent runs
## requires package parallel and, of course, multiple cores in the hardware
## use the original net as prior
if(require(parallel)) {
ret3 <- ddepn(dataset$datx,phiorig=phit, inference="mcmc",
multicores=TRUE, cores=2,
maxiterations=300, burnin=100,
usebics=FALSE, lambda=0.01, B=B, gam=1,
priortype="laplaceinhib")
}
## resume the inference from an inhibMCMC run and add another 100 iterations
ret4 <- ddepn(dataset$datx,phiorig=phit, inference="mcmc",
maxiterations=100, burnin=30, lambda=0.01, B=B,
priortype="laplaceinhib", usebics=FALSE)
ret4 <- resume_ddepn(ret4,maxiterations=100)
## resume the inference from a netga run and add another 30 iterations
ret5 <- ddepn(dataset$datx,phiorig=phit, inference="netga",
maxiterations=20, p=10, q=0.3, m=0.8, lambda=0.01, B=B,
priortype="laplaceinhib", usebics=FALSE)
ret5 <- resume_ddepn(ret5,maxiterations=30)
## End(Not run)