dynUGENE (dynamical Uncertainty-aware GEne Network Inference with Ensembles of trees) is an extension of dynGENIE3. Given time series or steady state data on a set of species (mRNA/protein/ion concentrations), dynUGENE provides tools for the visualization, analysis, and simulation of the results of the dynGENIE3 method.
Suppose you have access to a time series dataset of the expression levels of $p$ genes in a gene regulatory network. The expression level of each gene is measured at $N$ time points. Thus, the dataset can be organized as

$$\mathcal{D} = \{\mathbf{x}(t_1), \cdots, \mathbf{x}(t_N)\}$$

where each $\mathbf{x}(t_k) \in \mathbb{R}^p$, $k=1, \cdots, N$, is a column vector storing the expression levels of all $p$ genes at time step $t_k$.

dynGENIE3 fits a random forest $f_j$ for each gene $x_j$ according to the following model of gene expression:

$$\frac{dx_j(t)}{dt} = -\alpha_j x_j(t) + f_j(\mathbf{x}(t)),\; \forall\; j$$

where $\alpha_j$ is the degradation rate of gene $x_j$. This parameter can be preset by the user, or it can be fitted from the data given the maximum and minimum of the expression levels, assuming exponential decay between these two points.

In practice, the dataset is made of discrete time steps. To accommodate this, the learning sample for training the random forest for gene $x_j$ becomes instead

$$\Bigg\{ \mathbf{x}(t_k), \; \frac{x_j(t_{k+1}) - x_j(t_k)}{t_{k+1} - t_k} + \alpha_j x_j (t_k) \Bigg\}_{k=1,\;\cdots,\;N-1}$$

Intuitively, we ask the random forest to predict the rate of change in the concentration of gene $x_j$ plus its degradation term, given the current concentrations of all genes.

Ranking the connectivity of the gene regulatory network is posed as $p$ different feature selection problems, where the importance of a directed edge from gene $x_j$ to gene $x_i$ is determined by the importance of gene $x_j$ as a feature for the random forest $f_i$. dynGENIE3 uses the Mean Decrease Impurity measure, which quantifies how much a split on feature $x_j$ reduces the variance of the samples at a node:

$$I(n) = |S|\cdot\text{Var}(S) - |S_T|\cdot\text{Var}(S_T) - |S_F|\cdot\text{Var}(S_F)$$

where $S$ is the set of samples that reach node $n$, $S_T$ is the subset of samples for which the split condition is true, and $S_F$ is the subset for which it is false.
If a feature is used as a split at multiple nodes, its importance is the average of all individual importance scores.
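The impurity decrease $I(n)$ for a single node can be sketched in a few lines of base R. This is only an illustration of the formula, not the code used inside dynGENIE3 or dynUGENE: the helper names `pop_var` and `split_importance` are hypothetical, and the sketch assumes that $\text{Var}$ denotes the population variance (dividing by $|S|$), so that $|S|\cdot\text{Var}(S)$ is the sum of squared deviations from the mean.

```r
# Population variance (divide by the number of samples, not n - 1).
# Assumption: this is the Var() meant in the formula for I(n).
pop_var <- function(y) {
  if (length(y) == 0) return(0)
  mean((y - mean(y))^2)
}

# I(n) = |S| Var(S) - |S_T| Var(S_T) - |S_F| Var(S_F)
# y:         target values of the samples reaching the node
# goes_true: logical vector, TRUE where the split condition holds
split_importance <- function(y, goes_true) {
  y_true  <- y[goes_true]
  y_false <- y[!goes_true]
  length(y) * pop_var(y) -
    length(y_true) * pop_var(y_true) -
    length(y_false) * pop_var(y_false)
}

# Toy node (hypothetical data): six samples, split on x < 0.5.
x <- c(0.1, 0.2, 0.3, 0.7, 0.8, 0.9)
y <- c(1, 1, 1, 3, 3, 3)
imp <- split_importance(y, x < 0.5)
print(imp)
```

In this toy node the split separates the two groups of target values perfectly, so both children have zero variance and the importance equals the full $|S|\cdot\text{Var}(S)$ of the parent.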
At steady state, $\frac{dx_j(t)}{dt} = 0 \;\; \forall j$. Thus, the learning sample becomes $$\Big\{ \mathbf{x}(t_k), \; \alpha_j x_j (t_k) \Big\}_{k=1,\;\cdots,\;N-1}$$ All other parts of the dynGENIE3 algorithm remain the same.
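To make the two learning samples concrete, here is a minimal base R sketch that builds the inputs and targets for one gene from a small made-up time series. The function names `make_learning_sample` and `steady_state_target`, the toy data, and the value of `alpha` are all assumptions for illustration; the package's own implementation may organize this differently.

```r
# Time series case: inputs are x(t_k), the target is the finite-difference
# derivative of gene j plus its degradation term alpha * x_j(t_k).
# expr:  N x p matrix, row k holds the expression levels at times[k]
# times: numeric vector of length N
make_learning_sample <- function(expr, times, j, alpha) {
  N  <- nrow(expr)
  dt <- diff(times)          # t_{k+1} - t_k
  dx <- diff(expr[, j])      # x_j(t_{k+1}) - x_j(t_k)
  list(
    inputs = expr[-N, , drop = FALSE],       # x(t_1), ..., x(t_{N-1})
    target = dx / dt + alpha * expr[-N, j]
  )
}

# Steady state case: the derivative is zero, so only the degradation
# term alpha * x_j is left as the target.
steady_state_target <- function(expr, j, alpha) {
  alpha * expr[, j]
}

# Hypothetical data: 4 time points (unevenly spaced), 2 genes.
times <- c(0, 1, 2, 4)
expr  <- cbind(g1 = c(1, 2, 4, 8), g2 = c(5, 4, 3, 1))

ls1 <- make_learning_sample(expr, times, j = 1, alpha = 0.5)
print(ls1$target)
```

Note that the last time point contributes no row to the time series learning sample, because its forward difference is undefined.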
dynUGENE provides several additional functionalities on top of dynGENIE3.
Four datasets are provided: the repressilator and Hodgkin-Huxley systems, and their stochastic counterparts. See ?Repressilator and ?HodgkinHuxley for the deterministic data, and ?StochasticRepressilator and ?StochasticHodgkinHuxley for the stochastic data. Note that the noise used to create the stochastic counterparts is not simply Gaussian noise added to the deterministic trajectories; the stochastic datasets are solutions to stochastic differential equations with diagonal noise.
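For readers unfamiliar with the term, "diagonal noise" means that each state variable is driven by its own independent Wiener process, so the noise enters the dynamics at every step rather than being added to a finished trajectory. The following base R sketch shows this with a plain Euler-Maruyama scheme. It is not the procedure used to generate the package datasets: the function `euler_maruyama_diag`, the decay system, and the noise amplitude are hypothetical choices for illustration.

```r
# Euler-Maruyama for dX_i = f_i(X) dt + g_i(X) dW_i, where each component i
# has its own independent Wiener increment dW_i (diagonal noise).
euler_maruyama_diag <- function(drift, diffusion, x0, dt, steps) {
  x <- matrix(NA_real_, nrow = steps + 1, ncol = length(x0))
  x[1, ] <- x0
  for (k in seq_len(steps)) {
    # One independent N(0, dt) increment per state variable.
    dW <- rnorm(length(x0), mean = 0, sd = sqrt(dt))
    x[k + 1, ] <- x[k, ] + drift(x[k, ]) * dt + diffusion(x[k, ]) * dW
  }
  x
}

set.seed(1)
drift     <- function(x) -x        # hypothetical system: simple decay
diffusion <- function(x) 0.1 * x   # hypothetical multiplicative noise

path <- euler_maruyama_diag(drift, diffusion, x0 = c(1, 2),
                            dt = 0.01, steps = 500)
print(dim(path))
```

Because the random increments feed back into later states, the resulting trajectories differ from a deterministic solution with independent measurement noise added afterwards.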
The repressilator is a synthetic gene circuit designed by Elowitz & Leibler (2000). A negative-feedback loop, together with similar mRNA and protein degradation rates, makes the system exhibit oscillatory behaviour. Under the deterministic ODEs below, the protein dynamics settle into oscillations with constant period and amplitude after some time.
\begin{equation}
\frac{d}{dt}m_1(t) = \alpha_0 + \frac{\alpha}{1 + p_3(t)^n} - m_1(t) \qquad\qquad \frac{d}{dt}p_1(t) = \beta m_1(t) - \beta p_1(t)
\end{equation}
\begin{equation}
\frac{d}{dt}m_2(t) = \alpha_0 + \frac{\alpha}{1 + p_1(t)^n} - m_2(t) \qquad\qquad \frac{d}{dt}p_2(t) = \beta m_2(t) - \beta p_2(t)
\end{equation}
\begin{equation}
\frac{d}{dt}m_3(t) = \alpha_0 + \frac{\alpha}{1 + p_2(t)^n} - m_3(t) \qquad\qquad \frac{d}{dt}p_3(t) = \beta m_3(t) - \beta p_3(t)
\end{equation}
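where the $m_i$ are the mRNA concentrations and the $p_i$ are the protein concentrations. $\alpha_0$ is the basal transcription rate of mRNA, $\alpha$ is the additional transcription rate reached in the absence of repressor, and $n = 2$ is the Hill coefficient. In the equations as written, the mRNA decay rate is scaled to 1, and $\beta$ sets both the production and the decay rate of all three protein species.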
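The six ODEs can be integrated directly in base R. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme to stay free of extra dependencies. The helper names `repressilator_rhs` and `rk4`, the parameter values, the initial state, and the step size are illustrative assumptions, not the settings used to generate the `Repressilator` dataset.

```r
# Right-hand side of the repressilator ODEs.
# state = c(m1, m2, m3, p1, p2, p3)
repressilator_rhs <- function(state, alpha0, alpha, beta, n) {
  m <- state[1:3]
  p <- state[4:6]
  repressor <- p[c(3, 1, 2)]  # m1 is repressed by p3, m2 by p1, m3 by p2
  dm <- alpha0 + alpha / (1 + repressor^n) - m
  dp <- beta * m - beta * p
  c(dm, dp)
}

# Classical fixed-step RK4 integrator; returns a (steps + 1) x length(state) matrix.
rk4 <- function(rhs, state, dt, steps, ...) {
  out <- matrix(NA_real_, nrow = steps + 1, ncol = length(state))
  out[1, ] <- state
  for (k in seq_len(steps)) {
    k1 <- rhs(state, ...)
    k2 <- rhs(state + dt / 2 * k1, ...)
    k3 <- rhs(state + dt / 2 * k2, ...)
    k4 <- rhs(state + dt * k3, ...)
    state <- state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    out[k + 1, ] <- state
  }
  out
}

# Hypothetical initial state and parameters, chosen only for illustration.
init <- c(m1 = 1, m2 = 0, m3 = 0, p1 = 2, p2 = 1, p3 = 3)
traj <- rk4(repressilator_rhs, init, dt = 0.05, steps = 2000,
            alpha0 = 1, alpha = 100, beta = 5, n = 2)
colnames(traj) <- names(init)

# Plot the three protein trajectories against time.
matplot(seq(0, by = 0.05, length.out = nrow(traj)), traj[, 4:6],
        type = "l", lty = 1, xlab = "time", ylab = "protein concentration")
```

Each row of `traj` is one time step, so the matrix has the same samples-by-species layout as the learning samples described earlier.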



















