    # Attach the DeLorean data frame to access members
    attach(dl)

The standard deviation of cell pseudotimes around the cell capture times has been set as
$$
\sigma_\tau = `r opts$sigma.tau`
$$
The length scale of the gene expression profiles has been set as
$$
l = `r hyper$l`
$$

We examine the variance both between and within time points. We expect some of the variation within a time point to be associated with noise in the temporal dimension; that is, estimating the pseudotime correctly should reduce the within-time variance. The rest of the variation within a time point will be due to noise in the expression measurements. Conversely, we expect most of the variation between time points to be due to temporal variation, although some will be due to noise.

Let
$$
k_c \in \{\kappa_1, \dots, \kappa_T\}
$$
be the time point at which cell $c$ was captured. We partition the cells
by their capture time points, indexed by $1 \le t \le T$:
$$
\mathcal{K}_t = \{c : k_c = \kappa_t\}
$$
We group the expression measurements by gene and observed capture time to
calculate means and variances:
$$
\begin{align}
\mathbb{M}_{g,t} &= \text{Mean}_{c \in \mathcal{K}_t}\{x_{g,c}\} \\
\mathbb{V}_{g,t} &= \text{Var}_{c \in \mathcal{K}_t}\{x_{g,c}\}
\end{align}
$$
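
The grouping above can be sketched in base R. This is a minimal illustration, not the package's own code: the genes-by-cells matrix `x`, the vector `capture` of per-cell capture times and the helper `by.time` are all hypothetical names, and the toy values are invented. It also assumes that Var denotes the sample variance (as computed by `var()`), which the text does not state.

```r
# Toy data: 3 genes (rows) by 6 cells (columns); values are illustrative only.
x <- matrix(c(1, 2, 3, 4, 5, 6,
              2, 2, 4, 4, 6, 6,
              0, 1, 0, 1, 0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("g", 1:3), paste0("c", 1:6)))
# Capture time for each cell (hypothetical values).
capture <- c(0, 0, 24, 24, 48, 48)

# Apply a summary function to each gene within each capture time.
# Returns a genes-by-time-points matrix.
by.time <- function(expr, times, fn) {
  groups <- split(seq_along(times), times)
  sapply(groups, function(cells) apply(expr[, cells, drop = FALSE], 1, fn))
}

M <- by.time(x, capture, mean)  # M[g, t]: mean over the cells captured at time t
V <- by.time(x, capture, var)   # V[g, t]: sample variance over those cells
```

Both results have one row per gene and one column per capture time, so the later per-gene summaries reduce to row-wise operations on `M` and `V`.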

We estimate the gene-specific noise levels by assuming that all the within-time
variation in the data is due to noise. This should be a slight overestimate as
some variation will be due to noise in the pseudotemporal dimension.
$$
\hat{\omega}_g = \textrm{Mean}_t\{\mathbb{V}_{g,t}\}
$$
This gives us the following fit for our empirical Bayes prior on $\log \hat{\omega}_g$:

    (
        ggplot(gene.var, aes(x=log(omega.hat)))
        + geom_density()
        + geom_rug()
        + stat_function(
            fun=function(x) dnorm(x, mean=hyper$mu_omega, sd=hyper$sigma_omega),
            colour="blue", alpha=.7, linetype="dashed")
    )
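
The noise estimate described above amounts to a row-wise mean of the within-time variances. The sketch below uses a hypothetical matrix `V` of invented values. How the hyperparameters of the normal prior are actually obtained is not stated here, so the moment-matching step (sample mean and standard deviation of the log estimates) is an assumption, and `mu.omega` and `sigma.omega` are illustrative names rather than the package's.

```r
# Hypothetical within-time variances V[g, t]: 4 genes by 3 capture times.
V <- matrix(c(0.5, 0.5, 0.5,
              1.0, 2.0, 3.0,
              0.1, 0.2, 0.3,
              4.0, 4.0, 4.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("g", 1:4), c("0", "24", "48")))

# Noise estimate per gene: unweighted mean of the within-time variances.
omega.hat <- rowMeans(V)

# Assumption: fit the normal prior on log(omega.hat) by matching moments.
mu.omega <- mean(log(omega.hat))
sigma.omega <- sd(log(omega.hat))
```

Note that a gene with zero variance at every time point would give `log(0) = -Inf`, so such genes would need filtering before any fit of this kind.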

We estimate the temporal variation by calculating the expected variance of samples
drawn at the capture times from the expression profile of a gene $g$. This depends on
our fixed length scale and the unknown temporal variance $\psi_g$. The
covariance of these samples will be $\psi_g \hat{\Sigma}$ where
$$
\hat{\Sigma}_{t_1,t_2} = \Sigma_\tau(\kappa_{t_1}, \kappa_{t_2})
$$
The expected variance of our samples is $\psi_g V_{\hat{\Sigma}}$ where
$$
V_{\hat{\Sigma}} = \textrm{Mean}\{\textrm{Diag}(\hat{\Sigma})\} - \textrm{Mean}\{\hat{\Sigma}\}
$$
and we can slightly overestimate the temporal variances $\psi_g$ by ignoring the
noise in our data
$$
\hat{\psi}_g = \frac{\textrm{Var}_t\{\mathbb{M}_{g,t}\}}{V_{\hat{\Sigma}}}
$$
This gives us the following fit for our empirical Bayes prior on the temporal variation
$\log \hat{\psi}_g$:

    (
        ggplot(gene.var, aes(x=log(psi.hat)))
        + geom_density()
        + geom_rug()
        + stat_function(
            fun=function(x) dnorm(x, mean=hyper$mu_psi, sd=hyper$sigma_psi),
            colour="blue", alpha=.7, linetype="dashed")
    )
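
The temporal variance estimate can be sketched as follows. The covariance function $\Sigma_\tau$ is not defined in this section, so a unit-variance squared exponential kernel is used purely as a stand-in. The names `kappa`, `cov.fn`, `Sigma.hat`, `V.Sigma` and `M`, and all numeric values, are hypothetical.

```r
# Hypothetical capture times and length scale.
kappa <- c(0, 24, 48)
l <- 24

# Stand-in covariance function (squared exponential, unit variance).
# Assumption: the model's own covariance function may differ.
cov.fn <- function(t1, t2, l) exp(-(t1 - t2)^2 / (2 * l^2))

# Sigma.hat[t1, t2] = cov.fn(kappa[t1], kappa[t2], l)
Sigma.hat <- outer(kappa, kappa, cov.fn, l = l)

# Expected variance of samples at the capture times, per unit of psi:
# mean of the diagonal minus mean of all entries.
V.Sigma <- mean(diag(Sigma.hat)) - mean(Sigma.hat)

# Hypothetical per-gene, per-time-point means M[g, t].
M <- matrix(c(1.5, 3.5, 5.5,
              2.0, 2.0, 2.0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("g1", "g2"), as.character(kappa)))

# Temporal variance estimate: variance of the means across time points,
# scaled by V.Sigma. Noise in the means is ignored, as in the text.
# Note: var() divides by T - 1, whereas mean(diag) - mean(all) is the expected
# variance with a divisor of T; which convention is intended is not stated.
psi.hat <- apply(M, 1, var) / V.Sigma
```

A gene whose mean expression is flat across capture times (`g2` here) gets $\hat{\psi}_g = 0$, so its log would be undefined. Such genes would need special handling before fitting a prior on the log scale.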

The correlation between the temporal variation and noise estimates is
`r with(gene.var, round(cor(psi.hat, omega.hat), digits=2))`.
Plotting the estimated noise levels against temporal variation gives:

    (
        ggplot(gene.var, aes(x=log10(psi.hat), y=log10(omega.hat)))
        + geom_point()
        + geom_abline(intercept=0, slope=1, linetype="dashed", alpha=.7)
    )

How big are the cell size estimates relative to the average variation in gene expression? The vertical line marks the square root of the mean of the per-gene expression variances.

    gene.vars <- apply(expr, 1, var)
    mean.gene.variation <- sqrt(mean(gene.vars))
    ggplot(cell.sizes, aes(x=S.hat)) +
        geom_histogram() +
        geom_vline(xintercept=mean.gene.variation)

    # Detach the previously attached DeLorean data frame
    detach(dl)
