analyze.toycluster: Analysis of toyclusters in stochastic profiling model



Description

Estimation of the model parameters for the 12-gene toyclusters provided in this package. This is done in three steps: an optional preanalysis of the single genes, an analysis of three 4-gene subclusters, and finally the analysis of the entire 12-gene cluster.

Usage

analyze.toycluster(model = "LN-LN", data.model = "LN-LN", TY = 2,
   preanalyze = T, show.plots = T, use.constraints = F)
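
A call might look like the following sketch. It assumes the stochprofML package is installed; the variable names toy_args and result are only illustrative. Plots are switched off here so that no interactive confirmation is needed, and the call is guarded because the three-step estimation may take a while.

```r
# Arguments spelled out with TRUE/FALSE rather than the T/F shorthand,
# since T and F are ordinary variables that can be reassigned.
toy_args <- list(
  model           = "LN-LN",   # model to be fitted
  data.model      = "LN-LN",   # model that generated the toy dataset
  TY              = 2,         # number of cell types
  preanalyze      = TRUE,      # run the single-gene preanalysis
  show.plots      = FALSE,     # skip plots, so no confirmation prompts
  use.constraints = FALSE
)

# Run only in an interactive session with stochprofML installed.
if (interactive() && requireNamespace("stochprofML", quietly = TRUE)) {
  result <- do.call(stochprofML::analyze.toycluster, toy_args)
  print(result$mle)
}
```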

Arguments

model

model for which one wishes to estimate the parameters: "LN-LN", "rLN-LN" or "EXP-LN"

data.model

model which has generated the 12-gene dataset: "LN-LN", "rLN-LN" or "EXP-LN"

TY

number of cell types assumed in the stochastic model

preanalyze

if TRUE, the single-gene preanalysis as described below is carried out

show.plots

if TRUE, interim results are graphically displayed. This requires the user to confirm each new plot.

use.constraints

if TRUE, constraints on the individual population densities are applied; see penalty.constraint.LNLN, penalty.constraint.rLNLN and penalty.constraint.EXPLN for details.

Details

This function estimates the model parameters for the toycluster.LNLN, toycluster.rLNLN or toycluster.EXPLN dataset. Each dataset contains perfectly observed measurements for 12 genes and 16 tissue samples, assuming 10-cell samplings and two different types of cells. The true underlying parameters are given on the help page for the datasets.
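
To make the data structure concrete, the sketch below simulates 16 pooled measurements for one gene. It is not the code that generated the toy datasets: the function name, the parameter values and the reading of p as the probability of the first cell type are assumptions chosen for illustration. Each measurement is taken to be the summed expression of 10 cells, each drawn from one of two lognormal populations with a common sigma.

```r
# Sketch: one pooled measurement for a single gene under a two-type
# lognormal mixture. All parameter values are made up.
simulate_pooled_sample <- function(n_cells = 10, p = 0.2,
                                   mu = c(1.5, -1.0), sigma = 0.3) {
  # Each cell is of type 1 with probability p, otherwise of type 2.
  is_type1 <- rbinom(n_cells, size = 1, prob = p) == 1
  # The log-mean depends on the cell type; sigma is shared by both types.
  meanlog <- ifelse(is_type1, mu[1], mu[2])
  # The measurement is the summed expression of all cells in the pool.
  sum(rlnorm(n_cells, meanlog = meanlog, sdlog = sigma))
}

set.seed(1)
toy_gene <- replicate(16, simulate_pooled_sample())  # 16 tissue samples
summary(toy_gene)
```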

The estimation is performed in three steps:

In an optional preanalysis (carried out if preanalyze is TRUE), each gene is considered individually: for each gene, the model parameters are estimated separately. These are p, mu_1, mu_2 and sigma for LN-LN; p, mu_1, mu_2, sigma_1 and sigma_2 for rLN-LN; and p, mu, sigma and lambda for EXP-LN. This gives a rough idea of where the parameters lie at low computational cost and may speed up the analysis of the larger clusters. From the confidence intervals of the single-gene estimates, one can construct appropriate parameter ranges for the following step.

In the main step of the estimation procedure, the 12 genes are divided into three groups of four. This is because the stochastic profiling model for 12 genes involves 48 (LN-LN and EXP-LN) to 49 (rLN-LN) parameters, which makes estimation computationally expensive and sometimes unreliable. Simulation studies showed that datasets comprising four genes are sufficient to estimate the log-means when data from 16 experiments are available. For each of these 4-gene clusters, 10 (LN-LN and EXP-LN) or 11 (rLN-LN) parameters are estimated. The three groups result from a hierarchical clustering of the entire dataset. The gene numbers are (7,5,2,8), (1,3,4,10) and (9,6,12,11) for the LN-LN model, (12,9,6,11), (4,10,5,3) and (1,7,8,2) for the rLN-LN model and (11,1,10,9), (3,5,8,7) and (4,2,12,6) for the EXP-LN model.
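
The gene groups can be written down and checked as in the following sketch. The helper count_params is hypothetical and rests on an assumption about how parameters are shared: p and the sigmas are common to all genes in a cluster, while the log-means (and lambda for EXP-LN) are gene-specific. Under that assumption it reproduces the counts stated here for a single gene (4, 5 and 4) and for a 4-gene cluster (10, 11 and 10).

```r
# Gene groups as listed above, one entry per data-generating model.
gene_groups <- list(
  "LN-LN"  = list(c(7, 5, 2, 8),   c(1, 3, 4, 10), c(9, 6, 12, 11)),
  "rLN-LN" = list(c(12, 9, 6, 11), c(4, 10, 5, 3), c(1, 7, 8, 2)),
  "EXP-LN" = list(c(11, 1, 10, 9), c(3, 5, 8, 7),  c(4, 2, 12, 6))
)

# Each grouping should use every gene 1..12 exactly once.
is_partition <- vapply(
  gene_groups,
  function(groups) identical(sort(unlist(groups)), as.numeric(1:12)),
  logical(1)
)
print(is_partition)

# Hypothetical helper: parameter count for a cluster of n_genes genes and
# two cell types, ASSUMING p and the sigmas are shared across genes while
# log-means (and lambda for EXP-LN) are gene-specific.
count_params <- function(model, n_genes) {
  switch(model,
    "LN-LN"  = 1 + 2 * n_genes + 1,        # p, mu_1/mu_2 per gene, sigma
    "rLN-LN" = 1 + 2 * n_genes + 2,        # p, mu_1/mu_2 per gene, sigma_1, sigma_2
    "EXP-LN" = 1 + n_genes + 1 + n_genes,  # p, mu per gene, sigma, lambda per gene
    stop("unknown model: ", model)
  )
}

sapply(c("LN-LN", "rLN-LN", "EXP-LN"), count_params, n_genes = 4)
```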

In the final step, the log-means mu are fixed to the maximum likelihood estimates obtained in the main step. Only p, sigma and possibly lambda then remain to be estimated for the entire 12-gene cluster.
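
The idea of fixing some parameters at earlier estimates and then estimating the rest can be illustrated on a much simpler problem. The sketch below uses a single lognormal population and simulated data, not the stochastic profiling likelihood or the package's own optimisation routine; all names and values are illustrative.

```r
set.seed(42)
y <- rlnorm(200, meanlog = 1, sdlog = 0.5)   # simulated toy data

# Negative log-likelihood of a single lognormal population.
nll <- function(mu, sigma) {
  -sum(dlnorm(y, meanlog = mu, sdlog = sigma, log = TRUE))
}

# Stage 1: estimate mu and log(sigma) jointly.
stage1 <- optim(c(0, 0), function(theta) nll(theta[1], exp(theta[2])))
mu_hat <- stage1$par[1]

# Stage 2: keep mu fixed at its estimate and re-estimate sigma alone.
stage2 <- optimize(function(log_sigma) nll(mu_hat, exp(log_sigma)),
                   interval = c(-5, 5))
sigma_hat <- exp(stage2$minimum)

c(mu = mu_hat, sigma = sigma_hat)
```

Working on log(sigma) keeps the optimiser away from non-positive standard deviations without needing box constraints.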

Throughout the estimation process, interim results are printed to the console and, if show.plots is TRUE, displayed graphically.

Value

The final result for the chosen 12-gene cluster. This is a list as returned by stochprof.loop, with the following components:

mle

maximum likelihood estimate

neg-loglikeli

value of the negative log-likelihood function at the maximum likelihood estimate

ci

approximate marginal maximum likelihood confidence intervals for the maximum likelihood estimate

pargrid

matrix containing parameter combinations and the corresponding values of the target function

bic

Bayesian information criterion value

adj.bic

adjusted Bayesian information criterion value, which takes into account the number of parameters that were estimated during the preanalysis of the gene cluster

pen

penalization for densities not fulfilling required constraints. If use.constraints is FALSE, this has no practical meaning. If use.constraints is TRUE, this value is included in loglikeli.
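
As a rough illustration of how the information criteria might be used, the sketch below shows the generic BIC formula and a comparison of fits by adj.bic. The exact definitions used by the package for bic and adj.bic are not given here, so bic_from_nll is only the textbook form, and the numbers in fits are made-up placeholders rather than results of any analysis.

```r
# Generic BIC computed from a negative log-likelihood:
#   BIC = 2 * nll + k * log(n)
# with k estimated parameters and n observations.
bic_from_nll <- function(nll, k, n) 2 * nll + k * log(n)

# Mock result lists with made-up numbers, mimicking two of the
# components described above.
fits <- list(
  "LN-LN"  = list(bic = 512.3, adj.bic = 530.1),
  "rLN-LN" = list(bic = 515.8, adj.bic = 534.0),
  "EXP-LN" = list(bic = 560.2, adj.bic = 577.9)
)

# A lower value indicates the preferred model.
adj_bic <- vapply(fits, function(fit) fit$adj.bic, numeric(1))
best_model <- names(which.min(adj_bic))
best_model
```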

Author(s)

Lisa Amrhein, Christiane Fuchs

Maintainer: Lisa Amrhein <amrheinlisa@gmail.com>

References

"Parameterizing cell-to-cell regulatory heterogeneities via stochastic transcriptional profiles" by Sameer S Bajikar*, Christiane Fuchs*, Andreas Roller, Fabian J Theis^ and Kevin A Janes^: PNAS 2014, 111(5), E626-635 (* joint first authors, ^ joint last authors) <doi:10.1073/pnas.1311647111>

"Pheno-seq - linking visual features and gene expression in 3D cell culture systems" by Stephan M. Tirier, Jeongbin Park, Friedrich Preusser, Lisa Amrhein, Zuguang Gu, Simon Steiger, Jan-Philipp Mallm, Teresa Krieger, Marcel Waschow, Bjoern Eismann, Marta Gut, Ivo G. Gut, Karsten Rippe, Matthias Schlesner, Fabian Theis, Christiane Fuchs, Claudia R. Ball, Hanno Glimm, Roland Eils & Christian Conrad: Sci Rep 9, 12367 (2019) <doi:10.1038/s41598-019-48771-4>


stochprofML documentation built on July 1, 2020, 5:18 p.m.