analyze.toycluster: Analysis of toyclusters in stochastic profiling model
In stochprofML: Stochastic Profiling using Maximum Likelihood Estimation

Estimation of the model parameters for the 12-gene toyclusters provided in this package. This is done in three steps: an optional preanalysis of the single genes, an analysis of three 4-gene subclusters, and finally the analysis of the entire 12-gene cluster.

analyze.toycluster(model = "LN-LN", data.model = "LN-LN", TY = 2, preanalyze = T, show.plots = T, use.constraints = F)

`model`	model for which one wishes to estimate the parameters: "LN-LN", "rLN-LN" or "EXP-LN"
`data.model`	model which has generated the 12-gene dataset: "LN-LN", "rLN-LN" or "EXP-LN"
`TY`	number of types of cells that is assumed in the stochastic model
`preanalyze`	if TRUE, the single-gene preanalysis as described below is carried out
`show.plots`	if TRUE, interim results are graphically displayed. This requires the user to confirm each new plot.
`use.constraints`	if TRUE, constraints on the individual population densities are applied; see `penalty.constraint.LNLN`, `penalty.constraint.rLNLN` and `penalty.constraint.EXPLN` for details.
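
A call with every argument spelled out might look like the following minimal sketch. It assumes the stochprofML package is installed. `show.plots` is set to FALSE so that no plot confirmations are requested, and `run_example` is a hypothetical switch added here (not part of the package) because the full analysis may be time-consuming.

```r
# Arguments as documented above; TRUE/FALSE are written out instead of T/F.
toy_args <- list(
  model = "LN-LN",
  data.model = "LN-LN",
  TY = 2,
  preanalyze = TRUE,
  show.plots = FALSE,      # avoid having to confirm each plot
  use.constraints = FALSE
)

# Check that the model names are among the documented choices.
allowed_models <- c("LN-LN", "rLN-LN", "EXP-LN")
stopifnot(toy_args$model %in% allowed_models,
          toy_args$data.model %in% allowed_models)

run_example <- FALSE  # hypothetical switch; set to TRUE to run the analysis
if (run_example && requireNamespace("stochprofML", quietly = TRUE)) {
  result <- do.call(stochprofML::analyze.toycluster, toy_args)
}
```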

This function estimates the model parameters for the toycluster.LNLN, toycluster.rLNLN or toycluster.EXPLN dataset. Each dataset contains perfectly observed measurements for 12 genes and 16 tissue samples, assuming 10-cell samplings and two different types of cells. The true underlying parameters are given on the help page for the datasets.
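
To make the structure of such a dataset concrete, here is a minimal base-R sketch that simulates data of the same shape under LN-LN-type assumptions. All parameter values are made up for illustration and are not the true parameters of the packaged toyclusters. The layout with samples in rows and genes in columns, and the sharing of a cell's type across all genes, are assumptions of this sketch.

```r
set.seed(1)
n_samples <- 16  # tissue samples
n_genes   <- 12  # genes in the cluster
n_cells   <- 10  # cells pooled into each sample

# Hypothetical parameters (not those of the packaged datasets).
p     <- 0.2                  # probability that a cell is of type 1
mu1   <- rep(1.5, n_genes)    # log-means for type 1, one per gene
mu2   <- rep(-0.5, n_genes)   # log-means for type 2, one per gene
sigma <- 0.2                  # common log-standard deviation

simulate_sample <- function() {
  # Draw each cell's type once; it applies to all genes of that cell.
  is_type1 <- runif(n_cells) < p
  vapply(seq_len(n_genes), function(g) {
    log_means <- ifelse(is_type1, mu1[g], mu2[g])
    # The measurement is the sum of the single-cell expressions.
    sum(rlnorm(n_cells, meanlog = log_means, sdlog = sigma))
  }, numeric(1))
}

# replicate() returns genes x samples, so transpose to samples x genes.
toy <- t(replicate(n_samples, simulate_sample()))
dim(toy)
```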

The estimation is performed in three steps:

In an optional preanalysis (carried out if preanalyze is TRUE), each gene is considered individually, i.e. the parameters are estimated separately for each gene: p, mu_1, mu_2 and sigma for LN-LN; p, mu_1, mu_2, sigma_1 and sigma_2 for rLN-LN; and p, mu, sigma and lambda for EXP-LN. This gives a rough idea of where the parameters lie at low computational cost, which might speed up the analysis of the larger clusters. From the confidence intervals of the single-gene estimates, one can construct appropriate parameter ranges for the following step.
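
One simple way to turn single-gene confidence intervals into a parameter range is sketched below for p. The interval values are invented for illustration, and taking the span of all intervals is just one possible choice, not necessarily what the package does internally.

```r
# Hypothetical single-gene confidence intervals for p (values are made up).
ci_p <- rbind(
  gene1 = c(lower = 0.10, upper = 0.35),
  gene2 = c(lower = 0.15, upper = 0.40),
  gene3 = c(lower = 0.05, upper = 0.30),
  gene4 = c(lower = 0.12, upper = 0.38)
)

# Span of all intervals: smallest lower bound to largest upper bound.
p_range <- c(lower = min(ci_p[, "lower"]), upper = max(ci_p[, "upper"]))
p_range
```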

In the main step of the estimation procedure, the 12 genes are divided into three groups of four. This is because the stochastic profiling model for 12 genes involves 48 (LN-LN and EXP-LN) to 49 (rLN-LN) parameters, and estimating that many parameters at once is computationally expensive and sometimes unreliable. Simulation studies showed that datasets comprising four genes are sufficient to estimate the log-means when data from 16 experiments are available. For each of these 4-gene clusters, 10 (LN-LN and EXP-LN) or 11 (rLN-LN) parameters are estimated. The three groups result from a hierarchical clustering of the entire dataset. The gene numbers are (7,5,2,8), (1,3,4,10) and (9,6,12,11) for the LN-LN model, (12,9,6,11), (4,10,5,3) and (1,7,8,2) for the rLN-LN model and (11,1,10,9), (3,5,8,7) and (4,2,12,6) for the EXP-LN model.
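
A grouping of this kind can be sketched with base R's `hclust` and `cutree`. The data below are synthetic, and the distance measure and linkage are R's defaults, which is an assumption: the text does not state which ones were used for the toyclusters. Note that `cutree` does not in general guarantee groups of equal size; here the synthetic data are constructed so that three groups of four emerge.

```r
set.seed(42)
n_samples <- 16

# Three hypothetical log-expression profiles; four genes follow each one.
profiles <- matrix(rnorm(n_samples * 3, sd = 3), nrow = n_samples)

# Deliberately interleaved gene-to-profile assignment (12 genes in total).
profile_of_gene <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3)
log_expr <- profiles[, profile_of_gene] + rnorm(n_samples * 12, sd = 0.1)
colnames(log_expr) <- paste0("gene", 1:12)

# Cluster the genes (columns), so transpose before computing distances.
tree   <- hclust(dist(t(log_expr)))
groups <- cutree(tree, k = 3)

# List the genes in each of the three groups.
split(names(groups), groups)
```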

In the final step, the log-means mu are fixed to the maximum likelihood estimates obtained in the main step. Only p, sigma and, for the EXP-LN model, lambda then remain to be estimated, and these are inferred from the entire 12-gene cluster.
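
The idea of fixing some parameters and optimizing only the remaining ones can be illustrated on a much simpler problem. The sketch below uses a plain lognormal sample, not the stochastic profiling likelihood, and all values are made up: the log-mean is held fixed and only the log-standard deviation is estimated by minimizing the negative log-likelihood.

```r
set.seed(7)
x <- rlnorm(200, meanlog = 1.5, sdlog = 0.4)  # made-up data

# Stands in for an estimate carried over from an earlier step.
mu_fixed <- 1.5

# Negative log-likelihood as a function of sigma only; mu stays fixed.
neg_loglik <- function(sigma) {
  -sum(dlnorm(x, meanlog = mu_fixed, sdlog = sigma, log = TRUE))
}

# One-dimensional minimization over a generous interval for sigma.
fit <- optimize(neg_loglik, interval = c(0.01, 5))
sigma_hat <- fit$minimum

# For this simple model the estimate is also available in closed form.
sigma_closed <- sqrt(mean((log(x) - mu_fixed)^2))
c(numerical = sigma_hat, closed_form = sigma_closed)
```

Fixing the log-mean reduces the search to one dimension here; the same reasoning is why fixing the log-means leaves only a few parameters to estimate in the final step.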

Throughout the estimation process, interim results are printed to the console and, if show.plots is TRUE, graphically displayed.

The final result for the chosen 12-gene cluster: a list as returned by stochprof.loop, with the following components:

`mle`	maximum likelihood estimate
`neg-loglikeli`	value of the negative log-likelihood function at maximum likelihood estimate
`ci`	approximate marginal maximum likelihood confidence intervals for the maximum likelihood estimate
`pargrid`	matrix containing parameter combinations and the corresponding values of the target function
`bic`	Bayesian information criterion value
`adj.bic`	adjusted Bayesian information criterion value, which takes into account the number of parameters that were estimated during the preanalysis of the gene cluster
`pen`	penalization for densities not fulfilling required constraints. If `use.constraints` is FALSE, this has no practical meaning. If `use.constraints` is TRUE, this value is included in `loglikeli`.
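
Because a name such as `neg-loglikeli` contains a hyphen, it cannot be used after `$` without backticks. The sketch below uses a mock list with made-up numbers and assumes the component names are exactly as listed above; on a real result, `names(result)` shows the actual names.

```r
# Mock result with made-up numbers; only two components are shown.
result <- list(
  mle = c(p = 0.2, sigma = 0.25),
  "neg-loglikeli" = 123.4
)

# Syntactic names work with $ directly.
result$mle["p"]

# Non-syntactic names need [[ ]] or backticks.
result[["neg-loglikeli"]]
result$`neg-loglikeli`
```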

Lisa Amrhein, Christiane Fuchs

Maintainer: Lisa Amrhein <amrheinlisa@gmail.com>

"Parameterizing cell-to-cell regulatory heterogeneities via stochastic transcriptional profiles" by Sameer S Bajikar*, Christiane Fuchs*, Andreas Roller, Fabian J Theis^ and Kevin A Janes^: PNAS 2014, 111(5), E626-635 (* joint first authors, ^ joint last authors) <doi:10.1073/pnas.1311647111>

"Pheno-seq - linking visual features and gene expression in 3D cell culture systems" by Stephan M. Tirier, Jeongbin Park, Friedrich Preusser, Lisa Amrhein, Zuguang Gu, Simon Steiger, Jan-Philipp Mallm, Teresa Krieger, Marcel Waschow, Bjoern Eismann, Marta Gut, Ivo G. Gut, Karsten Rippe, Matthias Schlesner, Fabian Theis, Christiane Fuchs, Claudia R. Ball, Hanno Glimm, Roland Eils & Christian Conrad: Sci Rep 9, 12367 (2019) <doi:10.1038/s41598-019-48771-4>

stochprofML documentation built on July 1, 2020, 5:18 p.m.