stochprof.loop: Maximum likelihood estimation for the parameters in the...

Description Usage Arguments Details Value Author(s) References

View source: R/stochprof.loop.R

Description

Maximum likelihood estimation for the parameters in the stochastic profiling model. Because the log-likelihood function is potentially multimodal, no straightforward use of gradient-based approaches for finding globally optimal parameter combinations is possible. To tackle this challenge, this function performs a two-step estimation procedure.

Usage

1
2
3
4
stochprof.loop(model, dataset, n, TY, genenames = NULL, fix.mu = F,
   fixed.mu, par.range = NULL, prev.result = NULL, loops = 5,
   until.convergence = T, print.output = T, show.plots = T,
   plot.title = "", pdf.file, use.constraints = F, subgroups)

Arguments

model

model for which one wishes to estimate the parameters: "LN-LN", "rLN-LN" or "EXP-LN"

dataset

matrix which contains the cumulated expression data over all cells in a tissue sample. Columns represent different genes, rows represent different tissue samples.

n

number of cells taken from each tissue sample. This can also be a vector stating how many cells are in each sample seperatly.

TY

number of types of cells that is assumed in the stochastic model

genenames

names of the genes in the dataset. For genenames==NULL, the genes will simply be enumerated according to the column numbers in the dataset.

fix.mu

if TRUE, the log-means of the lognormal distributions are kept fixed in the estimation procedure. Otherwise, they are to be estimated.

fixed.mu

vector containing the values to which the log-means should be fixed if fix.mu==T. The order of components is as follows:

(mu_type_1_gene_1, mu_type_1_gene_2, ...,

mu_type_2_gene_1, mu_type_2_gene_2, ...).

This argument needs to be specified only when fix.mu==T.

par.range

range from which the parameter values should be randomly drawn if there is no knowledge from previous iterations of the search algorithm available. This is a matrix with the number of rows being equal to the number of model parameters. The first column contains the lower bound, the second column the upper bound. If par.range==NULL, some rather large range is defined.

prev.result

can contain results from former calls of this function

loops

maximal number of loops carried out in the estimation procedure. Each loops involves various methods to determine the high-likelihood region.

until.convergence

if TRUE, the estimation process is terminated if there had been no improvement concerning the value of the target function between two consecutive loops. Otherwise, the estimation procedure is terminated according to the parameter "loops".

print.output

if TRUE, interim results of the grid search and numerical optimization are printed into the console throughout the estimation procedure

show.plots

if TRUE, profile log-likelihood plots are displayed at the end of the estimation procedure

plot.title

title of each plot if show.plots==T

pdf.file

optional filename. If this is not missing and show.plots==T, the profile log-likelihoods will be plotted into this file.

use.constraints

if TRUE, constraints on the individual population densities are applied; see penalty.constraint.LNLN, penalty.constraint.rLNLN and
penalty.constraint.EXPLN for details.

subgroups

list of sets of gene numbers. This parameter should be given only when the present call of stochprof.loop is based on a subanalysis of the subgroups of genes with non-fixed mu. The parameter is used only for calculation of the adjusted BIC which takes into account the number of parameters that had to be estimated during the whole estimation procedure: First, for each of the subclusters, and then for the final analysis.

Details

This function carries out maximum likelihood estimation for the parameters of the stochastic profiling model. Because the log-likelihood function is potentially multimodal, no straightforward use of gradient-based approaches for finding globally optimal parameter combinations is possible. To tackle this challenge, this function performs a two-step estimation procedure: First, it computes the log-likelihood function at randomly drawn parameter combinations to identify high-likelihood regions in parameter space at computationally low cost. Then, it uses the Nelder-Mead algorithm to identify local maxima of the likelihood function. The starting values for this algorithm are randomly drawn from the high-likelihood regions determined in the first step. To further localize the global optimum, the function again performs grid searches of the parameter space, this time around the optimum identified by the Nelder-Mead algorithm. This search creates another space to identify high-likelihood regions, which are then used to seed another Nelder-Mead optimization.

Value

A list with the following components:

mle

maximum likelihood estimate

neg-loglikeli

value of the negative log-likelihood function at maximum likelihood estimate

ci

approximate marginal maximum likelihood confidence intervals for the maximum likelihood estimate

pargrid

matrix containing parameter combinations and according values of the target function

bic

Bayesian information criterion value

adj.bic

adjusted Bayesian information criterion value which takes into account the numbers of parameters that were estimated during the preanalysis of a gene cluster. Is only calculated if parameter subgroups is given, otherwise set to NULL.

pen

penalization for densities not fulfilling required constraints. If use.constraints is FALSE, this has no practical meaning. If use.constraints is TRUE, this value is included in loglikeli.

Author(s)

Lisa Amrhein, Christiane Fuchs

Maintainer: Lisa Amrhein <amrheinlisa@gmail.com>

References

"Parameterizing cell-to-cell regulatory heterogeneities via stochastic transcriptional profiles" by Sameer S Bajikar*, Christiane Fuchs*, Andreas Roller, Fabian J Theis^ and Kevin A Janes^: PNAS 2014, 111(5), E626-635 (* joint first authors, ^ joint last authors) <doi:10.1073/pnas.1311647111>

"Pheno-seq - linking visual features and gene expression in 3D cell culture systems" by Stephan M. Tirier, Jeongbin Park, Friedrich Preusser, Lisa Amrhein, Zuguang Gu, Simon Steiger, Jan-Philipp Mallm, Teresa Krieger, Marcel Waschow, Bjoern Eismann, Marta Gut, Ivo G. Gut, Karsten Rippe, Matthias Schlesner, Fabian Theis, Christiane Fuchs, Claudia R. Ball, Hanno Glimm, Roland Eils & Christian Conrad: Sci Rep 9, 12367 (2019) <doi:10.1038/s41598-019-48771-4>


lisaamrhein/stochprofML documentation built on Dec. 25, 2021, 9:02 p.m.