estimate_noiseparameters: Estimates noise in single cell data.
In BEARscc: BEARscc (Bayesian ERCC Assesstment of Robustness of Single Cell Clusters)


Estimates the drop-out model and technical variance from spike-ins present in the sample.

estimate_noiseparameters(SCEList, plot=FALSE, sd_inflate=0, max_cumprob=0.9999,
    bins=10, write.noise.model=TRUE, file="noise_estimation",
    dropout_inflate=1, model_view=c("Observed", "Optimized"),
    alpha_resolution=0.005, tie_function="maximum")

`SCEList`	A `SingleCellExperiment` object that must contain the observed counts matrix as `"observed_expression"` in `assays`, and must have the relevant spike-in samples identified using `isSpike()` as well as contain the expected actual concentrations of these spike-ins as `spikeConcentrations` in `metadata`. Please see the vignette for more detail about constructing the appropriate `SCEList`.
`plot`	When `plot=TRUE`, produces plots for investigating the quality of the data fits; the root file name is set by the `file` option.
`sd_inflate`	An optional parameter to modulate the estimated noise. The estimated standard deviation of spike-ins can be scaled by this factor. We recommend leaving the value at the default of 0.
`bins`	Determines the number of bins used to compare the quality of fit between the mixture model and the observed data for each spike-in alpha, in order to calculate the relationship between alpha and mean in the noise model. This should be set lower for small datasets and higher for datasets with more observations.
`max_cumprob`	Because a cumulative distribution will range from n=0 to a countable infinity, the event space needs to be set to cover a reasonable fraction of the probability density. This parameter determines the fraction of probability density covered by the event space, which in turn defines the highest count number in the event space. We recommend users use the default value of 0.9999.
`write.noise.model`	When `write.noise.model=TRUE`, outputs two tab-delimited files containing the dropout effects and noise model parameters; this allows users to run the noise generation on a separate high-performance compute node. The root file name is set by the `file` option.
`file`	Describes the root name for files written out by `write.noise.model` and `plot` options.
`dropout_inflate`	A scaling parameter for increasing explicitly the number of drop-outs present beyond those estimated by spike-ins. The value must be greater than 0 or an error will occur. Values below one will diminish drop-outs in simulated replicates, and values above one will increase drop-outs in simulated replicates. We recommend users use the default value of 1.
`model_view`	`model_view=c("Observed", "Optimized", "Poisson", "Neg. Binomial")` determines the statistical distributions that should be plotted for the ERCC plots output by `plot=TRUE`.
`alpha_resolution`	Because the alpha parameter is enumerated discretely and empirically evaluated for each value for each spike-in, it is necessary to specify the resolution (how small the step is between each explicit alpha test); this parameter defines the resolution of alpha values tested for maximum empirical fit to spike-ins. It is recommended that users utilize the default resolution.
`tie_function`	The parameter `tie_function=c("minimum", "maximum")` tells BEARscc how to handle tied alpha values when fitting the mixture model to an individual spike-in. If `maximum`, then BEARscc will choose the maximum alpha value with the best fit; conversely, if `minimum` is set, then BEARscc will choose the minimum alpha value with the best fit.
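The roles of `max_cumprob`, `alpha_resolution` and `tie_function` can be illustrated with a minimal base R sketch. It is not BEARscc's internal code: the Poisson event space, the mean count of 50, the fit errors and the helper `pick_alpha()` are all assumptions made for illustration only.

```r
# Illustration only; this is NOT BEARscc's implementation.

# max_cumprob: truncate the (countably infinite) event space at the
# smallest count whose cumulative probability reaches max_cumprob.
# A Poisson with a hypothetical mean of 50 stands in for the model.
max_cumprob <- 0.9999
spike_mean <- 50
max_count <- qpois(max_cumprob, lambda = spike_mean)
event_space <- 0:max_count
covered <- ppois(max_count, lambda = spike_mean)

# alpha_resolution: step size of the discrete grid of alpha values.
alpha_resolution <- 0.005
alpha_grid <- seq(0, 1, by = alpha_resolution)

# tie_function: when several alpha values fit equally well,
# keep either the largest or the smallest of them.
# pick_alpha() is a hypothetical helper, not a BEARscc function.
pick_alpha <- function(alpha, fit_error,
                       tie_function = c("maximum", "minimum")) {
  tie_function <- match.arg(tie_function)
  best <- alpha[fit_error == min(fit_error)]
  if (tie_function == "maximum") max(best) else min(best)
}

# Hypothetical fit errors with a tie between alpha = 0.25 and 0.75.
toy_alpha <- c(0, 0.25, 0.5, 0.75, 1)
toy_error <- c(3, 1, 2, 1, 4)
alpha_max <- pick_alpha(toy_alpha, toy_error, "maximum")
alpha_min <- pick_alpha(toy_alpha, toy_error, "minimum")
```

A finer `alpha_resolution` lengthens `alpha_grid` and therefore the number of fits evaluated per spike-in, which is why coarse values run faster but estimate the noise less precisely.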

BEARscc consists of three steps: modelling technical variance based on spike-ins (Step 1); simulating technical replicates (Step 2); and clustering simulated replicates (Step 3). In Step 1, an experiment-specific model of technical variability ("noise") is estimated using observed spike-in read counts. This model consists of two parts. In the first part, expression-dependent variance is approximated by fitting read counts of each spike-in across cells to a mixture model (see Methods). The second part addresses drop-out effects. Based on the observed drop-out rate for spike-ins of a given concentration, the 'drop-out injection distribution' models the likelihood that a given transcript concentration will result in a drop-out. The 'drop-out recovery distribution' is estimated from the drop-out injection distribution using Bayes' theorem and models the likelihood that a transcript that had no observed counts in a cell was a false negative. This function performs the first step of BEARscc. For further algorithmic detail, please refer to the methods section of our manuscript.
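The Bayes' theorem step can be sketched generically in base R. This is only an illustration of inverting a drop-out probability into a recovery probability, not the calculation BEARscc performs; the concentrations, drop-out rates and prior below are hypothetical values chosen for the example.

```r
# Illustration only; values are hypothetical, not BEARscc output.

# Hypothetical transcript concentrations.
concentration <- c(1, 2, 4, 8)

# "Injection" side: P(zero observed | concentration), of the kind
# that would be estimated from spike-in drop-out rates.
p_zero_given_conc <- c(0.80, 0.50, 0.20, 0.05)

# Hypothetical prior over concentrations, P(concentration).
prior <- c(0.4, 0.3, 0.2, 0.1)

# Bayes' theorem:
# P(concentration | zero) =
#   P(zero | concentration) * P(concentration) / P(zero)
p_zero <- sum(p_zero_given_conc * prior)
p_conc_given_zero <- p_zero_given_conc * prior / p_zero

recovery <- data.frame(concentration, p_conc_given_zero)
```

In this toy setting most of the posterior mass falls on the lowest concentration, matching the intuition that an observed zero is most likely to be a false negative for a weakly expressed transcript.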

The output of estimate_noiseparameters() is another SingleCellExperiment class object; however, four new annotations that describe the drop-out and variance models computed by BEARscc have been added to the metadata of the SingleCellExperiment object. Specifically:

`dropout_parameters`	A `data.frame` listing gene-wise parameters necessary for computing drop-out recovery and injection probabilities in order to define the two drop-out models for zero observation and positive values within the drop-out range by `simulate_replicates()`.
`spikein_parameters`	A `data.frame` of the estimated noise model parameters utilized by `simulate_replicates()` to simulate replicates in non-zero observations.
`genewiseDropouts`	A `data.frame` of the estimated probabilities used in the Bayes' calculation of the probabilities described in `dropout_parameters`. While these are not used in further analysis, they are supplied here for the user's reference.

Frequently, the user will want to compute simulated technical replicates in a high performance computational environment. While the function outputs the necessary information for create_noiseinjected_counts(), with the option write.noise.model=TRUE users can also save the two tab-delimited files needed to run HPC_generate_noise_matrices.R on a high performance computational cluster. The option file sets the root label of the files, "*_bayesianestimates.xls" and "*_parameters4randomize.xls".
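To inspect the written files before moving them to a cluster, something like the following base R sketch may help. The helpers `noise_model_files()` and `read_if_present()` are hypothetical, and the assumption that each file has a header row is not confirmed by this page; only the two file name suffixes and the tab-delimited format are taken from the text above.

```r
# Illustration only; both helpers are hypothetical, not part of BEARscc.

# Build the two file names from the root label given to `file`.
noise_model_files <- function(file = "noise_estimation") {
  c(bayesian_estimates = paste0(file, "_bayesianestimates.xls"),
    parameters = paste0(file, "_parameters4randomize.xls"))
}

# Despite the .xls extension, the files are described as
# tab-delimited text, so read.delim() is used rather than an Excel
# reader. A header row is assumed. Returns NULL if the file is absent.
read_if_present <- function(path) {
  if (file.exists(path)) {
    read.delim(path, stringsAsFactors = FALSE)
  } else {
    NULL
  }
}

paths <- noise_model_files("noise_estimation")
tables <- lapply(paths, read_if_present)
```

Each element of `tables` is either a `data.frame` or `NULL`, so a missing file can be detected before a cluster job is submitted.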

In the examples section, the parameter alpha_resolution is set to 0.25, which is far too coarse a resolution for estimating noise but allows the example to run in a reasonable time when checking the help files. We recommend the default parameter: alpha_resolution=0.005.

David T. Severson <david_severson@hms.harvard.edu>

Maintainer: Benjamin Schuster-Boeckler <benjamin.schuster-boeckler@ludwig.ox.ac.uk>

library("SingleCellExperiment")
data("BEARscc_examples")

#For execution on local machine
BEAR_examples.sce <- estimate_noiseparameters(BEAR_examples.sce,
    alpha_resolution=0.25, write.noise.model=FALSE)
BEAR_examples.sce

#To save results as files for analysis on a
#high performance computational cluster
estimate_noiseparameters(BEAR_examples.sce, write.noise.model=TRUE,
    alpha_resolution=0.25, file="noise_estimation",
    model_view=c("Observed","Optimized"))

BEARscc documentation built on Nov. 8, 2020, 7:56 p.m.
