Combine the results of multiple variance decompositions, usually generated for the same genes across separate batches of cells.
Two or more DataFrames of variance modelling results.
Alternatively, one or more lists of DataFrames containing variance modelling results. Mixed inputs are also acceptable, e.g., lists of DataFrames alongside the DataFrames themselves.
String specifying how p-values are to be combined, see combinePValues for options.
A string specifying the column name in each input DataFrame that contains the p-value.
A character vector specifying the fields containing other statistics to combine.
Logical scalar indicating whether each result is to be given equal weight in the combined statistics.
Numeric vector containing the number of cells used to generate each input DataFrame. Only relevant if equiweight=FALSE.
These functions are designed to merge results from separate calls to modelGeneVar, modelGeneCV2 or related functions, where each result is usually computed for a different batch of cells.
Separate variance decompositions are necessary in cases where the mean-variance relationships vary across batches (e.g., different concentrations of spike-in have been added to the cells in each batch), which precludes the use of a common trend fit.
By combining these results into a single set of statistics, we can apply standard strategies for feature selection in multi-batch integrated analyses.
By default, other.fields contains all common non-numeric fields other than the p-value field and "FDR".
This usually includes statistics such as "bio" (for combineVar) or "ratio" (for combineCV2).
For combineVar, statistics are combined by averaging them across all input DataFrames.
For combineCV2, statistics are combined by taking the geometric mean across all inputs.
This difference between functions reflects the method by which the relevant measure of overdispersion is computed.
For example, "bio" is computed by subtraction, so taking the average bio remains consistent with subtraction of the total and technical averages.
Similarly, "ratio" is computed by division, so the combined ratio is consistent with division of the geometric means of the total and trend values.
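The consistency argument can be checked numerically. The following is a minimal base R sketch using made-up per-batch values for a single gene; the numbers, the variable names and the geo_mean helper are illustrative assumptions, not output from or part of modelGeneVar or modelGeneCV2.

```r
# Hypothetical statistics for one gene in three batches (made-up numbers).
total <- c(2.0, 3.5, 1.5)   # total variance per batch
tech  <- c(1.0, 1.5, 0.5)   # technical component per batch
bio   <- total - tech       # "bio" is computed by subtraction

# The arithmetic mean commutes with subtraction.
stopifnot(isTRUE(all.equal(mean(bio), mean(total) - mean(tech))))

# Geometric mean helper (hypothetical name, not part of any package).
geo_mean <- function(x) exp(mean(log(x)))

cv2   <- c(4.0, 9.0, 1.0)   # total CV2 per batch
trend <- c(2.0, 3.0, 0.5)   # fitted trend per batch
ratio <- cv2 / trend        # "ratio" is computed by division

# The geometric mean commutes with division.
stopifnot(isTRUE(all.equal(geo_mean(ratio), geo_mean(cv2) / geo_mean(trend))))
```

Swapping the two means breaks the identity for the ratio: the arithmetic mean of the ratios is generally not the ratio of the arithmetic means.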
If equiweight=FALSE, each per-batch statistic is weighted by the number of cells used to compute it.
The number of cells can be explicitly set using ncells, and is otherwise assumed to be equal for all batches.
No weighting is performed by default, which ensures that all batches contribute equally to the combined statistics and avoids situations where batches with many cells dominate the output.
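To illustrate the difference between the two weighting schemes, here is a small base R sketch with made-up values for one gene in two batches. It assumes the weights are simply the cell counts, normalized to sum to one as weighted.mean does; the actual internals may differ in detail.

```r
bio    <- c(1.0, 3.0)    # hypothetical per-batch "bio" for one gene
ncells <- c(100, 900)    # hypothetical number of cells in each batch

# Equal weighting: each batch contributes equally.
equal <- mean(bio)

# Weighting by cell number: the larger batch dominates the result.
weighted <- weighted.mean(bio, w = ncells)

# The analogous weighted geometric mean for ratio-like statistics.
ratio <- c(2.0, 8.0)
weighted_geo <- exp(weighted.mean(log(ratio), w = ncells))
```

With these numbers the weighted average sits much closer to the value from the 900-cell batch, which is the behaviour that equal weighting is meant to avoid.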
The combinePValues function is used to combine p-values across batches.
Only method="z" will perform any weighting of batches, and only if weights is set.
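For intuition about what combining p-values involves, here is a base R sketch of two of the methods used in the examples, "simes" and "berger", applied to a single gene. The helpers simes_p and berger_p are hypothetical names implementing the textbook definitions; they are not the combinePValues implementation, which handles many genes at once along with weights and edge cases.

```r
# Hypothetical helpers; textbook definitions, not the package implementation.
simes_p <- function(p) {
    n <- length(p)
    min(sort(p) * n / seq_len(n))    # smallest rank-adjusted p-value
}
berger_p <- function(p) max(p)       # intersection-union test: largest p-value

p <- c(0.01, 0.20)   # made-up p-values for one gene in two batches

simes_p(p)    # small if the gene is significant in at least one batch
berger_p(p)   # small only if the gene is significant in every batch
```

The choice between them depends on whether a gene should be retained when it is variable in any batch or only when it is variable in all of them.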
A DataFrame with the same numeric fields as that produced by modelGeneVar or modelGeneCV2.
Each row corresponds to an input gene.
Each field contains the (weighted) arithmetic/geometric mean across all batches, except for p.value, which contains the combined p-value based on the chosen method, and FDR, which contains the adjusted p-value using the BH method.
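The relationship between the p.value and FDR fields can be sketched in base R with p.adjust, which implements the BH method. The combined p-values and gene names below are made up for illustration.

```r
# Made-up combined p-values for five genes (placeholder names).
p.value <- c(GeneA = 0.001, GeneB = 0.01, GeneC = 0.02, GeneD = 0.5, GeneE = 0.9)

# Benjamini-Hochberg adjustment across genes.
FDR <- p.adjust(p.value, method = "BH")

# Genes that would be retained at a 5% FDR threshold.
selected <- names(FDR)[FDR <= 0.05]
```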
combinePValues, for details on how the p-values are combined.
library(scuttle)
library(scran) # provides modelGeneVar() and combineVar().

sce <- mockSCE()

# First batch: normalize separately after subsetting.
y1 <- sce[,1:100]
y1 <- logNormCounts(y1)
results1 <- modelGeneVar(y1)

# Second batch: normalize separately after subsetting.
y2 <- sce[,1:100 + 100]
y2 <- logNormCounts(y2)
results2 <- modelGeneVar(y2)

# Combine with the default p-value method, then with alternatives.
head(combineVar(results1, results2))
head(combineVar(results1, results2, method="simes"))
head(combineVar(results1, results2, method="berger"))