pbWeights: Compute precision weights for pseudobulk

View source: R/initialPrecisionWeights.R

pbWeights R Documentation

Compute precision weights for pseudobulk

Description

Compute precision weights for pseudobulk data. The delta method is used to approximate the variance of the log2 counts per million, accounting for variation in the number of cells and for gene expression variance across cells within each sample. The method argument selects between the delta method and weighting by the number of cells. Note that processAssays() uses the number of cells as weights when no weights are specified.
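
As a rough illustration of the delta-method idea, the sketch below approximates the variance of the log2 of a pseudobulk count formed by summing independent cell-level counts. The helper delta_var_log2_sum() and its arguments are hypothetical and written only for this illustration; this is not dreamlet's internal code, and it ignores library-size scaling, the prior count, and shrinkage.

```r
# Hypothetical helper for illustration only; not part of dreamlet.
# A pseudobulk count S is the sum of n_cells independent cell-level counts,
# each with mean mu and variance sigma2, so E[S] = n * mu and Var(S) = n * sigma2.
# Delta method: Var(log2(S)) is approximately Var(S) / (E[S] * log(2))^2.
delta_var_log2_sum <- function(n_cells, mu, sigma2) {
  (n_cells * sigma2) / (n_cells * mu * log(2))^2
}

n <- c(10, 100, 1000)
v <- delta_var_log2_sum(n, mu = 2, sigma2 = 6)

# Precision is the reciprocal of the approximate variance
w <- 1 / v

# Relative precision across samples with different cell numbers
w / w[1]
```

Under these simplifying assumptions the precision grows in proportion to the number of cells, which is why cell counts are a natural first approximation to the weights, while the cell-level variance term lets the delta method distinguish genes and samples with the same cell count.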

Usage

pbWeights(
  sce,
  sample_id,
  cluster_id,
  geneList = NULL,
  method = c("delta", "ncells"),
  shrink = TRUE,
  prior.count = 0.5,
  maxRatio = 20,
  h5adBlockSizes = 1e+09,
  details = FALSE,
  verbose = TRUE
)

Arguments

sce

SingleCellExperiment where counts(sce) stores the raw count data at the single-cell level

sample_id

character string specifying which variable to use as sample id

cluster_id

character string specifying which variable to use as cluster id

geneList

list of genes to be included for each cell type

method

select the method used to compute precision weights. 'delta' uses the delta method based on a normal approximation to a negative binomial model; it is slower but can increase power. 'ncells' uses the number of cells; it is faster, and subsequent arguments are ignored. Included for testing

shrink

Defaults to TRUE. Use empirical Bayes variance shrinkage from limma to shrink estimates of expression variance across cells within each sample
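
To give a sense of what variance shrinkage does, here is a minimal base-R sketch of the posterior variance formula used in limma-style empirical Bayes moderation. The function shrink_var() is hypothetical, and the prior variance and prior degrees of freedom are supplied by hand here, whereas limma estimates them from the data. It is not a substitute for the shrinkage pbWeights() performs.

```r
# Hypothetical helper for illustration only; not part of dreamlet or limma.
# Posterior variance is a degrees-of-freedom weighted average of the
# observed variance s2 and a prior variance s2_prior.
shrink_var <- function(s2, df, s2_prior, df_prior) {
  (df_prior * s2_prior + df * s2) / (df_prior + df)
}

# Observed variances for three genes, each estimated with 4 degrees of freedom
s2 <- c(0.2, 1.0, 5.0)

# Prior values chosen by hand for the example (an assumption)
s2_post <- shrink_var(s2, df = 4, s2_prior = 1, df_prior = 4)
s2_post
```

Extreme variance estimates are pulled toward the prior value, which stabilizes weights when a sample contributes few cells.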

prior.count

Defaults to 0.5. Count added to each observation at the pseudobulk level. This is scaled by the number of cells before being added at the cell level
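
One plausible reading of this scaling, shown as a sketch: dividing the prior count by the number of cells before adding it to each cell means the cell-level additions sum to prior.count at the pseudobulk level. The variable names below are hypothetical, and the exact scaling used inside pbWeights() is an assumption here.

```r
# Illustration only; the exact scaling inside pbWeights() is assumed, not confirmed.
prior.count <- 0.5

# Raw counts for one gene across the cells of one sample and cluster
cell_counts <- c(0, 3, 1, 0, 2)
n_cells <- length(cell_counts)

# Add an equal share of the prior count to every cell
adj_counts <- cell_counts + prior.count / n_cells

# Total amount added after summing cells into a pseudobulk value
added_total <- sum(adj_counts) - sum(cell_counts)
added_total
```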

maxRatio

When computing precision as the reciprocal of variance, 1/(x+tau), select tau so that the ratio between the largest and smallest precision does not exceed maxRatio
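
The following sketch shows one way such a tau could be chosen for a vector of positive variances x. The helper cap_precision_ratio() is hypothetical and solves (max(x) + tau) / (min(x) + tau) = maxRatio for tau when the unadjusted ratio is too large; dreamlet's actual procedure may differ.

```r
# Hypothetical helper for illustration only; not part of dreamlet.
cap_precision_ratio <- function(x, maxRatio = 20) {
  stopifnot(all(x > 0), maxRatio > 1)
  # Largest precision is 1 / (min(x) + tau), smallest is 1 / (max(x) + tau).
  # Setting their ratio equal to maxRatio and solving for tau gives the
  # expression below; tau is 0 when the ratio is already within bounds.
  tau <- max(0, (max(x) - maxRatio * min(x)) / (maxRatio - 1))
  list(tau = tau, precision = 1 / (x + tau))
}

# Variances spanning three orders of magnitude
v <- c(0.01, 0.5, 2, 10)
res <- cap_precision_ratio(v, maxRatio = 20)

# Ratio between the largest and smallest precision after adding tau
ratio <- max(res$precision) / min(res$precision)
ratio
```

Adding a common offset tau leaves the ordering of the weights unchanged while preventing a few very low-variance observations from dominating the fit.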

h5adBlockSizes

set the automatic block size (in bytes) for DelayedArray to read an H5AD file. Larger values use more memory but are faster.

details

include data.frame of cell-level statistics as attr(., "details")

verbose

Show messages, defaults to TRUE

Examples

library(muscat)
library(dreamlet)

data(example_sce)

# create pseudobulk for each sample and cell cluster
pb <- aggregateToPseudoBulk(example_sce,
  assay = "counts",
  sample_id = "sample_id",
  cluster_id = "cluster_id",
  verbose = FALSE
)

# Get expressed genes for each cell type
geneList <- getExprGeneNames(pb)

# Create precision weights for pseudobulk
# Note that processAssays() uses cell counts as weights
# when no weights are specified
weightsList <- pbWeights(example_sce,
  sample_id = "sample_id",
  cluster_id = "cluster_id",
  geneList = geneList
)

# voom-style normalization using initial weights
res.proc <- processAssays(pb, ~group_id, weightsList = weightsList)

GabrielHoffman/dreamlet documentation built on May 20, 2024, 2:05 p.m.