zinbwave: Perform dimensionality reduction using a ZINB regression...

zinbwaveR Documentation

Perform dimensionality reduction using a ZINB regression model with gene and cell-level covariates.

Description

Given an object with the data, it performs dimensionality reduction using a ZINB regression model with gene and cell-level covariates.

Usage

zinbwave(Y, ...)

## S4 method for signature 'SummarizedExperiment'
zinbwave(
  Y,
  X,
  V,
  K = 2,
  fitted_model,
  which_assay,
  which_genes,
  commondispersion = TRUE,
  zeroinflation = TRUE,
  verbose = FALSE,
  nb.repeat.initialize = 2,
  maxiter.optimize = 25,
  stop.epsilon.optimize = 1e-04,
  BPPARAM = BiocParallel::bpparam(),
  normalizedValues = FALSE,
  residuals = FALSE,
  imputedValues = FALSE,
  observationalWeights = FALSE,
  ...
)

Arguments

Y

The data (genes in rows, samples in columns). Currently implemented only for SummarizedExperiment.

...

Additional parameters to describe the model, see zinbModel.

X

The design matrix containing sample-level covariates, one sample per row. If missing, X will contain only an intercept. If Y is a SummarizedExperiment object, X can be a formula using the variables in the colData slot of Y.

V

The design matrix containing gene-level covariates, one gene per row. If missing, V will contain only an intercept. If Y is a SummarizedExperiment object, V can be a formula using the variables in the rowData slot of Y.

K

integer. Number of latent factors. Specify K = 0 if only computing observational weights.

fitted_model

a ZinbModel object.

which_assay

numeric or character. Which assay of Y to use. If missing, if 'assayNames(Y)' contains "counts" then that is used. Otherwise, the first assay is used.

which_genes

character. Which genes to use to estimate W (see details). Ignored if fitted_model is provided.

commondispersion

Whether or not a single dispersion for all features is estimated (default TRUE).

zeroinflation

Whether or not a ZINB model should be fitted. If FALSE, a negative binomial model is fitted instead.

verbose

Print helpful messages.

nb.repeat.initialize

Number of iterations for the initialization of beta_mu and gamma_mu.

maxiter.optimize

maximum number of iterations for the optimization step (default 25).

stop.epsilon.optimize

stopping criterion in the optimization step, when the relative gain in likelihood is below epsilon (default 0.0001).

BPPARAM

object of class bpparamClass that specifies the back-end to be used for computations. See bpparam for details.

normalizedValues

indicates wether or not you want to compute normalized values for the counts after adjusting for gene and cell-level covariates.

residuals

indicates wether or not you want to compute the residuals of the ZINB model. Deviance residuals are computed.

imputedValues

indicates wether or not you want to compute the imputed counts of the ZINB model.

observationalWeights

indicates whether to compute the observational weights for differential expression (see vignette).

Details

For visualization (heatmaps, ...), please use the normalized values. It corresponds to the deviance residuals when the W is not included in the model but the gene and cell-level covariates are. As a results, when W is not included in the model, the deviance residuals should capture the biology. Note that we do not recommend to use the normalized values for any downstream analysis (such as clustering, or differential expression), but only for visualization.

If one has already fitted a model using ZinbModel, the object containing such model can be used as input of zinbwave to save the resulting W into a SummarizedExperiment and optionally compute residuals and normalized values, without the need for re-fitting the model.

By default zinbwave uses all genes to estimate W. However, we recommend to use the top 1,000 most variable genes for this step. In general, a user can specify any custom set of genes to be used to estimate W, by specifying either a vector of gene names, or a single character string corresponding to a column of the rowData.

Note that if both which_genes is specified and at least one among observationalWeights, imputedValues, residuals, and normalizedValues is TRUE, the model needs to be fit twice.

Value

An object of class SingleCellExperiment; the dimensionality reduced matrix is stored in the reducedDims slot and optionally normalized values and residuals are added in the list of assays.

Methods (by class)

  • zinbwave(SummarizedExperiment): Y is a SummarizedExperiment.

Examples

se <- SingleCellExperiment(assays = list(counts = matrix(rpois(60, lambda=5),
                                                         nrow=10, ncol=6)),
                           colData = data.frame(bio = gl(2, 3)))

m <- zinbwave(se, X="~bio", BPPARAM=BiocParallel::SerialParam())

drisso/zinbwave documentation built on March 18, 2024, 5:13 p.m.