View source: R/residual_transform.R
residual_transform  R Documentation 
Fit an intercept GammaPoisson model that corrects for sequencing depth and return the residuals as variance stabilized results for further downstream application, for which no proper countbased method exist or is performant enough (e.g., clustering, dimensionality reduction).
residual_transform(
data,
residual_type = c("randomized_quantile", "pearson", "analytic_pearson"),
clipping = FALSE,
overdispersion = 0.05,
size_factors = TRUE,
offset_model = TRUE,
overdispersion_shrinkage = TRUE,
ridge_penalty = 2,
on_disk = NULL,
return_fit = FALSE,
verbose = FALSE,
...
)
data 
any matrixlike object (e.g. matrix, dgCMatrix, DelayedArray, HDF5Matrix)
with one column per sample and row per gene. It can also be an object of type 
residual_type 
a string that specifies what kind of residual is returned as variance stabilizedvalue.
The two above options are the most common choices, however you can use any 
clipping 
a single boolean or numeric value specifying that all residuals are in the range

overdispersion 
the simplest count model is the Poisson model. However, the Poisson model
assumes that
Note that 
offset_model 
boolean to decide if 
overdispersion_shrinkage, size_factors 
arguments that are passed to the underlying
call to 
ridge_penalty 
another argument that is passed to 
on_disk 
a boolean that indicates if the dataset is loaded into memory or if it is kept on disk
to reduce the memory usage. Processing in memory can be significantly faster than on disk.
Default: 
return_fit 
boolean to decide if the matrix of residuals is returned directly ( 
verbose 
boolean that decides if information about the individual steps are printed.
Default: 
... 
additional parameters passed to 
Internally, this method uses the glmGamPoi
package. The function goes through the following steps
fit model using glmGamPoi::glm_gp()
plug in the trended overdispersion estimates
call glmGamPoi::residuals.glmGamPoi()
to calculate the residuals.
a matrix (or a vector if the input is a vector) with the transformed values. If return_fit = TRUE
,
a list is returned with two elements: fit
and Residuals
.
AhlmannEltze, Constantin, and Wolfgang Huber. "glmGamPoi: Fitting GammaPoisson Generalized Linear Models on Single Cell Count Data." Bioinformatics (2020)
Dunn, Peter K., and Gordon K. Smyth. "Randomized quantile residuals." Journal of Computational and Graphical Statistics 5.3 (1996): 236244.
Hafemeister, Christoph, and Rahul Satija. "Normalization and variance stabilization of singlecell RNAseq data using regularized negative binomial regression." Genome biology 20.1 (2019): 115.
Hafemeister, Christoph, and Rahul Satija. "Analyzing scRNAseq data with the sctransform and offset models." (2020)
Lause, Jan, Philipp Berens, and Dmitry Kobak. "Analytic Pearson residuals for normalization of singlecell RNAseq UMI data." Genome Biology (2021).
glmGamPoi::glm_gp()
, glmGamPoi::residuals.glmGamPoi()
, sctransform::vst()
,
statmod::qresiduals()
# Load a single cell dataset
sce < TENxPBMCData::TENxPBMCData("pbmc4k")
# Reduce size for this example
set.seed(1)
sce_red < sce[sample(which(rowSums2(counts(sce)) > 0), 1000),
sample(ncol(sce), 200)]
counts(sce_red) < as.matrix(counts(sce_red))
# Residual Based Variance Stabilizing Transformation
rq < residual_transform(sce_red, residual_type = "randomized_quantile",
verbose = TRUE)
pearson < residual_transform(sce_red, residual_type = "pearson", verbose = TRUE)
# Plot first two principal components
pearson_pca < prcomp(t(pearson), rank. = 2)
rq_pca < prcomp(t(rq), rank. = 2)
plot(rq_pca$x, asp = 1)
points(pearson_pca$x, col = "red")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.