transformGamPoi

R package that accompanies our paper 'Comparison of transformations for single-cell RNA-seq data' (https://www.nature.com/articles/s41592-023-01814-1).

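transformGamPoi provides methods to stabilize the variance of single-cell count data: the acosh transformation, the shifted logarithm, and residual-based transformations (Pearson and randomized quantile residuals).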
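To give a rough sense of what variance stabilization means, the sketch below simulates Gamma-Poisson (negative binomial) counts in base R and applies the closed-form acosh transformation that follows from the delta method for a variance of `mu + alpha * mu^2`, together with a shifted logarithm that behaves similarly for large counts. The helper names `acosh_sketch()` and `shifted_log_sketch()`, the overdispersion `alpha = 0.1`, and the simulated means are assumptions made for illustration. This is not the package's implementation, which also handles size factors and further settings.

```r
set.seed(1)

# Hypothetical helpers for illustration; not the package's functions.
# Delta-method transformation for Var(Y) = mu + alpha * mu^2.
acosh_sketch <- function(y, alpha) {
  acosh(2 * alpha * y + 1) / sqrt(alpha)
}
# Shifted logarithm with pseudo-count 1 / (4 * alpha), scaled the same way.
shifted_log_sketch <- function(y, alpha) {
  log1p(4 * alpha * y) / sqrt(alpha)
}

alpha <- 0.1          # assumed overdispersion
means <- c(20, 200)   # two simulated genes with very different means
n_cells <- 20000

# One column of simulated counts per mean (size = 1 / alpha in rnbinom).
counts_sim <- sapply(means, function(mu) {
  rnbinom(n_cells, mu = mu, size = 1 / alpha)
})

raw_var   <- apply(counts_sim, 2, var)
acosh_var <- apply(acosh_sketch(counts_sim, alpha), 2, var)
log_var   <- apply(shifted_log_sketch(counts_sim, alpha), 2, var)

# Raw variances differ by orders of magnitude between the two genes,
# whereas the transformed variances land on a similar scale.
print(rbind(raw = raw_var, acosh = acosh_var, shifted_log = log_var))
```

The scaling by `1 / sqrt(alpha)` is what makes the two transformations comparable: both are designed so that the variance after transformation is roughly independent of the mean.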

Installation

You can install the current development version of transformGamPoi by typing the following into the R console:

# install.packages("devtools")
devtools::install_github("const-ae/transformGamPoi")

The installation should take only a few seconds and work on all major operating systems (macOS, Linux, Windows).

Example

Let's compare the different variance-stabilizing transformations.

We start by loading the transformGamPoi package and setting a seed to make sure the results are reproducible.

library(transformGamPoi)
set.seed(1)

We then load some example data, which we subset to 1000 genes and 500 cells:

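# Attaching SingleCellExperiment makes counts() and assay() available;
# MatrixGenerics provides rowSums2(), rowMeans2(), and rowVars()
library(SingleCellExperiment)
library(MatrixGenerics)
sce <- TENxPBMCData::TENxPBMCData("pbmc4k")
sce_red <- sce[sample(which(rowSums2(counts(sce)) > 0), 1000),
               sample(ncol(sce), 500)]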
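If downloading the PBMC data is not convenient, the same subsetting pattern can be tried on simulated counts. The following base R sketch is only a stand-in: the matrix dimensions, the range of gene means, and the `size` parameter are arbitrary assumptions, and it produces a plain matrix rather than a SingleCellExperiment object.

```r
set.seed(1)

n_genes <- 3000
n_cells <- 800
# Assumed gene means spread over several orders of magnitude.
gene_means <- 2^runif(n_genes, min = -8, max = 4)

# Gamma-Poisson counts; gene_means is recycled down the rows
# because the matrix is filled column by column.
counts_sim <- matrix(rnbinom(n_genes * n_cells, mu = gene_means, size = 10),
                     nrow = n_genes, ncol = n_cells)

# Keep only genes with at least one count, then sample 1000 genes and 500 cells.
expressed <- which(rowSums(counts_sim) > 0)
counts_red <- counts_sim[sample(expressed, 1000), sample(n_cells, 500)]
dim(counts_red)
```

Filtering before sampling ensures that every sampled gene has at least one count in the full matrix, although a gene can still be all zero within the sampled cells.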

We calculate the different variance-stabilizing transformations. We can either use the generic transformGamPoi() method and specify the transformation, or we use the low-level functions acosh_transform(), shifted_log_transform(), and residual_transform() which provide more settings. All functions return a matrix, which we can for example insert back into the SingleCellExperiment object:

assay(sce_red, "acosh") <- transformGamPoi(sce_red, transformation = "acosh")
assay(sce_red, "shifted_log") <- shifted_log_transform(sce_red, overdispersion = 0.1)
# For large datasets, the processing can also run without
# loading the full dataset into memory by setting on_disk = TRUE
assay(sce_red, "rand_quant") <- residual_transform(sce_red, "randomized_quantile", on_disk = FALSE)
assay(sce_red, "pearson") <- residual_transform(sce_red, "pearson", clipping = TRUE, on_disk = FALSE)
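To make the residual-based idea concrete, here is a simplified base R sketch of clipped Pearson residuals. It makes several assumptions that are not taken from the package: the expected counts come from a simple row-sum-times-column-sum estimate rather than a fitted Gamma-Poisson model, the overdispersion is fixed at 0.05, the clipping bound is set to the square root of the number of cells, and the name `pearson_sketch()` is hypothetical.

```r
set.seed(1)

# Hypothetical helper: Pearson residuals under Var(Y) = mu + alpha * mu^2.
pearson_sketch <- function(counts, alpha = 0.05, clip = sqrt(ncol(counts))) {
  # Expected count per gene and cell from the marginal totals (an assumption).
  mu <- outer(rowSums(counts), colSums(counts)) / sum(counts)
  res <- (counts - mu) / sqrt(mu + alpha * mu^2)
  # Clip extreme residuals to the range [-clip, clip].
  pmin(pmax(res, -clip), clip)
}

# Small simulated count matrix: 200 genes x 100 cells.
gene_means <- 2^runif(200, min = -2, max = 5)
counts_sim <- matrix(rnbinom(200 * 100, mu = gene_means, size = 20), nrow = 200)

res <- pearson_sketch(counts_sim)
dim(res)
range(res)
```

Clipping bounds the influence of a few cells with unusually high counts for a gene, at the cost of flattening genuinely extreme values.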

Finally, we compare the variance of the genes after transformation using a scatter plot:

par(pch = 20, cex = 1.15)
mus <- rowMeans2(counts(sce_red))
plot(mus, rowVars(assay(sce_red, "acosh")), log = "x", col = "#1b9e77aa", cex = 0.6,
     xlab = "Gene mean (log scale)", ylab = "Variance after transformation")
points(mus, rowVars(assay(sce_red, "shifted_log")), col = "#d95f02aa", cex = 0.6)
points(mus, rowVars(assay(sce_red, "pearson")), col = "#7570b3aa", cex = 0.6)
points(mus, rowVars(assay(sce_red, "rand_quant")), col = "#e7298aaa", cex = 0.6)
legend("topleft", legend = c("acosh", "shifted log", "Pearson Resid.", "Rand. Quantile Resid."),
       col = c("#1b9e77", "#d95f02", "#7570b3", "#e7298a"), pch = 16)

See also

There are a number of preprocessing methods and packages out there. Of particular interest are

Session Info

sessionInfo()


const-ae/transformGamPoi documentation built on April 14, 2023, 11:33 p.m.