Compute the distance-to-median statistic for the CV2 residuals of all genes
A numeric vector of average counts for each gene.
A numeric vector of squared coefficients of variation for each gene.
An integer scalar specifying the window size for median-based smoothing. This should be odd; an even value will be incremented by 1.
This function will compute the distance-to-median (DM) statistic described by Kolodziejczyk et al. (2015).
Briefly, a median-based trend is fitted to the log-transformed cv2 against the log-transformed mean.
The DM is defined as the residual from the trend for each gene.
This statistic is a measure of the relative variability of each gene, after accounting for the empirical mean-variance relationship.
Highly variable genes can then be identified as those with high DM values.
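The procedure described above can be sketched in a few lines of base R. This is only an illustration, not the function's actual implementation: the helper name dm_sketch, the use of stats::runmed for the running median, the log10 scale, the default window of 51 and the handling of non-positive values are all assumptions made for the example.

```r
# Hypothetical helper illustrating the distance-to-median idea.
# Assumptions: log10 scale, stats::runmed() as the smoother, default window of 51.
dm_sketch <- function(avg, cv2, win.size = 51) {
    # runmed() needs an odd window, so bump even values by 1.
    if (win.size %% 2 == 0) win.size <- win.size + 1

    # Only genes with positive, finite mean and CV2 can be log-transformed.
    keep <- is.finite(avg) & avg > 0 & is.finite(cv2) & cv2 > 0
    log.avg <- log10(avg[keep])
    log.cv2 <- log10(cv2[keep])

    # Fit the running-median trend along increasing log-mean.
    o <- order(log.avg)
    trend <- as.numeric(stats::runmed(log.cv2[o], k = win.size, endrule = "median"))

    # The residual from the trend is the statistic; map it back to input order.
    res <- numeric(length(o))
    res[o] <- log.cv2[o] - trend
    dm <- rep(NA_real_, length(avg))
    dm[keep] <- res
    dm
}

# Usage on simulated counts (seeded so the run is repeatable).
set.seed(100)
ngenes <- 1000
ncells <- 100
gene.means <- 2^runif(ngenes, 0, 10)
dispersions <- 1/gene.means + 0.2
counts <- matrix(rnbinom(ngenes*ncells, mu=gene.means, size=1/dispersions), nrow=ngenes)

means <- rowMeans(counts)
cv2 <- apply(counts, 1, var)/means^2
dm <- dm_sketch(means, cv2)

# Rank genes by decreasing statistic to pick the most variable candidates.
top.genes <- head(order(dm, decreasing = TRUE), 10)
```

Sorting by the mean before smoothing is what makes the running median a trend in the mean-variance relationship; genes excluded from the fit are returned as NA rather than dropped, so the output stays aligned with the input vectors.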
A numeric vector of DM statistics for all genes.
Jong Kyoung Kim, with modifications by Aaron Lun
Kolodziejczyk AA, Kim JK, Tsang JCH et al. (2015). Single cell RNA-sequencing of pluripotent states unlocks modular transcriptional variation. Cell Stem Cell 17(4), 471–85.
# Mocking up some data
ngenes <- 1000
ncells <- 100
gene.means <- 2^runif(ngenes, 0, 10)
dispersions <- 1/gene.means + 0.2
counts <- matrix(rnbinom(ngenes*ncells, mu=gene.means, size=1/dispersions), nrow=ngenes)

# Computing the DM.
means <- rowMeans(counts)
cv2 <- apply(counts, 1, var)/means^2
dm.stat <- DM(means, cv2)
head(dm.stat)
Warning message:
In zoo(x = y, order.by = x) :
  some methods for "zoo" objects do not work if the index entries in 'order.by' are not unique
-0.05065798 -0.06376366 -0.02115419 0.03757197 0.11239291 -0.01195606