Remove debris-contaminated droplets from single-cell based data.
Currently, we only support installation of the diem R package through devtools:
library(devtools)
devtools::install_github("marcalva/diem")
Check out the vignette for a thorough tutorial on using diem.
Shown below is a quick workflow for reading 10X data, filtering droplets using default parameters, and converting to a Seurat object. Note that Seurat must be installed to run the last step.
library(diem)
library(Seurat)
counts <- read_10x("path/to/10x") # Read 10X data into sparse matrix
sce <- create_SCE(counts) # Create SCE object from counts
# Add MT% and MALAT1%
mt_genes <- grep(pattern="^mt-", x=rownames(sce@gene_data), ignore.case=TRUE, value=TRUE)
sce <- get_gene_pct(x = sce, genes=mt_genes, name="pct.mt")
malat <- grep(pattern="^malat1$", x=rownames(sce@gene_data), ignore.case=TRUE, value=TRUE)
sce <- get_gene_pct(x = sce, genes=malat, name="MALAT1")
# Plot total counts ranked
barcode_rank_plot(sce)
# DIEM steps
sce <- set_debris_test_set(sce)
sce <- filter_genes(sce)
sce <- get_pcs(sce)
sce <- init(sce)
sce <- run_em(sce)
sce <- assign_clusters(sce)
sce <- estimate_dbr_score(sce)
# Evaluate debris scores
sm <- summarize_clusters(sce)
plot_clust(sce, feat_x = "n_genes", feat_y = "score.debris",
           log_x = TRUE, log_y = FALSE)
plot_clust(sce, feat_x = "pct.mt", feat_y = "score.debris",
           log_x = TRUE, log_y = FALSE)
# Call targets using debris score for single-nucleus data
sce <- call_targets(sce, thresh_score = 0.5)
# Alternatively, for single-cell data, call targets by removing droplets in debris cluster(s)
sce <- call_targets(sce, clusters = "debris", thresh = NULL)
seur <- convert_to_seurat(sce)
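The workflow above adds per-droplet percentages for mitochondrial genes and MALAT1 before filtering. As a rough illustration of what such a percentage represents, the sketch below computes it by hand on a small made-up count matrix in base R. It assumes the value is 100 times the counts from the selected genes divided by the droplet's total counts. The helper `pct_of_counts` and the toy matrix are hypothetical and are not part of diem, so treat this as an aid to reading the `pct.mt` column rather than a description of how `get_gene_pct` is implemented.

```r
# Toy count matrix: rows are genes, columns are droplets (values are made up).
counts <- matrix(
  c(5, 0,  2,
    3, 1,  0,
    2, 9,  8,
    0, 0, 10),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("mt-Nd1", "mt-Co1", "Malat1", "Actb"),
                  c("drop1", "drop2", "drop3"))
)

# Select mitochondrial genes with the same case-insensitive "^mt-" pattern
# used in the workflow.
mt_genes <- grep(pattern = "^mt-", x = rownames(counts),
                 ignore.case = TRUE, value = TRUE)

# Hypothetical helper (not a diem function): percent of each droplet's
# total counts that come from the given genes.
pct_of_counts <- function(m, genes) {
  100 * colSums(m[genes, , drop = FALSE]) / colSums(m)
}

pct_mt <- pct_of_counts(counts, mt_genes)
print(pct_mt)
```

In this toy matrix the first droplet draws most of its counts from the two mitochondrial genes, while the other two droplets draw only a small share from them.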
December 19, 2022
* version 2.4.1
* Fixed an emergent bug that dropped barcode IDs when calling get_pcs and normalizing counts.
* Fixed a bug in vec_cmplmnt that could have caused invalid indexing. Added further checks in index values in MultEM.cpp.
* Fixed possible bugs in test functions introduced in v2.4.0

February 13, 2022
* version 2.4.0
* Added a prior to reduce overfitting

April 13, 2020
* version 2.3.0
* Quantifies amount of contamination in droplets. Filtering is performed using this debris score.
* Clustering switched to multinomial mixture model to increase speed.

February 25, 2020
* version 2.2.0
* Initialize alpha with method of moments instead of optimize

February 19, 2020
* version 2.1.0
* Additional function for extracting Alpha parameters for use with DE
* Run multiple k_init values at the same time
* Multi-threading
* More efficient memory storage of objects

February 18, 2020
* version 2.0.1
* Patch fixes some installation issues with tests and docs

February 2, 2020
* version 2.0.0
* Uses Dirichlet-multinomial
* Initializes with k-means
* Removes background centers using a likelihood strategy