all_times <- list() # store the time for each chunk knitr::knit_hooks$set(time_it = local({ now <- NULL function(before, options) { if (before) { now <<- Sys.time() } else { res <- difftime(Sys.time(), now, units = "secs") all_times[[options$label]] <<- res } } })) knitr::opts_chunk$set( tidy = TRUE, tidy.opts = list(width.cutoff = 95), message = FALSE, warning = FALSE, time_it = TRUE )
One of the first steps in all scRNA-seq analyses is performing a number of QC checks and plots so that data can be appropriately filtered. scCustomize contains a number of functions that can be used to quickly and easily generate some of the most relevant QC plots.
For details on functions for adding QC metrics and details about them please see Object QC Vignette.
For this tutorial, I will be utilizing HCA bone marrow cell data from the SeuratData package.
library(ggplot2) library(dplyr) library(magrittr) library(patchwork) library(Seurat) library(scCustomize) library(qs) # Load Example Dataset hca_bm <- hcabm40k.SeuratData::hcabm40k # Add pseudo group variable just for this vignette hca_bm@meta.data$group[hca_bm@meta.data$orig.ident == "MantonBM1" | hca_bm@meta.data$orig.ident == "MantonBM2" | hca_bm@meta.data$orig.ident == "MantonBM3" | hca_bm@meta.data$orig.ident == "MantonBM4"] <- "Group 1" hca_bm@meta.data$group[hca_bm@meta.data$orig.ident == "MantonBM5" | hca_bm@meta.data$orig.ident == "MantonBM6" | hca_bm@meta.data$orig.ident == "MantonBM7" | hca_bm@meta.data$orig.ident == "MantonBM8"] <- "Group 2"
hca_bm <- UpdateSeuratObject(hca_bm)
hca_bm <- Add_Cell_QC_Metrics(object = hca_bm, species = "human")
scCustomize has a number of quick QC plotting options for ease of use.
NOTE: Most scCustomize plotting functions contain ...
parameter to allow user to supply any of the parameters for the original Seurat function that is being used under the hood.
VlnPlot
-Based QC PlotsscCustomize contains a number of shortcut/wrapper functions for QC plotting, which are wrappers around Seurat::VlnPlot()
/VlnPlot_scCustom
.
QC_Plots_Genes()
Plots genes/features per cell/nucleus.QC_Plots_UMIs()
Plots UMIs per cell/nucleus. QC_Plots_Mito()
Plots mito% (named "percent_mito") per cell/nucleus.QC_Plots_Complexity()
Plots cell complexity metric (log10GenesPerUMI) per cell/nucleus.QC_Plots_Feature()
Plots "feature" per cell/nucleus. Using parameter feature
to allow plotting of any applicable named feature in object\@meta.data slot. QC_Plots_Combined_Vln()
Returns patchwork plot of QC_Plots_Genes()
, QC_Plots_UMIs()
, & QC_Plots_Mito()
.scCustomize functions have the added benefit of:
QC_Plots_Feature
).# All functions contain p1 <- QC_Plots_Genes(seurat_object = hca_bm, low_cutoff = 600, high_cutoff = 5500) p2 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000) p3 <- QC_Plots_Mito(seurat_object = hca_bm, high_cutoff = 20) p4 <- QC_Plots_Complexity(seurat_object = hca_bm, high_cutoff = 0.8)
# All functions contain p1 <- QC_Plots_Genes(seurat_object = hca_bm, low_cutoff = 600, high_cutoff = 5500, pt.size = 0.1) p2 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0.1) p3 <- QC_Plots_Mito(seurat_object = hca_bm, high_cutoff = 20, pt.size = 0.1) p4 <- QC_Plots_Complexity(seurat_object = hca_bm, high_cutoff = 0.8, pt.size = 0.1)
wrap_plots(p1, p2, p3, p4, ncol = 4)
In addition to being able to supply Seurat parameters with ...
these plots like many others in scCustomize contain other additional parameters to customize plot output without need for post-plot ggplot2 modifications
plot_title
: Change plot titlex_axis_label
/y_axis_label
: Change axis labels.x_lab_rotate
: Should x-axis label be rotated 45 degrees?y_axis_log
: Should y-axis in linear or log10 scale. plot_median
& median_size
: Plot a line at the median of each x-axis identity. plot_boxplot
: Plot boxplot on top of the violin.p1 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0.1) p2 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0.1, y_axis_log = TRUE) wrap_plots(p1, p2, ncol = 2)
p1 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0, plot_median = TRUE) p2 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0, y_axis_log = TRUE, plot_median = TRUE) wrap_plots(p1, p2, ncol = 2)
p1 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0, plot_boxplot = TRUE) p2 <- QC_Plots_UMIs(seurat_object = hca_bm, low_cutoff = 1200, high_cutoff = 45000, pt.size = 0, y_axis_log = TRUE, plot_boxplot = TRUE) wrap_plots(p1, p2, ncol = 2)
As a shortcut you can return single patchwork plot of the 3 main QC Plots (Genes, UMIs, %Mito) by using single function, QC_Plots_Combined_Vln()
.
QC_Plots_Combined_Vln(seurat_object = hca_bm, feature_cutoffs = c(600, 5500), UMI_cutoffs = c(1200, 45000), mito_cutoffs = 20, pt.size = 0.1)
FeatureScatter
-Based QC PlotsscCustomize contains 3 functions which wrap Seurat::FeatureScatter()
with added visualization of potential cutoff thresholds and some additional functionality:
QC_Plot_UMIvsGene()
Plots genes vs UMIs per cell/nucleusQC_Plot_GenevsFeature()
Plots Genes vs. "feature" per cell/nucleus. Using parameter feature1
to allow plotting of any applicable named feature in object\@meta.data slot. QC_Plot_UMIvsFeature()
Plots UMIs vs. "feature" per cell/nucleus. Using parameter feature1
to allow plotting of any applicable named feature in object\@meta.data slot. shuffle = TRUE
by default to prevent hiding of datasetsQC_Plot_UMIvsGene
(based on values provided to high and low cutoff parameters)# All functions contain QC_Plot_UMIvsGene(seurat_object = hca_bm, low_cutoff_gene = 600, high_cutoff_gene = 5500, low_cutoff_UMI = 500, high_cutoff_UMI = 50000) QC_Plot_GenevsFeature(seurat_object = hca_bm, feature1 = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_feature = 20)
# All functions contain p1 <- QC_Plot_UMIvsGene(seurat_object = hca_bm, low_cutoff_gene = 600, high_cutoff_gene = 5500, low_cutoff_UMI = 1200, high_cutoff_UMI = 45000, x_axis_label = "UMIs per Cell") p2 <- QC_Plot_GenevsFeature(seurat_object = hca_bm, feature1 = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_feature = 20) wrap_plots(p1, p2, ncol = 2)
QC_Plot_UMIvsGene
contains the ability to color points by continuous meta data variables.
This can be used to plot % of mito reads in addition to UMI vs. Gene comparisons
QC_Plot_UMIvsGene(seurat_object = hca_bm, meta_gradient_name = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_UMI = 45000) QC_Plot_UMIvsGene(seurat_object = hca_bm, meta_gradient_name = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_UMI = 45000, meta_gradient_low_cutoff = 20)
p1 <- QC_Plot_UMIvsGene(seurat_object = hca_bm, meta_gradient_name = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_UMI = 45000) p2 <- QC_Plot_UMIvsGene(seurat_object = hca_bm, meta_gradient_name = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_UMI = 45000, meta_gradient_low_cutoff = 20) wrap_plots(p1, p2, ncol = 2)
If you are interested in viewing QC_Plot_UMIvsGene
both by discrete grouping variable and by continuous variable without writing function twice you can use combination = TRUE
and plot output will contain both plots.
QC_Plot_UMIvsGene(seurat_object = hca_bm, meta_gradient_name = "percent_mito", low_cutoff_gene = 600, high_cutoff_gene = 5500, high_cutoff_UMI = 45000, meta_gradient_low_cutoff = 20, combination = TRUE)
Finally, scCustomize contains a function to plot QC metrics as histogram instead of violin for visualization of the distribution of feature.
QC_Histogram(seurat_object = hca_bm, features = "percent_mito", low_cutoff = 15)
QC_Histogram(seurat_object = hca_bm, features = "percent_mito", low_cutoff = 15, split.by = "group")
scCustomize also contains a few helpful functions for returning and plotting the median values for these metrics on per sample/library basis.
scCustomize contains function Median_Stats
to quickly calculate the medians for basic QC stats (Genes/, UMIs/, %Mito/Cell, etc) and return a data.frame.
median_stats <- Median_Stats(seurat_object = hca_bm, group_by_var = "orig.ident")
median_stats %>% kableExtra::kbl() %>% kableExtra::kable_styling(bootstrap_options = c("bordered", "condensed", "responsive", "striped"))
The Median_Stats
function has some column names stored by default but will also calculate medians for additional meta.data columns using the optional median_var
parameter
median_stats <- Median_Stats(seurat_object = hca_bm, group_by_var = "orig.ident", median_var = "meta_data_column_name")
scCustomize also contains a few functions to plot some of these median value calculations, which can be used on their own without need to return data.frame first.
Plot_Median_Genes()
Plot_Median_UMIs()
Plot_Median_Mito()
Plot_Median_Other()
Plot_Median_Genes(seurat_object = hca_bm, group_by = "group") Plot_Median_UMIs(seurat_object = hca_bm, group_by = "group") Plot_Median_Mito(seurat_object = hca_bm, group_by = "group") Plot_Median_Other(seurat_object = hca_bm, median_var = "percent_ribo", group_by = "group")
p1 <- Plot_Median_Genes(seurat_object = hca_bm, group_by = "group") p2 <- Plot_Median_UMIs(seurat_object = hca_bm, group_by = "group") p3 <- Plot_Median_Mito(seurat_object = hca_bm, group_by = "group") p4 <- Plot_Median_Other(seurat_object = hca_bm, median_var = "percent_ribo", group_by = "group") wrap_plots(p1, p2, p3, p4, ncol = 2)
scCustomize also contains plotting function to plot the number of cells or nuclei per sample.
Since the HCA Bone Marrow dataset has exactly the same number of cells per sample we will use the microglia object from the Analysis Plots vignette.
marsh_mouse_micro <- qread(file = "assets/marsh_2020_micro.qs")
Plot_Cells_per_Sample(seurat_object = marsh_mouse_micro, group_by = "Transcription_Method")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.