title: "Visualization Techniques for scRNAseq Analysis with Seurat.utils"
subtitle: "!WORK IN PROGRESS! - Demonstration of the Visualization section of the Seurat.utils package."
author: "Richard and Abel"
date: "2024-10-29"
knit_root_dir: ~/Documents/Analysis_R/Vignette_SU/
output:
  html_document:
    keep_md: true # Optional, keeps .md output for debugging
  pdf_document:
    includes:
      in_header: preamble.tex
Seurat.utils package
Seurat.utils enhances scRNAseq analysis by building on the Seurat package and introducing advanced and convenient tools for typical analytic needs.
This vignette focuses on the Seurat.Utils.Visualization.R
component, demonstrating how it streamlines and enriches the exploration of single-cell RNA sequencing data.
I wanted to demonstrate Seurat.utils on a real, published Seurat object. Therefore, I chose an integrated object from our previous publication, Gruffi.
This object was created before Seurat v5 existed. I used SeuratObject::UpdateSeuratObject() to update the object to v5. While creating this vignette, we realized that this function actually creates a "hybrid" object between v5 and v3, where only certain assays, etc. are updated. I contacted the authors, but they declared that this is a feature, not a bug.
As I did not want to write UpdateSeuratObjectThoroughly(), I was stuck with this mutant v3-v5 object, and therefore I had to hack some of the functions, which would otherwise work perfectly on a clean v5 object. You will see signs of this below.
Apologies to the readers.
While Seurat
offers comprehensive tools for scRNAseq data analysis, Seurat.utils
extends these functions with specialized or more convenient tools.
Examples:
- Custom UMAP Plots: Automatic file saving (png/pdf/jpeg), automatic annotation, better defaults, and more.
- PCA Variance Explained: Tools for a more detailed examination of variance explained by principal components, aiding in the interpretation of dimensionality reduction results.
If Seurat.utils
is not installed, installation instructions are available at github.com/vertesy/Seurat.utils.
Before diving into data analysis, ensure that Seurat.utils
is installed and loaded alongside MarkdownReports
for comprehensive reporting capabilities.
rm(list = ls(all.names = TRUE)); try(dev.off(), silent = TRUE); gc()
## null device
## 1
## used (Mb) gc trigger (Mb) limit (Mb) max used (Mb)
## Ncells 901636 48.2 1758464 94 NA 1280591 68.4
## Vcells 1513305 11.6 8388608 64 102400 2523139 19.3
library(dplyr)
# library(png)
require("Stringendo")
require("CodeAndRoll2")
require("MarkdownHelpers")
require("MarkdownReports")
require("ggExpress")
# require("Seurat.utils")
source('~/GitHub/Packages/Rocinante/R/Rocinante.R')
## [1] "Loading Rocinante custom function library."
## [1] "Depends on CodeAndRoll2, MarkdownReports, gtools, readr, gdata, clipr. Some functions depend on other libraries."
r$Seurat.utils()
OutDir <- "~/Documents/Analysis_R/Vignette_SU/"
# OutDir <- "~/GitHub/Zacc/Vignette.seurat.utils"
setwd(OutDir)
# combined.obj <- xread("C:\\gruffi\\obj.Fig.4C.clean_8200.cells_Vignette.seurat.utils_2024.03.27_10.15.qs")
# combined.obj <- read("~/Downloads/obj_8200.cells_Fig.4C.clean_Vignette.seurat.utils_2024.06.03_21.00.qs")
# obj.small <- downsampleSeuObj(combined.obj, nCells = 2500)
# xsave(obj.small, v = T, showMemObject = F)
combined.obj <- xread('~/Documents/Analysis_R/Vignette_SU/obj.small_2500.cells_Vignette.seurat.utils_2024.10.25_13.46.qs')
## MALAT1 TMSB4X MT-CO1 TUBA1A TMSB10 RPS19
## 1.0000000 0.9999625 0.9999251 0.9998876 0.9998501 0.9998127
## [1] "Seurat with 2500 cells & 41 meta colums."
## xread: 1.869 sec elapsed
# cc.genes <- list(
# s.genes = c("MCM5", "PCNA", "TYMS", "FEN1", "MCM2", "MCM4",
# "RRM1", "UNG", "GINS2", "MCM6", "CDCA7", "DTL", "PRIM1", "UHRF1",
# "MLF1IP", "HELLS", "RFC2", "RPA2", "NASP", "RAD51AP1", "GMNN",
# "WDR76", "SLBP", "CCNE2", "UBR7", "POLD3", "MSH2", "ATAD2", "RAD51",
# "RRM2", "CDC45", "CDC6", "EXO1", "TIPIN", "DSCC1", "BLM", "CASP8AP2",
# "USP1", "CLSPN", "POLA1", "CHAF1B", "BRIP1", "E2F8"),
# g2m.genes = c("HMGB2",
# "CDK1", "NUSAP1", "UBE2C", "BIRC5", "TPX2", "TOP2A", "NDC80",
# "CKS2", "NUF2", "CKS1B", "MKI67", "TMPO", "CENPF", "TACC3", "FAM64A",
# "SMC4", "CCNB2", "CKAP2L", "CKAP2", "AURKB", "BUB1", "KIF11",
# "ANP32E", "TUBB4B", "GTSE1", "KIF20B", "HJURP", "CDCA3", "HN1",
# "CDC20", "TTK", "CDC25C", "KIF2C", "RANGAP1", "NCAPD2", "DLGAP5",
# "CDCA2", "CDCA8", "ECT2", "KIF23", "HMMR", "AURKA", "PSRC1",
# "ANLN", "LBR", "CKAP5", "CENPE", "CTCF", "NEK2", "G2E3", "GAS2L3",
# "CBX5", "CENPA")
# )
# combined.obj <- CellCycleScoring(combined.obj, s.features = cc.genes$'s.genes', g2m.features = cc.genes$'g2m.genes')
(identX <- GetClusteringRuns(combined.obj)[1])
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
## [1] "integrated_snn_res.0.1"
(ident2 <- GetNamedClusteringRuns(combined.obj)[2])
## c("cl.names.top.gene.res.0.2", "cl.names.KnownMarkers.0.2", "cl.names.top.gene.res.0.5",
## "cl.names.KnownMarkers.0.5")
## [1] "cl.names.KnownMarkers.0.2"
raster <- ncol(combined.obj) > 1e5 # rasterize plots only for very large objects
nr.Col <- 2
nr.Row <- 4
wA4 <- 8.27
hA4 <- 11.69
list.of.genes <- c("MALAT1", "TMSB4X", "MT-CO1")
ls.Seu <- list("Exp1" = combined.obj,
"Exp2" = downsampleSeuObj(combined.obj, fractionCells = .2),
"Exp3" = downsampleSeuObj(combined.obj, fractionCells = .3))
## [1] "500 or 20% of the cells are kept. Seed: 1989"
## [1] "750 or 30% of the cells are kept. Seed: 1989"
A quick summary to understand dataset dimensions and composition:
stopifnot(exists("ident2"))
combined.obj
## An object of class Seurat
## 28690 features across 2500 samples within 2 assays
## Active assay: integrated (2000 features, 2000 variable features)
## 2 layers present: data, scale.data
## 1 other assay present: RNA
## 2 dimensional reductions calculated: pca, umap
scBarplot.CellFractions(obj = combined.obj, group.by = ident2, fill.by = "Phase")
##
## G1 G2M S
## 0.MKI67 0 137 89
## 1.ID2 224 53 75
## 10.OPCML 162 9 37
## 2.HES6 111 53 74
## 3.MAF 15 4 3
## 4.BNIP3 163 52 97
## 5.MEIS2 251 34 104
## 6.POLR2A 103 34 72
## 7.ERBB4 141 48 114
## 8.MEF2C 139 23 23
## 9.ZFHX3 40 1 15
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Cell.proportions.of.Phase.by.cl.names.KnownMarkers.0.2.fr.barplot.png"
## 1.035 sec elapsed
# knitr::include_graphics("Cell.proportions.of.Phase.by.cl.names.top.gene.res.0.2.downsampled.fr.barplot.png")
SetupReductionsNtoKdimensions()
SetupReductionsNtoKdimensions is a function that calculates N-to-K dimensional UMAPs, executing specified dimensionality reduction (UMAP, tSNE, or PCA) over a range of dimensions and backing up the results within a Seurat object.
# combined.obj <- SetupReductionsNtoKdimensions(
# obj = combined.obj,
# nPCs = 30,
# dimensions = 3:2,
# reduction_input = "pca",
# reduction_output = "umap"
# )
BackupReduction()
BackupReduction
stores a backup of specified dimensionality reduction data (e.g., UMAP, tSNE, PCA) within the Seurat object.
combined.obj <- BackupReduction(obj = combined.obj, dim = 2, reduction = "umap")
RecallReduction()
RecallReduction
restores dimensionality reduction data (e.g., UMAP, tSNE, PCA) from a backup stored within obj@misc$reductions.backup to the active obj@reductions slot.
combined.obj <- RecallReduction(obj = combined.obj, dim = 2, reduction = "umap")
## [1] "2 dimensional umap from obj@misc$reductions.backup is set active. "
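The backup-and-recall idea can be sketched on a plain list, without Seurat. This is a minimal illustration only: `backup_reduction()`, `recall_reduction()`, and the key format are hypothetical stand-ins, not the Seurat.utils implementation. The only detail taken from the text above is that backups live under `misc$reductions.backup`.

```r
set.seed(1)

# Toy stand-in for a Seurat object: a plain list with 'reductions' and 'misc' slots.
toy <- list(
  reductions = list(
    umap = matrix(rnorm(10), ncol = 2, dimnames = list(NULL, c("UMAP_1", "UMAP_2")))
  ),
  misc = list(reductions.backup = list())
)

# Hypothetical helper: copy the active reduction into the backup slot.
backup_reduction <- function(x, reduction = "umap", dim = 2) {
  key <- paste0(reduction, dim, "d") # key format is an assumption
  x$misc$reductions.backup[[key]] <- x$reductions[[reduction]]
  x
}

# Hypothetical helper: restore the backed-up reduction as the active one.
recall_reduction <- function(x, reduction = "umap", dim = 2) {
  key <- paste0(reduction, dim, "d")
  stopifnot(key %in% names(x$misc$reductions.backup))
  x$reductions[[reduction]] <- x$misc$reductions.backup[[key]]
  x
}

toy <- backup_reduction(toy)
original <- toy$reductions$umap
toy$reductions$umap <- -original # overwrite the active embedding
toy <- recall_reduction(toy)     # restore it from the backup
identical(toy$reductions$umap, original)
```

Keeping the backup inside the object means it travels with the object when saved and reloaded.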
qUMAP()
qUMAP is a wrapper function for Seurat::FeaturePlot that allows for quick visualization of UMAPs colored by numeric features (metadata columns) or genes.
qUMAP(feature = "nFeature_RNA")
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.nFeature_RNA.RNA.2500c.png"
qUMAP(feature = "TOP2A", PNG = FALSE, save.plot = TRUE)
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.TOP2A.RNA.2500c.pdf"
clUMAP()
clUMAP is a wrapper function for Seurat::DimPlot that allows for quick visualization of UMAPs colored by categorical features from metadata columns, e.g., clustering results.
clUMAP(ident = "integrated_snn_res.0.1", cols = RColorBrewer::brewer.pal(7, "Dark2"))
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.integrated_snn_res.0.1.2500c.png"
## 0.918 sec elapsed
FlipReductionCoordinates()
FlipReductionCoordinates
reverses the dimensionality reduction coordinates (such as UMAP or tSNE) vertically or horizontally to alter the visualization perspective.
clUMAP(obj = combined.obj)
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.cl.names.top.gene.res.0.2.2500c.png"
## 0.964 sec elapsed
combined.obj <- FlipReductionCoordinates(
obj = combined.obj,
dim = 2,
reduction = "umap",
flip = c("x", "y", "xy", NULL)[1],
FlipReductionBackupToo = FALSE # If you want to keep backup of the original coordinates, set to FALSE
)
clUMAP(obj = combined.obj, sub = "flipped x axis in the UMAP coordinates")
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.cl.names.top.gene.res.0.2.2500c.flipped.x.axis.in.the.UMAP.coordinates.png"
## 0.941 sec elapsed
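Conceptually, flipping a reduction amounts to negating one or both coordinate columns. The sketch below shows this on a toy embedding matrix. `flip_coords()` is a hypothetical helper written for illustration, and it is an assumption that the package works this way internally.

```r
set.seed(1)
emb <- cbind(UMAP_1 = rnorm(5), UMAP_2 = rnorm(5)) # toy 2D embedding

# Hypothetical helper: mirror the embedding along x, y, or both axes.
flip_coords <- function(emb, flip = c("x", "y", "xy")) {
  flip <- match.arg(flip)
  if (flip %in% c("x", "xy")) emb[, 1] <- -emb[, 1]
  if (flip %in% c("y", "xy")) emb[, 2] <- -emb[, 2]
  emb
}

flipped <- flip_coords(emb, "x")
head(flipped)
```

Flipping changes only the orientation of the plot. Distances between cells, and therefore clusters and neighborhoods, are unaffected.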
AutoNumber.by.UMAP()
AutoNumber.by.UMAP
automatically renumbers clusters based on their position along a specified dimension in a UMAP (or tSNE or PCA) plot, potentially enhancing interpretability by ordering clusters.
combined.obj <- AutoNumber.by.UMAP(
obj = combined.obj, reduction = "umap",
dim = 1, swap = TRUE, # Swap the order along dimension 1, to get a pseudotime-like ordering from progenitor to differentiated cells
ident = ident2,
obj.version = 3, # Not needed unless the object was updated via `SeuratObject::UpdateSeuratObject()`.
plot = TRUE
)
## [1] "NewMetaCol: cl.names.KnownMarkers.0.2.ordered"
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/UMAP.cl.names.KnownMarkers.0.2.ordered.2500c.png"
## 0.911 sec elapsed
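One way to think about position-based renumbering is to rank clusters by their median coordinate along the chosen dimension. The sketch below does this with base R on toy data. `renumber_by_position()` is a hypothetical helper; the use of the median and numbering from 1 are assumptions, not confirmed details of AutoNumber.by.UMAP.

```r
# Toy data: coordinate along dimension 1 and a cluster label per cell.
coords_dim1 <- c(5.1, 4.8, -3.2, -2.9, 0.4, 0.7)
cluster     <- c("a", "a", "b", "b", "c", "c")

# Hypothetical helper: assign new cluster numbers by median position.
renumber_by_position <- function(coord, cluster, swap = FALSE) {
  med <- sapply(split(coord, cluster), median) # median position per cluster
  ord <- names(sort(med, decreasing = swap))   # swap = TRUE reverses the order
  new_id <- setNames(seq_along(ord), ord)
  unname(new_id[as.character(cluster)])
}

renumber_by_position(coords_dim1, cluster)              # b = 1, c = 2, a = 3
renumber_by_position(coords_dim1, cluster, swap = TRUE) # a = 1, c = 2, b = 3
```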
multiSingleClusterHighlightPlots.A4()
multiSingleClusterHighlightPlots.A4
generates and saves cluster highlight plots for both single and multiple clusters using UMAP or other dimensionality reduction techniques, supporting various plot formats and customization options, and ensuring an A4 paper size for the output.
# scBarplot.CellsPerCluster(ident = identX, obj = combined.obj)
r$Seurat.utils()
multiSingleClusterHighlightPlots.A4(ident = identX, obj = combined.obj)
## [1] "integrated_snn_res.0.1-umap/"
## [1] "All files will be saved under 'NewOutDir': /Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/integrated_snn_res.0.1-umap/"
## [1] "ParentDir will be:"
## [1] "ParentDir defined as:"
## [1] "Call *create_set_Original_OutDir()* when chaning back to the main dir."
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## [1] "page: 1 | clusters 0, 1, 2, 3"
## 0.05 sec elapsed
## 0.052 sec elapsed
## 0.048 sec elapsed
## 0.047 sec elapsed
## [1] "page: 2 | clusters 4"
## 0.049 sec elapsed
## [1] "All files will be saved under 'OutDir':"
## [1] "OutDir defined as:"
## 2.124 sec elapsed
qClusteringUMAPS()
qClusteringUMAPS
generates and arranges UMAP plots for up to four specified clustering resolutions from a Seurat object onto an A4 page, facilitating comparative visualization.
ident2 <- na.omit.strip(GetClusteringRuns(combined.obj)[1:2])
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
qClusteringUMAPS( obj = combined.obj, idents = ident2)
## 0.387 sec elapsed
## 0.094 sec elapsed
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
## [1] "Identity not found. Plotting integrated_snn_res.0.1 \n"
## 0.085 sec elapsed
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
## [1] "Identity not found. Plotting integrated_snn_res.0.1 \n"
## 0.089 sec elapsed
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Clustering.UMAP.Res_0.1_0.2.png"
plotQUMAPsInAFolder()
plotQUMAPsInAFolder
plots qUMAPs for a specified set of genes, storing the results in a specified folder, with the option to default to using the gene set name if no folder name is provided.
plotQUMAPsInAFolder(
genes = c("MT-CO1", "TUBA1A"), obj = combined.obj,
foldername = "MyGenePlots", intersectionAssay = "RNA",
plot.reduction = "umap"
)
## [1] "MyGenePlots-umap/"
## [1] "All files will be saved under 'NewOutDir': /Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/MyGenePlots-umap/"
## [1] "ParentDir was defined as:"
## [1] "ParentDir will be:"
## [1] "ParentDir defined as:"
## [1] "Call *create_set_Original_OutDir()* when chaning back to the main dir."
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/MyGenePlots-umap/UMAP.MT-CO1.RNA.2500c.png"
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/MyGenePlots-umap/UMAP.TUBA1A.RNA.2500c.png"
## [1] "All files will be saved under 'OutDir':"
## [1] "OutDir defined as:"
"Creates individual plots in */MyGenePlots/*.png"
## [1] "Creates individual plots in */MyGenePlots/*.png"
umapHiLightSel()
The umapHiLightSel
function generates a UMAP plot from a Seurat object, highlighting specified clusters, and saves the resulting plot directly to the current working directory.
umapHiLightSel(
obj = combined.obj,
COI = c("0", "2", "4"),
ident = GetClusteringRuns()[1]
)
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
## [1] "1514 cells found."
## [1] "cells.0.2.4"
scBarplot.FractionAboveThr()
scBarplot.FractionAboveThr
draws a barplot of the fraction of cells above a certain threshold for a given feature.
scBarplot.FractionAboveThr(id.col = identX, value.col = "percent.ribo", thrX = 0.1)
## # A tibble: 5 × 3
## value names colour
## <dbl> <chr> <lgl>
## 1 47.2 0 TRUE
## 2 37.5 1 FALSE
## 3 64.6 2 TRUE
## 4 32.5 3 FALSE
## 5 59.1 4 TRUE
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Pc.cells.above.percent.ribo.of.0.1.integrated_snn_res.0.1.png"
## 0.371 sec elapsed
scBarplot.FractionBelowThr()
scBarplot.FractionBelowThr
generates a bar plot to visualize the percentage of cells within each cluster that fall below a specified threshold, according to a metadata column value.
scBarplot.FractionBelowThr(id.col = identX, value.col = "percent.ribo", thrX = 0.1)
## # A tibble: 5 × 3
## value names colour
## <dbl> <chr> <lgl>
## 1 52.8 0 FALSE
## 2 62.5 1 TRUE
## 3 35.4 2 FALSE
## 4 67.5 3 TRUE
## 5 40.9 4 FALSE
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Pc.cells.below.percent.ribo.of.0.1.integrated_snn_res.0.1.png"
## 0.322 sec elapsed
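The quantity behind both barplots can be approximated with base R: the per-cluster share of cells whose value lies above (or below) the threshold. The sketch uses a toy metadata table. Whether the package uses strict or non-strict comparison is an assumption here.

```r
# Toy metadata: cluster label and a QC value per cell.
meta <- data.frame(
  cluster      = c("0", "0", "0", "0", "1", "1", "1", "1"),
  percent.ribo = c(0.05, 0.12, 0.20, 0.08, 0.30, 0.02, 0.15, 0.11)
)
thr <- 0.1

# The mean of a logical vector is the fraction of TRUE values.
pct_above <- 100 * tapply(meta$percent.ribo > thr, meta$cluster, mean)
pct_below <- 100 * tapply(meta$percent.ribo < thr, meta$cluster, mean)

pct_above
pct_below
```

When no value equals the threshold exactly, the two percentages sum to 100 per cluster, which matches the complementary values in the two outputs above (e.g., 47.2 and 52.8 for cluster 0).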
PercentInTranscriptome()
PercentInTranscriptome computes and visualizes gene expression levels relative to total UMI counts, highlighting the contribution of highly expressed genes to the transcriptome. To display a saved plot, pass the image path as the first argument of the readPNG() function. The file name alone is sufficient if the working directory is set to the folder containing the image.
PercentInTranscriptome(combined.obj, n.genes.barplot = 25)
## FTH1 MT-CO1 MT-CO3 GAPDH FTL MT-CO2 MT-ND4 FABP7 MT-CYB C1orf61
## 0.758 0.746 0.605 0.580 0.577 0.570 0.564 0.476 0.470 0.452
## MT-ATP6 TUBA1B TPI1 STMN2 CKB HMGB1 JUN H2AFZ MT-ND2 SOX11
## 0.451 0.443 0.422 0.400 0.391 0.387 0.372 0.363 0.357 0.355
## SERF2 PKM ENO1 STMN4 DDIT4
## 0.354 0.352 0.351 0.336 0.336
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Gene.expression.as.fraction.of.all.transcripts.UMI.s.logY.hist.png"
## 0.368 sec elapsed
## # A tibble: 25 × 3
## value names colour
## <dbl> <chr> <chr>
## 1 0.758 FTH1 1
## 2 0.746 MT-CO1 1
## 3 0.605 MT-CO3 1
## 4 0.58 GAPDH 1
## 5 0.577 FTL 1
## 6 0.57 MT-CO2 1
## 7 0.564 MT-ND4 1
## 8 0.476 FABP7 1
## 9 0.47 MT-CYB 1
## 10 0.452 C1orf61 1
## # ℹ 15 more rows
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Percentage.of.highest.expressed.genes.bar.png"
## 0.413 sec elapsed
## An object of class Seurat
## 28690 features across 2500 samples within 2 assays
## Active assay: integrated (2000 features, 2000 variable features)
## 2 layers present: data, scale.data
## 1 other assay present: RNA
## 2 dimensional reductions calculated: pca, umap
# showing the generated plot
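The underlying quantity can be sketched with base R: each gene's share of all UMIs in the dataset. The toy count matrix below is invented for illustration. It is an assumption that PercentInTranscriptome computes exactly this.

```r
# Toy genes x cells count matrix (filled column by column, one column per cell).
counts <- matrix(
  c(60, 30, 10,  # cell1
    40, 50, 10), # cell2
  nrow = 3,
  dimnames = list(c("GeneA", "GeneB", "GeneC"), c("cell1", "cell2"))
)

# Percentage of all UMIs contributed by each gene, highest first.
pct <- sort(100 * rowSums(counts) / sum(counts), decreasing = TRUE)
pct
```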
plotGeneExpressionInBackgroundHist()
plotGeneExpressionInBackgroundHist(gene = "HMGB2", obj = combined.obj)
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/HMGB2.and.the.normalised.logtransformed.transcript.count.distribution.logX.hist.png"
## 0.375 sec elapsed
scBarplot.FractionAboveThr()
scBarplot.FractionAboveThr generates a bar plot depicting the percentage of cells within each cluster that exceed a specified threshold, based on a selected metadata column.
scBarplot.FractionAboveThr(id.col = "cl.names.top.gene.res.0.2", value.col = "percent.ribo", thrX = 0.1)
## # A tibble: 11 × 3
## value names colour
## <dbl> <chr> <lgl>
## 1 62.8 0.HMGB2 TRUE
## 2 61.1 1.CLU TRUE
## 3 28.8 10.OPCML FALSE
## 4 71.0 2.HES6 TRUE
## 5 68.2 3.TTR TRUE
## 6 61.2 4.BNIP3 TRUE
## 7 31.9 5.CNTNAP2 FALSE
## 8 36.8 6.AL118516.1 FALSE
## 9 40.9 7.DLX6-AS1 FALSE
## 10 31.4 8.MEF2C FALSE
## 11 10.7 9.GRIA4 FALSE
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Pc.cells.above.percent.ribo.of.0.1.cl.names.top.gene.res.0.2.png"
## 0.335 sec elapsed
# knitr::include_graphics("Pc.cells.above.percent.ribo.of.0.1.cl.names.top.gene.res.0.2.png")
scBarplot.FractionBelowThr()
scBarplot.FractionBelowThr
generates a bar plot to visualize the percentage of cells within each cluster that fall below a specified threshold, based on a selected metadata column value.
scBarplot.FractionBelowThr(id.col = "cl.names.top.gene.res.0.2", value.col = "percent.ribo", thrX = 0.1, palette_use = "npg")
## # A tibble: 11 × 3
## value names colour
## <dbl> <chr> <lgl>
## 1 37.2 0.HMGB2 FALSE
## 2 38.9 1.CLU FALSE
## 3 71.2 10.OPCML TRUE
## 4 29.0 2.HES6 FALSE
## 5 31.8 3.TTR FALSE
## 6 38.8 4.BNIP3 FALSE
## 7 68.1 5.CNTNAP2 TRUE
## 8 63.2 6.AL118516.1 TRUE
## 9 59.1 7.DLX6-AS1 TRUE
## 10 68.6 8.MEF2C TRUE
## 11 89.3 9.GRIA4 TRUE
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Pc.cells.below.percent.ribo.of.0.1.cl.names.top.gene.res.0.2.png"
## 0.364 sec elapsed
# knitr::include_graphics("Pc.cells.below.percent.ribo.of.0.1.cl.names.top.gene.res.0.2.png")
scBarplot.CellFractions()
scBarplot.CellFractions
generates a bar plot of cell fractions per cluster from a Seurat object, allowing downsampling, grouping by one variable, filling by another, custom color palettes, numerical value display on bars, and saving the plot.
# scBarplot.CellFractions(obj = combined.obj, group.by = ident2, fill.by = "Phase")
# knitr::include_graphics("Cell.proportions.of.Phase.by.cl.names.top.gene.res.0.2.downsampled.fr.barplot.png")
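The table printed by the earlier scBarplot.CellFractions call is a cross-tabulation of the grouping variable against the fill variable. A base R sketch of the same idea, on toy labels, converts the counts into per-group fractions. This illustrates the concept only and is not the package's code.

```r
# Toy labels: one cluster and one cell-cycle phase per cell.
cluster <- c("A", "A", "A", "A", "B", "B")
phase   <- c("G1", "G1", "S", "G2M", "G1", "S")

counts    <- table(cluster, phase)         # cells per cluster and phase
fractions <- prop.table(counts, margin = 1) # each row (cluster) sums to 1

counts
round(fractions, 2)
```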
scBarplot.CellsPerCluster()
scBarplot.CellsPerCluster generates a bar plot visualizing the number of cells within each cluster.
# ident2<- GetNamedClusteringRuns(combined.obj)[2]
scBarplot.CellsPerCluster(ident = ident2, sort = TRUE)
## 0# A tibble: 5 × 3
## value names colour
## <int> <chr> <chr>
## 1 203 4 1
## 2 228 3 2
## 3 483 2 3
## 4 758 1 4
## 5 828 0 5
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Cells.per.Identity.Group.integrated_snn_res.0.1.integrated_snn_res.0.2.2500.c.bar.png"
## 0.305 sec elapsed
plotClustSizeDistr()
plotClustSizeDistr
generates a bar plot or histogram to visualize the size distribution of clusters within a Seurat object, based on the specified clustering identity.
plotClustSizeDistr(obj = combined.obj, ident=identX, plot = TRUE, thr.hist = 30)
##
## 0 1 2 3 4
## 828 758 483 228 203
## # A tibble: 5 × 3
## value names colour
## <int> <chr> <chr>
## 1 828 0 1
## 2 758 1 1
## 3 483 2 1
## 4 228 3 1
## 5 203 4 1
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Cluster.sizes.at.integrated_snn_res.0.1.bar.png"
## 0.317 sec elapsed
plotGeneExpHist()
plotGeneExpHist
creates and optionally saves a histogram displaying expression levels of specified genes within a Seurat object, with features for aggregate gene expression, expression threshold filtering, and quantile clipping for count data.
scBarplot.CellsPerObject()
The scBarplot.CellsPerObject
function visualizes the number of cells in each Seurat object within a list, displaying the distribution of cell counts across different datasets or experimental conditions.
scBarplot.CellsPerObject(ls.Seu)
## # A tibble: 3 × 3
## value names colour
## <int> <chr> <chr>
## 1 2500 Exp1 1
## 2 500 Exp2 1
## 3 750 Exp3 1
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Nr.Cells.After.Filtering.bar.png"
## 0.287 sec elapsed
scPlotPCAvarExplained()
Visualize the variance explained by PCA components, integrating seamlessly with MarkdownReports
for documentation.
scPlotPCAvarExplained(combined.obj, plotname = "Variance Explained by Principal Components")
## # A tibble: 50 × 3
## value names colour
## <dbl> <chr> <lgl>
## 1 9.33 1 TRUE
## 2 6.51 2 TRUE
## 3 5.87 3 TRUE
## 4 3.91 4 TRUE
## 5 3.62 5 TRUE
## 6 3.22 6 TRUE
## 7 3.01 7 TRUE
## 8 2.68 8 TRUE
## 9 2.59 9 TRUE
## 10 2.29 10 TRUE
## # ℹ 40 more rows
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Variance.Explained.by.Principal.Components.bar.png"
## 0.448 sec elapsed
scCalcPCAVarExplained()
Calculate the variance explained by principal components, providing insights into the data structure and dimensionality reduction.
Used by scPlotPCAvarExplained().
var_explained <- scCalcPCAVarExplained(combined.obj)
print(var_explained)
## 1 2 3 4 5 6 7 8
## 9.332360 6.512810 5.872723 3.912707 3.615792 3.218808 3.007419 2.678719
## 9 10 11 12 13 14 15 16
## 2.585753 2.289343 2.249712 1.970911 1.947799 1.919924 1.816933 1.697605
## 17 18 19 20 21 22 23 24
## 1.614767 1.569853 1.537237 1.492872 1.492427 1.462241 1.410362 1.397870
## 25 26 27 28 29 30 31 32
## 1.394652 1.360058 1.339489 1.328720 1.308753 1.306202 1.302914 1.291201
## 33 34 35 36 37 38 39 40
## 1.285678 1.284520 1.279822 1.275696 1.274921 1.273328 1.270788 1.265798
## 41 42 43 44 45 46 47 48
## 1.264938 1.261772 1.259973 1.257276 1.255964 1.253753 1.252349 1.249852
## 49 50
## 1.247522 1.247114
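For intuition, percent variance explained can be derived from the standard deviations of the principal components. The sketch below uses base R's `prcomp()` on the built-in `iris` data. It is an assumption that scCalcPCAVarExplained follows the same formula; in particular, Seurat stores only a subset of PCs, so the package may normalize by a different total.

```r
# PCA on the four numeric iris columns, scaled to unit variance.
pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Variance of each PC is its squared standard deviation.
var_pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
names(var_pct) <- seq_along(var_pct)

round(var_pct, 2)
```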
qFeatureScatter()
qFeatureScatter
generates a scatter plot comparing two features (genes or metrics) from a Seurat object with optional logarithmic transformations and saving capabilities, wrapping around Seurat's FeatureScatter for enhanced usability.
qFeatureScatter(feature1 = "TOP2A", feature2 = "ID2", obj = combined.obj)
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/TOP2A.VS.ID2.png"
## 0.541 sec elapsed
suPlotVariableFeatures()
suPlotVariableFeatures
generates a Variable Feature Plot for a specified Seurat object, labeling points with the top 20 variable genes, and saves the plot to a PDF file.
multiFeaturePlot.A4()
The multiFeaturePlot.A4
function saves multiple FeaturePlots, each representing a gene from a list of gene names, as JPEG images on A4 size paper.
combined.obj@version <- package_version("3.0.0") # Because of `SeuratObject::UpdateSeuratObject()`...
multiFeaturePlot.A4(
list.of.genes,
obj = combined.obj,
foldername = substitute(list.of.genes),
plot.reduction = "umap",
intersectionAssay = c("RNA", "integrated")[1],
layout = c("tall", "wide", FALSE)[2],
colors = c("grey", "red"),
nr.Col = nr.Col,
nr.Row = nr.Row,
raster = raster,
cex = round(0.1/(nr.Col * nr.Row), digits = 2),
cex.min = raster,
gene.min.exp = "q01",
gene.max.exp = "q99",
subdir = TRUE,
prefix = NULL,
suffix = NULL,
background_col = "white",
aspect.ratio = c(FALSE, 0.6)[2],
saveGeneList = FALSE,
w = wA4,
h = hA4,
scaling = 1,
format = c("jpg", "pdf", "png")[1]
)
## [1] "list.of.genes-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "layout active, nr.Col ignored."
## [1] "1 MALAT1 TMSB4X MT-CO1"
## [1] "OutDir defined as:"
## 1.333 sec elapsed
combined.obj@version <- package_version("5.0.1")
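The `gene.min.exp = "q01"` and `gene.max.exp = "q99"` arguments presumably clip the color scale at the 1st and 99th percentiles of expression, so that a few extreme cells do not dominate the gradient. That reading is an assumption. The sketch below shows quantile clipping on a toy vector, with `clip_to_quantiles()` as a hypothetical helper.

```r
# Toy expression values with one extreme outlier.
expr <- c(0, 0, 1, 2, 3, 5, 8, 100)

# Hypothetical helper: clamp values to the [lower, upper] quantile range.
clip_to_quantiles <- function(x, lower = 0.01, upper = 0.99) {
  q <- quantile(x, probs = c(lower, upper), names = FALSE)
  pmin(pmax(x, q[1]), q[2])
}

clipped <- clip_to_quantiles(expr)
range(expr)
range(clipped) # the outlier is pulled down to the 99th percentile
```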
PlotTopGenesPerCluster()
The PlotTopGenesPerCluster
function visualizes the top N differentially expressed (DE) genes for each cluster within a specified clustering resolution of a Seurat object, aiding in the exploration of gene expression patterns across clusters.
PlotTopGenesPerCluster(obj = combined.obj, cl_res = 0.5, nrGenes = 2)
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 0"
## [1] "layout active, nr.Col ignored."
## [1] "1 HIST1H4C TUBA1B"
## [1] "OutDir defined as:"
## 0.939 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 1"
## [1] "layout active, nr.Col ignored."
## [1] "1 UBE2C PTTG1"
## [1] "OutDir defined as:"
## 0.789 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 10"
## [1] "layout active, nr.Col ignored."
## [1] "1 HS6ST3 RBFOX1"
## [1] "OutDir defined as:"
## 0.758 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 11"
## [1] "layout active, nr.Col ignored."
## [1] "1 AL118516.1 POLR2A"
## [1] "OutDir defined as:"
## 0.842 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 12"
## [1] "layout active, nr.Col ignored."
## [1] "1 DLX6-AS1 NRXN3"
## [1] "OutDir defined as:"
## 0.763 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 13"
## [1] "layout active, nr.Col ignored."
## [1] "1 MEF2C NKAIN2"
## [1] "OutDir defined as:"
## 0.781 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 14"
## [1] "layout active, nr.Col ignored."
## [1] "1 ERBB4 SCGN"
## [1] "OutDir defined as:"
## 0.773 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 15"
## [1] "layout active, nr.Col ignored."
## [1] "1 GRIA4 CRABP1"
## [1] "OutDir defined as:"
## 0.846 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 16"
## [1] "layout active, nr.Col ignored."
## [1] "1 OPCML KAZN"
## [1] "OutDir defined as:"
## 0.787 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 2"
## [1] "layout active, nr.Col ignored."
## [1] "1 CRYAB CST3"
## [1] "OutDir defined as:"
## 0.751 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 3"
## [1] "layout active, nr.Col ignored."
## [1] "1 MYC HES6"
## [1] "OutDir defined as:"
## 0.764 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 4"
## [1] "layout active, nr.Col ignored."
## [1] "1 AC092957.1 APOE"
## [1] "OutDir defined as:"
## 0.815 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements 5"
## [1] "layout active, nr.Col ignored."
## [1] "1 NNAT AL589740.1"
## [1] "OutDir defined as:"
## 0.817 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 6"
## [1] "layout active, nr.Col ignored."
## [1] "1 TTR COL3A1"
## [1] "OutDir defined as:"
## 0.906 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "Old names contained duplicated elements 7"
## [1] "layout active, nr.Col ignored."
## [1] "1 MT1X BNIP3"
## [1] "OutDir defined as:"
## 0.97 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.01 sec elapsed
## [1] "Old names contained duplicated elements 8"
## [1] "layout active, nr.Col ignored."
## [1] "1 CNTNAP2 KCNQ3"
## [1] "OutDir defined as:"
## 1.187 sec elapsed
## [1] "TopGenes.umaps-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.01 sec elapsed
## [1] "Old names contained duplicated elements 9"
## [1] "layout active, nr.Col ignored."
## [1] "1 IGFBP2 PGK1"
## [1] "OutDir defined as:"
## 1.102 sec elapsed
qQC.plots.BrainOrg()
The qQC.plots.BrainOrg function generates and arranges UMAP plots for specified quality control (QC) features from a Seurat object on an A4 page, providing a quick overview of quality control metrics specific to brain organoid data.
qQC.plots.BrainOrg(
obj = combined.obj,
QC.Features = c("nFeature_RNA", "percent.ribo", "percent.mito", "nCount_RNA"),
nrow = 2,
ncol = 2
)
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/QC.markers.4.UMAP_nFeature_RNA_percent.ribo_percent.mito_nCount_RNA.png"
qMarkerCheck.BrainOrg()
The qMarkerCheck.BrainOrg
function generates plots for a predefined or custom set of gene markers within brain organoids, facilitating the quick assessment of their expression across different cells or clusters.
qMarkerCheck.BrainOrg(combined.obj)
## dl-EN ul-EN Immature neurons
## "KAZN" "SATB2" "SLA"
## Interneurons Interneurons Interneurons
## "DLX6-AS1" "ERBB4" "SCGN"
## Intermediate progenitor S-phase G2M-phase
## "EOMES" "TOP2A" "H4C3"
## oRG Astrocyte Hypoxia/Stress
## "HOPX" "S100B" "DDIT4"
## Choroid.Plexus Low-Quality Mesenchyme
## "TTR" "POLR2A" "DCN"
## Glycolytic
## "PDK1"
## [,1]
## dl-EN "KAZN"
## ul-EN "SATB2"
## Immature neurons "SLA"
## Interneurons...4 "DLX6-AS1"
## Interneurons...5 "ERBB4"
## Interneurons...6 "SCGN"
## Intermediate progenitor "EOMES"
## S-phase "TOP2A"
## G2M-phase "H4C3"
## oRG "HOPX"
## Astrocyte "S100B"
## Hypoxia/Stress "DDIT4"
## Choroid.Plexus "TTR"
## Low-Quality "POLR2A"
## Mesenchyme "DCN"
## Glycolytic "PDK1"
## [1] "Signature.Genes.Top16-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.006 sec elapsed
## [1] "Old names contained duplicated elements Interneurons Interneurons"
## [1] "layout active, nr.Col ignored."
## [1] "1 KAZN SATB2 SLA DLX6-AS1 ERBB4 SCGN EOMES TOP2A"
## [1] "2 HOPX S100B DDIT4 TTR POLR2A DCN PDK1"
## [1] "OutDir defined as:"
## 4.049 sec elapsed
PlotTopGenes()
The PlotTopGenes
function generates UMAP plots of the most highly expressed genes and saves them in a subfolder. It requires that calc.q99.Expression.and.set.all.genes
has been run on the object first.
combined.obj <- calc.q99.Expression.and.set.all.genes(obj = combined.obj,
obj.version = 3 # Not needed unless the object was updated via `SeuratObject::UpdateSeuratObject()`.
)
## [1] "Calculating Gene Quantiles"
## [1] "59.6% or 15915 of 26690 genes have q99 expr. > 0 (in 25 cells)."
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Gene.expression.in.the.99th.quantile.in.combined.obj.combined.obj.hist.pdf"
## calc.q99.Expression.and.set.all.genes: 0.21 sec elapsed
## 1.002 sec elapsed
## [1] "Quantile 0.99 is now stored under obj@misc$all.genes and $ expr.q99 Please execute all.genes <- obj@misc$all.genes."
PlotTopGenes(obj = combined.obj, n = 4)
## [1] "Highest.Expressed.Genes-umap/"
## [1] "ParentDir defined as:"
## [1] "OutDir defined as:"
## [1] "b.Subdirname defined as:"
## check.genes: 0.007 sec elapsed
## [1] "layout active, nr.Col ignored."
## [1] "1 MALAT1 TMSB4X MT-CO1 FTH1"
## [1] "OutDir defined as:"
## 1.235 sec elapsed
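To make the q99 step more concrete, here is a minimal base R sketch of the idea on a toy matrix: compute the 99th percentile of expression per gene, then pick the genes with the highest values. The matrix, the gene names, and the ranking rule are assumptions made for illustration; this is not the code used inside `calc.q99.Expression.and.set.all.genes()` or `PlotTopGenes()`.

```r
# Toy expression matrix: 5 hypothetical genes (rows) x 200 cells (columns).
set.seed(1)
expr <- rbind(
  geneA = rpois(200, lambda = 20),
  geneB = rpois(200, lambda = 5),
  geneC = rpois(200, lambda = 0.01),
  geneD = rpois(200, lambda = 50),
  geneE = rep(0, 200)
)

# 99th percentile of expression for each gene (named by gene).
expr.q99 <- apply(expr, 1, quantile, probs = 0.99)

# Share of genes with a non-zero q99, analogous to the message printed above.
frac.expressed <- mean(expr.q99 > 0)

# Names of the n genes with the highest q99 value.
n <- 2
top.genes <- names(sort(expr.q99, decreasing = TRUE))[seq_len(n)]
print(top.genes)
```

Using a high quantile rather than the mean keeps genes that are strongly expressed in only a small subset of cells from being averaged away.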
save2plots.A4()
The save2plots.A4
function arranges and saves two plots, such as UMAP plots or any other types of plots, side-by-side or one above the other on a single A4 page.
p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point()
p2 <- ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) +
geom_point()
st.malo <- list(p1, p2)
save2plots.A4(plot_list = st.malo)
## [1] "Saved as: st.malo"
save4plots.A4()
The save4plots.A4
function arranges and saves four plots, such as UMAPs or any other visualizations, onto a single A4 page, facilitating a compact comparison of different visualizations or clustering results.
p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point()
p2 <- ggplot(mtcars, aes(mpg, disp, color = as.factor(cyl))) +
geom_point()
p3 <- ggplot(mpg, aes(displ, hwy, color = class)) +
geom_point()
p4 <- ggplot(diamonds, aes(carat, price, color = cut)) +
geom_point()
nantes <- list(p1, p2, p3, p4)
save4plots.A4(plot_list = nantes)
## [1] "Saved as: nantes"
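For readers who want to see what "four plots on one A4 page" amounts to, the following sketch reproduces the layout idea with base R graphics only. It is an illustration under stated assumptions (A4 portrait size in inches, a hypothetical file name in a temporary directory, base plots instead of ggplot objects), not the implementation of `save4plots.A4()`.

```r
# A4 portrait dimensions in inches (210 mm x 297 mm).
a4.width <- 8.27
a4.height <- 11.69

# Hypothetical output file in a temporary directory.
out.file <- file.path(tempdir(), "four.plots.A4.sketch.pdf")

pdf(out.file, width = a4.width, height = a4.height)
op <- par(mfrow = c(2, 2)) # 2 x 2 grid, filled row by row
plot(iris$Sepal.Length, iris$Sepal.Width, col = iris$Species, main = "iris sepals")
plot(iris$Petal.Length, iris$Petal.Width, col = iris$Species, main = "iris petals")
plot(mtcars$mpg, mtcars$disp, col = as.factor(mtcars$cyl), main = "mtcars mpg vs disp")
plot(mtcars$wt, mtcars$hp, col = as.factor(mtcars$gear), main = "mtcars wt vs hp")
par(op)
invisible(dev.off())
```

Changing `mfrow` to `c(2, 1)` or `c(1, 2)` gives the stacked or side-by-side two-plot arrangement.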
qqSaveGridA4()
The qqSaveGridA4
function saves a grid of 2 or 4 ggplot objects onto an A4 page, enabling efficient visualization arrangements for analysis or presentation purposes.
p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
geom_point()
p2 <- ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) +
geom_point()
pl <- list(p1,p2)
qqSaveGridA4(plotlist = pl, plots = 1:2, fname = "Fractions.per.Cl.png")
## [1] "2 plots found, 1 2 are saved."
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/Fractions.per.Cl.png"
qSeuViolin()
The qSeuViolin
function generates a violin plot for a specified feature in a Seurat object. The data can be split by a grouping variable, and options such as logarithmic scaling and custom titles are supported.
qSeuViolin(obj = combined.obj, feature = "nFeature_RNA", caption = "Test")
## [1] "/Users/abel.vertesy/Documents/Analysis_R/Vignette_SU/nFeature_RNA.by.cl.names.top.gene.res.0.2.logY.violin.png"
## 0.938 sec elapsed
getClusterColors()
The getClusterColors
function retrieves and, if desired, displays the color scheme linked to clusters in a Seurat object based on a specified identity column.
colors <- head(getClusterColors(ident = GetClusteringRuns(combined.obj)[1]), 10)
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
SeuratColorVector()
The SeuratColorVector
function extracts and, if specified, displays the color scheme associated with cluster identities within a Seurat object, ensuring consistent color representation in visualizations.
head(SeuratColorVector(ident = "integrated_snn_res.0.2", plot.colors = TRUE), 4)
## [1] "integrated_snn_res.0.2"
## ident.vec
## 0 1 2 3 4 5 6 7
## 619 390 300 298 254 229 202 208
## [1] "#CD9600" "#00BE67" "#7CAE00" "#00A9FF"
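The idea behind a cluster color vector can be sketched in base R: assign one color to each cluster level, then expand that lookup to one color per cell. The identities below are toy values and the palette comes from `hcl.colors()`, both chosen for illustration; the actual functions read identities from the Seurat object and use their own palette.

```r
# Toy cluster identities for 8 cells (hypothetical values).
ident.vec <- factor(c("0", "0", "2", "3", "1", "2", "0", "1"))

# One color per cluster level, here from base R's hcl.colors().
cluster.colors <- setNames(
  hcl.colors(nlevels(ident.vec), palette = "Dark 3"),
  levels(ident.vec)
)

# Expand to one color per cell, named by the cell's cluster.
cell.colors <- cluster.colors[as.character(ident.vec)]
print(head(cell.colors, 4))
```

Keeping the per-cluster lookup as a named vector is what makes colors consistent across plots: any plot that indexes it by cluster name gets the same color for the same cluster.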
getDiscretePaletteObj()
The getDiscretePaletteObj
function creates a discrete color palette for visualizing clusters in a Seurat object, adjusting the palette size based on a specified identity column to accommodate the number of unique clusters.
colors <- getDiscretePaletteObj(ident.used = "integrated_snn_res.0.1", obj = combined.obj)
print(colors)
## 0 2 3 4 1
## "#AA0DFE" "#3283FE" "#85660D" "#782AB6" "#565656"
gg_color_hue()
The gg_color_hue
function produces a vector of colors mimicking the default color palette of ggplot2, facilitating the creation of color sets for custom plotting functions or other applications requiring a similar aesthetic.
print(gg_color_hue(5))
## [1] "#F8766D" "#A3A500" "#00BF7D" "#00B0F6" "#E76BF3"
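A commonly used recipe for approximating the ggplot2 default palette takes evenly spaced hues on the HCL color wheel at fixed chroma and luminance. The sketch below follows that recipe with base R's `hcl()`; the function name `gg_color_hue_sketch` is hypothetical, and this is not necessarily how `gg_color_hue()` is written in the package.

```r
# Evenly spaced hues at fixed chroma (100) and luminance (65).
gg_color_hue_sketch <- function(n) {
  # n + 1 points, so that the last hue (375 = 15 + 360) is dropped
  # and the first color is not repeated.
  hues <- seq(15, 375, length.out = n + 1)
  grDevices::hcl(h = hues, c = 100, l = 65)[seq_len(n)]
}

print(gg_color_hue_sketch(5))
```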
DiscretePaletteSafe()
The DiscretePaletteSafe
function generates a discrete color palette excluding any NA values, making it suitable for visualizations requiring a fixed number of distinct and reproducible colors.
colors <- DiscretePaletteSafe(n = 10)
print(colors)
## [1] "#AA0DFE" "#3283FE" "#85660D" "#782AB6" "#565656" "#1C8356" "#16FF32"
## [8] "#F7E1A0" "#E2E2E2" "#1CBE4F"
plot3D.umap()
The plot3D.umap
function plots a 3D UMAP (Uniform Manifold Approximation and Projection) based on one of the metadata columns of a Seurat object. It utilizes Plotly for interactive visualization.
# plot3D.umap(combined.obj, category = "Phase")
plot3D.umap.gene()
plot3D.umap.gene
plots a three-dimensional UMAP with gene expression using the Plotly library.
# plot3D.umap.gene(obj = combined.obj, gene = "TOP2A")
GetClusteringRuns()
The GetClusteringRuns
function retrieves metadata column names associated with clustering runs.
head(getClusterColors(obj = combined.obj, ident = GetClusteringRuns(combined.obj)[1]), 4)
## c("integrated_snn_res.0.1", "integrated_snn_res.0.2", "integrated_snn_res.0.3",
## "integrated_snn_res.0.4", "integrated_snn_res.0.5")
## 0 0 2 3
## "#0000FF" "#0000FF" "#00FF00" "#000033"
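Conceptually, finding clustering runs is a pattern match on metadata column names. The sketch below shows this on a toy metadata table; the regular expression is an assumption based on the column names printed above (`integrated_snn_res.0.1` and so on), and the pattern used inside `GetClusteringRuns()` may differ.

```r
# Toy metadata table standing in for obj@meta.data (hypothetical values).
meta <- data.frame(
  nFeature_RNA = c(1200, 900, 1500),
  integrated_snn_res.0.1 = c("0", "1", "0"),
  integrated_snn_res.0.2 = c("0", "2", "1"),
  Phase = c("G1", "S", "G2M"),
  check.names = FALSE
)

# Assumed pattern: column names ending in "_snn_res.<resolution>".
pattern <- "_snn_res\\.[0-9.]+$"
clustering.runs <- grep(pattern, colnames(meta), value = TRUE)
print(clustering.runs)
```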