In this vignette we will demonstrate how to visualize single-cell data in genome-browser-track style plots with Signac.
To demonstrate we'll use the human PBMC dataset processed in this vignette.
library(Signac) library(ggplot2) # load PBMC dataset pbmc <- readRDS("../vignette_data/pbmc.rds")
There are several different genome browser style plot types available in Signac, including accessibility tracks, gene annotations, peak coordinate, genomic links, and fragment positions.
The main plotting function in Signac is CoveragePlot()
, and this computes the
averaged frequency of sequenced DNA fragments for different groups of cells
within a given genomic region.
cov_plot <- CoveragePlot( object = pbmc, region = "chr2-87011729-87035519", annotation = FALSE, peaks = FALSE ) cov_plot
We can also request regions of the genome by gene name. This will use the gene coordinates stored in the Seurat object to determine which genomic region to plot
CoveragePlot( object = pbmc, region = "CD8A", annotation = FALSE, peaks = FALSE )
All of the plots returned by Signac functions are ggplot2
or patchwork
objects, and so you can use standard functions from ggplot2
or other packages
to further modify or customize them. For example, if we wanted to adjust the
color of the tracks generated in the plot above, we can use the scale_fill_
functions in ggplot2
, for example scale_fill_brewer()
:
cov_plot + scale_fill_brewer(type = "seq", palette = 1)
cov_plot + scale_fill_brewer(type = "qual", palette = 1)
Note that for plots that are combined using Patchwork, the &
operator can be
used to apply the aesthetic adjustment to all of the plots in the object.
Gene annotations within a given genomic region can be plotted using the
AnnotationPlot()
function.
gene_plot <- AnnotationPlot( object = pbmc, region = "chr2-87011729-87035519" ) gene_plot
Peak coordinates within a genomic region can be plotted using the PeakPlot()
function.
peak_plot <- PeakPlot( object = pbmc, region = "chr2-87011729-87035519" ) peak_plot
Relationships between genomic positions can be plotted using the LinkPlot()
function. This will display an arc connecting two linked positions, with the
transparency of the arc line proportional to a score associated with the link.
These links could be used to encode different things, including regulatory
relationships (for example, linking enhancers to the genes that they regulate),
or experimental data such as Hi-C.
Just to demonstrate how the function works, we've created a fake link here and added it to the PBMC dataset.
library(GenomicRanges) # generate fake links and add to object cd8.coords <- LookupGeneCoords(object = pbmc, gene = "CD8A") links <- GRanges( seqnames = "chr2", ranges = IRanges(start = start(cd8.coords)[1] + 100, end = end(cd8.coords)[1] - 5000), score = 1, group = 1 ) Links(pbmc) <- links
link_plot <- LinkPlot( object = pbmc, region = "chr2-87011729-87035519" ) link_plot
While the CoveragePlot()
function computes an aggregated signal within a
genomic region for different groups of cells, sometimes it's also useful to
inspect the frequency of sequenced fragments within a genomic region for
individual cells, without aggregation. This can be done using the TilePlot()
function.
tile_plot <- TilePlot( object = pbmc, region = "chr2-87011729-87035519", idents = c("CD4 Memory", "CD8 Effector") ) tile_plot
By default, this selects the top 100 cells for each group based on the total number of fragments in the genomic region. The genomic region is then tiled and the total fragments in each tile counted for each cell, and the resulting counts for each position displayed as a heatmap.
Multimodal single-cell datasets generate multiple experimental measurements for
each cell. Several methods now exist that are capable of measuring single-cell
chromatin data (such as chromatin accessibility) alongside other measurements
from the same cell, such as gene expression or mitochondrial genotype. In these
cases it's often informative to visualize the multimodal data together in a
single plot. This can be achieved using the ExpressionPlot()
function. This
is similar to the VlnPlot()
function in Seurat, but is designed to be
incorportated with genomic track plots generated by CoveragePlot()
.
expr_plot <- ExpressionPlot( object = pbmc, features = "CD8A", assay = "RNA" ) expr_plot
We can create similar plots for multiple genes at once simply by passing a list of gene names
ExpressionPlot( object = pbmc, features = c("CD8A", "CD4"), assay = "RNA" )
Above we've demonstrated how to generate individual tracks and panels that can
be combined into a single plot for a single genomic region. These panels can
be easily combined using the CombineTracks()
function.
CombineTracks( plotlist = list(cov_plot, tile_plot, peak_plot, gene_plot, link_plot), expression.plot = expr_plot, heights = c(10, 6, 1, 2, 3), widths = c(10, 1) )
The heights
and widths
parameters control the relative heights and widths of
the individual tracks, according to the order that the tracks appear in the
plotlist
. The CombineTracks()
function ensures that the tracks are aligned
vertically and horizontally, and moves the x-axis labels (describing the
genomic position) to the bottom of the combined tracks.
Above we've shown how to create genomic plot panels individually and how to
combine them. This allows more control over how each panel is constructed and
how they're combined, but involves multiple steps. For convenience, we've
included the ability to generate and combine different panels automatically in
the CoveragePlot()
function, through the annotation
, peaks
, tile
, and
features
arguments. We can generate a similar plot to that shown above in a
single function call:
CoveragePlot( object = pbmc, region = "chr2-87011729-87035519", features = "CD8A", annotation = TRUE, peaks = TRUE, tile = TRUE, links = TRUE )
Notice that in this example we create the tile plot for every group of cells that is shown in the coverage track, whereas above we were able to create a plot that showed the aggregated coverage for all groups of cells and the tile plot for only the CD4 memory cells and the CD8 effector cells. A higher degree of customization is possible when creating each track separately.
Above we demonstrated the different types of plots that can be constructed using
Signac. Often when exploring genomic data it's useful to be able to
interactively browse through different regions of the genome and adjust tracks
on the fly. This can be done in Signac using the CoverageBrowser()
function.
This provides all the same functionality of the CoveragePlot()
function,
except that we can scroll upstream/downstream, zoom in/out of regions, navigate
to new region, and adjust which tracks are shown or how the cells are grouped.
In exploring the data interactively, often you will find interesting plots that
you'd like to save for viewing later. We've included a "Save plot" button that
will add the current plot to a list of plots that is returned when the
interactive session is ended. Here's a recorded demonstration of the
CoverageBrowser()
function:
Session Info
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.