dittoDimPlot: Shows data overlaid on a tsne, pca, or similar type of plot

View source: R/DittoDimPlot.R

Description

Shows data overlaid on a tsne, pca, or similar type of plot

Usage

dittoDimPlot(
  object,
  var,
  reduction.use = .default_reduction(object),
  size = 1,
  opacity = 1,
  dim.1 = 1,
  dim.2 = 2,
  cells.use = NULL,
  shape.by = NULL,
  split.by = NULL,
  extra.vars = NULL,
  split.nrow = NULL,
  split.ncol = NULL,
  assay = .default_assay(object),
  slot = .default_slot(object),
  adjustment = NULL,
  color.panel = dittoColors(),
  colors = seq_along(color.panel),
  shape.panel = c(16, 15, 17, 23, 25, 8),
  show.others = TRUE,
  show.axes.numbers = TRUE,
  show.grid.lines = if (is.character(reduction.use)) {
      !grepl("umap|tsne", tolower(reduction.use))
    } else {
      TRUE
    },
  min.color = "#F0E442",
  max.color = "#0072B2",
  min = NULL,
  max = NULL,
  order = c("unordered", "increasing", "decreasing"),
  main = "make",
  sub = NULL,
  xlab = "make",
  ylab = "make",
  rename.var.groups = NULL,
  rename.shape.groups = NULL,
  theme = theme_bw(),
  do.letter = FALSE,
  do.ellipse = FALSE,
  do.label = FALSE,
  labels.size = 5,
  labels.highlight = TRUE,
  labels.repel = TRUE,
  labels.split.by = split.by,
  do.hover = FALSE,
  hover.data = var,
  hover.assay = .default_assay(object),
  hover.slot = .default_slot(object),
  hover.adjustment = NULL,
  add.trajectory.lineages = NULL,
  add.trajectory.curves = NULL,
  trajectory.cluster.meta,
  trajectory.arrow.size = 0.15,
  do.contour = FALSE,
  contour.color = "black",
  contour.linetype = 1,
  legend.show = TRUE,
  legend.size = 5,
  legend.title = "make",
  legend.breaks = waiver(),
  legend.breaks.labels = waiver(),
  shape.legend.size = 5,
  shape.legend.title = shape.by,
  do.raster = FALSE,
  raster.dpi = 300,
  data.out = FALSE
)

Arguments

object

A Seurat, SingleCellExperiment, or SummarizedExperiment object.

var

String name of a "gene" or "metadata" (or "ident" for a Seurat object) to use for coloring the plots. This is the data that will be displayed for each cell/sample. Discrete or continuous data both work.

Alternatively, can be a vector of same length as there are cells/samples in the object.

reduction.use

String, such as "pca", "tsne", "umap", or "PCA", etc, which is the name of a dimensionality reduction slot within the object, and which sets what dimensionality reduction space within the object to use.

Default = the first dimensionality reduction slot inside the object with "umap", "tsne", or "pca" within its name, (priority: UMAP > t-SNE > PCA) or the first dimensionality reduction slot if none of those exist.

Alternatively, a matrix (or data.frame) containing the dimensionality reduction embeddings themselves. The matrix should have as many rows as there are cells/samples in the object. Note that dim.1 and dim.2 will still be used to select which columns to pull from, and column names will serve as the default xlab & ylab.

size

Number which sets the size of data points. Default = 1.

opacity

Number between 0 and 1 which sets how solid the points are: 1 = fully opaque, 0 = invisible. Useful when many points overlap. Default = 1. (In ggplot terms, this is alpha.)

dim.1

The component number to use on the x-axis. Default = 1

dim.2

The component number to use on the y-axis. Default = 2

cells.use

String vector of cells'/samples' names OR an integer vector specifying the indices of cells/samples which should be included.

Alternatively, a Logical vector, the same length as the number of cells in the object, which sets which cells to include.
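The three accepted forms of cells.use can describe the same subset. The base R sketch below uses hypothetical sample names (standing in for the column names of an object, not taken from any real dataset) to show how a name vector, an index vector, and a logical vector relate:

```r
# Hypothetical sample names, standing in for colnames(object).
all.cells <- c("s1", "s2", "s3", "s4", "s5")

# Three ways to describe the same subset:
by.name    <- c("s2", "s4")                       # string vector of names
by.index   <- c(2L, 4L)                           # integer vector of indices
by.logical <- c(FALSE, TRUE, FALSE, TRUE, FALSE)  # one value per cell/sample

# The index and logical forms resolve to the same names.
from.index   <- all.cells[by.index]
from.logical <- all.cells[by.logical]
```

Any one of by.name, by.index, or by.logical could then be supplied as cells.use.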

shape.by

Variable for setting the shape of cells/samples in the plot. Note: must be discrete. Can be the name of a gene or meta-data. Alternatively, can be "ident" for clusters of a Seurat object. Alternatively, can be a numeric of length equal to the total number of cells/samples in object.

Note: shapes can be harder to see, and to process mentally, than colors. Even as a color blind person myself writing this code, I recommend use of colors for variables with many discrete values.

split.by

1 or 2 strings naming discrete metadata to use for splitting the cells/samples into multiple plots with ggplot faceting.

When 2 metadatas are named, c(row,col), the first is used as rows and the second is used for columns of the resulting grid.

When 1 metadata is named, shape control can be achieved with split.nrow and split.ncol

extra.vars

String vector providing names of any extra metadata to be stashed in the dataframe supplied to ggplot(data).

Useful for making custom splitting/faceting or other additional alterations after dittoSeq plot generation.

split.nrow, split.ncol

Integers which set the dimensions of faceting/splitting when a single metadata is given to split.by.

assay, slot

Single strings or integers that set which data to use when plotting gene expression. See gene for more information.

adjustment

When plotting gene expression (or antibody, or other forms of counts data), should that data be used directly (default) or should it be adjusted to be

  • "z-score": scaled with the scale() function to produce a relative-to-mean z-score representation

  • "relative.to.max": divided by the maximum expression value to give percent of max values between [0,1]
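As a rough illustration of what the two adjustment options compute, here is a base R sketch on a hypothetical expression vector. The variable names are illustrative only, and the sketch does not reproduce how dittoSeq retrieves the data from an object:

```r
# Hypothetical expression values for one gene across six cells.
expr <- c(0, 2, 4, 6, 8, 10)

# "z-score": scale() centers on the mean and divides by the standard deviation.
z <- as.numeric(scale(expr))

# "relative.to.max": divide by the maximum so values fall in [0, 1].
rel <- expr / max(expr)
```

After the z-score adjustment the values have mean 0 and standard deviation 1, whereas the relative.to.max values keep their original spacing and are only rescaled.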

color.panel

String vector which sets the colors to draw from. dittoColors() by default, see dittoColors for contents.

colors

Integer vector, the indexes / order, of colors from color.panel to actually use.

Useful for quickly swapping the colors of nearby clusters.
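A small sketch of how colors and color.panel could combine, assuming (this is an assumption about the internals, not confirmed by this page) that the panel is simply indexed by colors. The three-color panel here is hypothetical; dittoColors() would normally supply it:

```r
# Hypothetical three-color panel.
color.panel <- c("#E69F00", "#56B4E9", "#009E73")

# Default: use the panel in its given order.
colors <- seq_along(color.panel)
default.order <- color.panel[colors]

# Swap the first two colors while leaving the third in place,
# as would be requested with colors = c(2, 1, 3).
swapped <- color.panel[c(2, 1, 3)]
```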

shape.panel

Vector of integers corresponding to ggplot shapes which sets what shapes to use. When discrete groupings are supplied by shape.by, this sets the panel of shapes. When nothing is supplied to shape.by, only the first value is used. Default is a set of 6, c(16,15,17,23,25,8), the first being a simple, solid, circle.

Note: Unfortunately, shapes can be hard to see when points are on top of each other & they are more slowly processed by the brain. For these reasons, even as a color blind person myself writing this code, I recommend use of colors for variables with many discrete values.

show.others

Logical. Whether other cells should be shown in the background in light gray. Default = TRUE.

show.axes.numbers

Logical which controls whether the axes values should be displayed.

show.grid.lines

Logical which sets whether gridlines of the plot should be shown. They are removed when set to FALSE. Default = FALSE when reduction.use is a string containing "umap" or "tsne" (case-insensitive), TRUE otherwise.
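The default expression for show.grid.lines shown under Usage can be tried in isolation. The sketch below wraps that expression in a helper; the name default_grid_lines is hypothetical and not part of dittoSeq:

```r
# The default expression from Usage, wrapped in a hypothetical helper.
default_grid_lines <- function(reduction.use) {
  if (is.character(reduction.use)) {
    # Names containing "umap" or "tsne" (any case) turn grid lines off.
    !grepl("umap|tsne", tolower(reduction.use))
  } else {
    # A matrix or data.frame of embeddings keeps grid lines on.
    TRUE
  }
}

default_grid_lines("UMAP")
default_grid_lines("pca")
default_grid_lines(matrix(0, 2, 2))
```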

min.color

color for lowest values of var/min. Default = yellow

max.color

color for highest values of var/max. Default = blue

min

Number which sets the value associated with the minimum color.

max

Number which sets the value associated with the maximum color.

order

String, one of "unordered" (default), "increasing", or "decreasing". Sets whether points should be plotted in order of the color data, and if so, whether in "increasing" or "decreasing" order.
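Because ggplot2 draws points in row order, with later rows on top, plotting order can be thought of as a sort of the underlying data. The sketch below shows that idea on a hypothetical data frame; whether dittoSeq implements order exactly this way is an assumption:

```r
# Hypothetical plotting data: one row per cell, with a continuous color value.
plot.data <- data.frame(X = c(0, 1, 2, 3),
                        Y = c(0, 1, 0, 1),
                        color = c(5, 0, 9, 2))

# "increasing": low values come first, so high values are drawn last (on top).
increasing <- plot.data[order(plot.data$color), ]

# "decreasing": high values come first, so low values are drawn last (on top).
decreasing <- plot.data[order(plot.data$color, decreasing = TRUE), ]
```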

main

String, sets the plot title. Default title is automatically generated if not given a specific value. To remove, set to NULL.

sub

String, sets the plot subtitle

xlab, ylab

Strings which set the labels for the axes. Default labels are generated if you do not give this a specific value. To remove, set to NULL.

rename.var.groups

String vector which sets new names for the identities of var groups.

rename.shape.groups

String vector which sets new names for the identities of shape.by groups.

theme

A ggplot theme which will be applied before dittoSeq adjustments. Default = theme_bw(). See https://ggplot2.tidyverse.org/reference/ggtheme.html for other options and ideas.

do.letter

Logical which sets whether letters should be added on top of the colored dots. For extended colorblindness compatibility. NOTE: do.letter is ignored if do.hover = TRUE or shape.by is provided a metadata, because lettering is incompatible with plotly and with changing the dots to be different shapes.

do.ellipse

Logical. Whether the groups should be surrounded by median-centered ellipses.

do.label

Logical. Whether to add text labels near the center (median) of clusters for grouping vars.

labels.size

Size of the labels' text.

labels.highlight

Logical. Whether the labels should have a box behind them

labels.repel

Logical, that sets whether the labels' placements will be adjusted with ggrepel to avoid intersections between labels and plot bounds. TRUE by default.

labels.split.by

String of one or two metadata names which controls the facet-split calculations for label placements. Defaults to split.by, so generally there is no need to adjust this except when you are utilizing the extra.vars input to achieve manual faceting control.

do.hover

Logical which controls whether the output will be converted to a plotly object so that data about individual points will be displayed when you hover your cursor over them. hover.data argument is used to determine what data to use.

hover.data

String vector of gene and metadata names, example: c("meta1","gene1","meta2") which determines what data to show on hover when do.hover is set to TRUE.

hover.assay, hover.slot, hover.adjustment

Similar to the non-hover versions of these inputs, when showing expression data upon hover, these set what data will be shown.

add.trajectory.lineages

List of vectors representing trajectory paths, each from start-cluster to end-cluster, where vector contents are the names of clusters provided in the trajectory.cluster.meta input.

If the slingshot package was used for trajectory analysis, you can provide add.trajectory.lineages = slingLineages('object').

add.trajectory.curves

List of matrices, each representing coordinates for a trajectory path, from start to end, where matrix columns represent x (dim.1) and y (dim.2) coordinates of the paths.

Alternatively, a list of lists(/princurve objects) can be provided. Thus, if the slingshot package was used for trajectory analysis, you can provide add.trajectory.curves = slingCurves('object')

trajectory.cluster.meta

String name of metadata containing the clusters that were used for generating trajectories. Required when plotting trajectories using the add.trajectory.lineages method. Names of clusters inside the metadata should be the same as the contents of add.trajectory.lineages vectors.

trajectory.arrow.size

Number representing the size of trajectory arrows, in inches. Default = 0.15.

do.contour

Logical. Whether density-based contours should be displayed.

contour.color

String that sets the color(s) of the do.contour contours.

contour.linetype

String or numeric which sets the type of line used for do.contour contours. Defaults to "solid", but see linetype for other options.

legend.show

Logical. Whether the legend should be displayed. Default = TRUE.

legend.size

Number representing the size at which shapes should be plotted in the color legend (for discrete variable plotting). Default = 5. Note: enlarging the color legend is incredibly helpful for making colors more distinguishable by color blind individuals.

legend.title

String which sets the title for the color legend. Default = NULL normally, but var when a shape legend will also be shown.

legend.breaks

Numeric vector which sets the discrete values to show in the color-scale legend for continuous data.

legend.breaks.labels

String vector, with same length as legend.breaks, which renames what's displayed next to the tick marks of the color-scale.

shape.legend.size

Number representing the size at which shapes should be plotted in the shape legend.

shape.legend.title

String which sets the title of the shapes legend. Default is shape.by

do.raster

Logical. When set to TRUE, rasterizes the internal plot area. Useful for editing in external programs (e.g. Illustrator).

raster.dpi

Number indicating dpi to use for rasterization. Default = 300.

data.out

Logical. When set to TRUE, changes the output, from the plot alone, to a list containing the plot ("p"), a data.frame containing the underlying data for target cells ("Target_data"), and a data.frame containing the underlying data for non-target cells ("Others_data").

Note: do.hover plotly conversion is turned off in this setting, but hover.data is still calculated.

Details

The function creates a dataframe containing the metadata or expression data associated with the given var (or if a vector of data is provided directly, it just uses that), plus X and Y coordinates data determined by the reduction.use and dim.1 (x-axis) and dim.2 (y-axis) inputs. Any extra data requested with shape.by, split.by or extra.vars is added as well. For expression/counts data, assay, slot, and adjustment inputs can be used to change which data is used, and if it should be adjusted in some way.

Next, if a set of cells or samples to use is indicated with the cells.use input, then the dataframe is split into Target_data and Others_data based on subsetting by the target cells/samples.

Finally, a scatter plot is created using these dataframes, where non-target cells are displayed in gray if show.others=TRUE, and target cell data is displayed on top, colored based on the var-associated data, and with shapes determined by the shape.by-associated data. If split.by was used, the plot will be split into a matrix of panels based on the associated groupings.
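The first two steps described in these Details can be sketched in base R. Everything here is illustrative: the embeddings, cell names, and the column names X, Y, and color are hypothetical, and the real function gathers its data from the object rather than from hand-built vectors:

```r
# Hypothetical embeddings (4 cells x 2 components) and per-cell color data.
embeddings <- matrix(c(1, 2, 3, 4,   # component 1
                       5, 6, 7, 8),  # component 2
                     ncol = 2,
                     dimnames = list(paste0("cell", 1:4), c("PC1", "PC2")))
var.data <- c("A", "B", "A", "B")
dim.1 <- 1
dim.2 <- 2

# Step 1: one row per cell, with X/Y coordinates plus the color data.
all.data <- data.frame(
  X = embeddings[, dim.1],
  Y = embeddings[, dim.2],
  color = var.data,
  row.names = rownames(embeddings)
)

# Step 2: split into target and non-target cells based on cells.use.
cells.use <- c("cell1", "cell3")
is.target <- rownames(all.data) %in% cells.use
Target_data <- all.data[is.target, ]
Others_data <- all.data[!is.target, ]
```

The scatter plot step would then draw Others_data in gray first and Target_data on top.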

Value

A ggplot or plotly object where colored dots (or other shapes) are overlaid onto a tSNE, PCA, UMAP, ..., plot of choice.

Alternatively, if data.out=TRUE, a list containing three slots is output: the plot (named 'p'), a data.frame containing the underlying data for target cells (named 'Target_data'), and a data.frame containing the underlying data for non-target cells (named 'Others_data').

Alternatively, if do.hover is set to TRUE, the plot is converted from ggplot to plotly, and cell/sample information, determined by the hover.data input, is retrieved, added to the dataframe, and displayed upon hovering the cursor over the plot.

Many characteristics of the plot can be adjusted using discrete inputs

Additional Features

Many other tweaks and features can be added as well. Each is accessible through 'tab' autocompletion starting with "do."--- or "add."---, and if additional inputs are involved in implementing or tweaking these, the associated inputs will start with the "---.":

Author(s)

Daniel Bunis and Jared Andrews

See Also

getGenes and getMetas to see what the var, split.by, etc. options are of an object.

getReductions to see what the reduction.use options are of an object.

importDittoBulk for how to create a SingleCellExperiment object from bulk seq data that dittoSeq functions can use & addDimReduction for how to specifically add calculated dimensionality reductions that dittoDimPlot can utilize.

dittoScatterPlot for showing very similar data representations, but where genes or metadata are wanted as the axes.

dittoDimHex and dittoScatterHex for showing very similar data representations, but where nearby cells are summarized together in hexagonal bins.

dittoPlot for an alternative continuous data display method where data broken into discrete groupings is shown on a y- (or x-) axis.

dittoBarPlot for an alternative discrete data display and quantification method.

Examples

example(importDittoBulk, echo = FALSE)
myRNA

# Display discrete data:
dittoDimPlot(myRNA, "clustering")
# Display continuous data:
dittoDimPlot(myRNA, "gene1")

# To show currently set clustering for Seurat objects, you can use "ident".
# To change the dimensional reduction type, use 'reduction.use'.
dittoDimPlot(myRNA, "clustering",
    reduction.use = "pca",
    dim.1 = 3,
    dim.2 = 4)

# Subset to certain cells with cells.use
dittoDimPlot(myRNA, "clustering",
    cells.use = !myRNA$SNP)

# Data can also be split in other ways with 'shape.by' or 'split.by'
dittoDimPlot(myRNA, "gene1",
    shape.by = "clustering",
    split.by = "SNP") # single split.by element
dittoDimPlot(myRNA, "gene1",
    split.by = c("groups","SNP")) # row and col split.by elements

# Modify the look with intuitive inputs
dittoDimPlot(myRNA, "clustering",
    size = 2, opacity = 0.7, show.axes.numbers = FALSE,
    ylab = NULL, xlab = "tSNE",
    main = "Plot Title",
    sub = "subtitle",
    legend.title = "clustering")

# MANY additional tweaks are possible.
# Also, many extra features are easy to add as well:
dittoDimPlot(myRNA, "clustering",
    do.label = TRUE, do.ellipse = TRUE)
dittoDimPlot(myRNA, "clustering",
    do.label = TRUE, labels.highlight = FALSE, labels.size = 8)
if (requireNamespace("plotly", quietly = TRUE)) {
    dittoDimPlot(myRNA, "gene1", do.hover = TRUE,
        hover.data = c("gene2", "clustering", "timepoint"))
}
dittoDimPlot(myRNA, "gene1", add.trajectory.lineages = list(c(1,2,4), c(1,3)),
    trajectory.cluster.meta = "clustering",
    sub = "Pseudotime Trajectories")

dittoDimPlot(myRNA, "gene1",
    do.contour = TRUE,
    contour.color = "lightblue", # Optional, black by default
    contour.linetype = "dashed") # Optional, solid by default

dittoSeq documentation built on April 17, 2021, 6:01 p.m.