scatterPlot | R Documentation |
Show RNAseq data overlaid on a scatter plot
scatterPlot(
data_frame,
x.by,
y.by,
color.by = NULL,
shape.by = NULL,
split.by = NULL,
size = 1,
rows.use = NULL,
show.others = TRUE,
x.adjustment = NULL,
y.adjustment = NULL,
color.adjustment = NULL,
x.adj.fxn = NULL,
y.adj.fxn = NULL,
color.adj.fxn = NULL,
split.show.all.others = TRUE,
opacity = 1,
color.panel = dittoColors(),
colors = seq_along(color.panel),
split.nrow = NULL,
split.ncol = NULL,
split.adjust = list(),
multivar.split.dir = c("col", "row"),
shape.panel = c(16, 15, 17, 23, 25, 8),
rename.color.groups = NULL,
rename.shape.groups = NULL,
min.color = "#F0E442",
max.color = "#0072B2",
min.value = NA,
max.value = NA,
plot.order = c("unordered", "increasing", "decreasing", "randomize"),
xlab = x.by,
ylab = y.by,
main = "make",
sub = NULL,
theme = theme_bw(),
do.hover = FALSE,
hover.data = unique(c(color.by, paste0(color.by, ".color.adj"), "color.multi",
"color.which", x.by, paste0(x.by, ".x.adj"), y.by, paste0(y.by, ".y.adj"), shape.by,
split.by)),
hover.round.digits = 5,
do.contour = FALSE,
contour.color = "black",
contour.linetype = 1,
add.trajectory.by.groups = NULL,
add.trajectory.curves = NULL,
trajectory.group.by,
trajectory.arrow.size = 0.15,
add.xline = NULL,
xline.linetype = "dashed",
xline.color = "black",
add.yline = NULL,
yline.linetype = "dashed",
yline.color = "black",
do.letter = FALSE,
do.ellipse = FALSE,
do.label = FALSE,
labels.size = 5,
labels.highlight = TRUE,
labels.repel = TRUE,
labels.repel.adjust = list(),
labels.split.by = split.by,
legend.show = TRUE,
legend.color.title = "make",
legend.color.size = 5,
legend.color.breaks = waiver(),
legend.color.breaks.labels = waiver(),
legend.shape.title = shape.by,
legend.shape.size = 5,
show.grid.lines = TRUE,
do.raster = FALSE,
raster.dpi = 300,
data.out = FALSE
)
data_frame |
A data_frame where columns are features and rows are observations you might wish to visualize. |
x.by , y.by |
Single strings denoting the name of a column of |
color.by |
Single string denoting the name of a column of |
shape.by |
Single string denoting the name of a column of |
split.by |
1 or 2 strings denoting the name(s) of column(s) of When 2 columns are named, c(row,col), the first is used as rows and the second is used for columns of the resulting facet grid. When 1 column is named, shape control can be achieved with |
size |
Number which sets the size of data points. Default = 1. |
rows.use |
String vector of rownames of Alternatively, a Logical vector, the same length as the number of rows in |
show.others |
Logical. TRUE by default, whether rows not targeted by |
x.adjustment , y.adjustment , color.adjustment |
A recognized string indicating whether numeric
Ignored if the target data is not numeric as these known adjustments target numeric data only. In order to leave the unedited data available for use in other features, the adjusted data are put in a new column and that new column is used for plotting. |
x.adj.fxn , y.adj.fxn , color.adj.fxn |
If you wish to apply a function to edit the For example, In order to leave the unedited data available for use in other features, the adjusted data are put in a new column and that new column is used for plotting. |
split.show.all.others |
Logical which sets whether gray "others" points of facets should include all points of other facets ( |
opacity |
Number between 0 and 1. 1 = opaque. 0 = invisible. Default = 1. (In terms of typical ggplot variables, = alpha) |
color.panel |
String vector which sets the colors to draw from when A named vector can be used if names are matched to the distinct values of the |
colors |
Integer vector, the indexes / order, of colors from Useful for quickly swapping around colors of the default set (when not using names for color matching). |
split.nrow , split.ncol |
Integers which set the dimensions of faceting/splitting when faceting by a single feature. |
split.adjust |
A named list which allows extra parameters to be pushed through to the faceting function call. List elements should be valid inputs to the faceting functions, e.g. 'list(scales = "free")'. For options, when giving 1 column to |
multivar.split.dir |
"row" or "col", sets the direction of faceting used for 'var' values when:
|
shape.panel |
Vector of integers, corresponding to ggplot shapes, which sets what shapes to use in conjunction with |
rename.color.groups |
String vector which sets new names for the identities of |
rename.shape.groups |
String vector which sets new names for the identities of |
min.color |
color for |
max.color |
color for |
min.value , max.value |
Number which sets the |
plot.order |
String. If the data should be plotted based on the order of the color data, sets whether to plot in "increasing", "decreasing", or "randomize"d order. |
xlab , ylab |
Strings which set the labels for the axes. To remove, set to |
main |
String, sets the plot title.
A default title is automatically generated based on |
sub |
String, sets the plot subtitle. |
theme |
A ggplot theme which will be applied before internal adjustments.
Default = |
do.hover |
Logical which controls whether the ggplot output will be converted to a plotly object so that data about individual points can be displayed when you hover your cursor over them.
The |
hover.data |
String vector which denotes what data to show for each data point, upon hover, when |
hover.round.digits |
Integer number specifying the number of decimal digits to round displayed numeric values to, when |
do.contour |
Logical. Whether density-based contours should be displayed. |
contour.color |
String that sets the color of the |
contour.linetype |
String or numeric which sets the type of line used for |
add.trajectory.by.groups |
List of vectors representing trajectory paths, each from start-group to end-group, where vector contents are the group-names indicated by the |
add.trajectory.curves |
List of matrices, each representing coordinates for a trajectory path, from start to end, where matrix columns represent x and y coordinates of the paths. |
trajectory.group.by |
String denoting the name of a column of |
trajectory.arrow.size |
Number representing the size of trajectory arrows, in inches. Default = 0.15. |
add.xline |
numeric value(s) where one or multiple vertical line(s) should be added. |
xline.linetype |
String which sets the type of line for |
xline.color |
String that sets the color(s) of the |
add.yline |
numeric value(s) where one or multiple horizontal line(s) should be added. |
yline.linetype |
String which sets the type of line for |
yline.color |
String that sets the color(s) of the |
do.letter |
Logical which sets whether letters should be added on top of the colored dots.
For extended colorblindness compatibility.
NOTE: |
do.ellipse |
Logical. Whether |
do.label |
Logical. Whether to add text labels near the center (median) of |
labels.size |
Number which sets the size of labels text when |
labels.highlight |
Logical. Whether labels should have a box behind them when |
labels.repel |
Logical, that sets whether the labels' placements will be adjusted with ggrepel to avoid intersections between labels and plot bounds when |
labels.repel.adjust |
A named list which allows extra parameters to be pushed through to ggrepel function calls.
List elements should be valid inputs to the |
labels.split.by |
String of one or two column names which controls the facet-split calculations for label placements.
Defaults to |
legend.show |
Logical. Whether any legend should be displayed. Default = |
legend.color.title , legend.shape.title |
Strings which set the title for the color or shape legends. |
legend.color.size , legend.shape.size |
Numbers representing the size of shapes in the color and shape legends (for discrete variable plotting). Default = 5. *Enlarging the icons in the colors legend is incredibly helpful for making colors more distinguishable by color blind individuals. |
legend.color.breaks |
Numeric vector which sets the discrete values to label in the color-scale legend for |
legend.color.breaks.labels |
String vector, with same length as |
show.grid.lines |
Logical which sets whether grid lines should be shown within the plot space. |
do.raster |
Logical. When set to |
raster.dpi |
Number indicating dots/pixels per inch (dpi) to use for rasterization. Default = 300. |
data.out |
Logical. When set to |
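As an illustration of the x.adj.fxn, y.adj.fxn, and color.adj.fxn inputs described above, here is a minimal sketch of a custom adjustment function. The data frame, its column names, and the helper name log2p1 are hypothetical and used only for illustration. The sketch assumes the function receives a numeric vector and should return a vector of the same length. The dittoViz call is guarded so the block still runs where the package is not installed.

```r
# Hypothetical count-like data; column names are illustrative only.
df <- data.frame(
    count_a = c(0, 3, 7, 15, 31, 63),
    count_b = c(1, 2, 4, 8, 16, 32),
    row.names = paste0("obs", 1:6))

# A custom adjustment (hypothetical helper): numeric vector in,
# numeric vector of the same length out.
log2p1 <- function(x) log2(x + 1)

# The same transformation applied by hand, kept separate from the
# original column, mirroring the "new column" behavior described above.
adjusted <- log2p1(df$count_a)

# Guarded so the sketch runs even when dittoViz is not installed.
if (requireNamespace("dittoViz", quietly = TRUE)) {
    p <- dittoViz::scatterPlot(
        df, x.by = "count_a", y.by = "count_b",
        x.adj.fxn = log2p1,
        y.adj.fxn = log2p1)
}
```

Because the adjusted values are stored separately, the original count_a and count_b columns stay available for other features such as hover.data.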
This function first makes any requested adjustments to data in the given data_frame, internally only, such as scaling the color.by-column if color.adjustment was given "z-score".
Next, if a set of rows to target was indicated with the rows.use input, then the data_frame is split into Target_data and Others_data.
Then, rows are reordered to match with the requested plot.order behavior.
Finally, a scatter plot is created from the resultant data.frames.
Non-target data points are colored in gray if show.others=TRUE, and target data points are displayed on top, colored and shaped based on the color.by- and shape.by-associated data.
If split.by was used, the plot will be split into a matrix of panels based on the associated groupings.
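The data-preparation steps described above can be sketched in base R. This is not the package's internal code, only an approximation under stated assumptions: that "z-score" means centering and scaling to unit standard deviation, that the adjusted values go in a new column (the name value.adj is hypothetical), and that "increasing" order means higher values are drawn last.

```r
# Hypothetical data; column names are illustrative only.
df <- data.frame(
    x = 1:10,
    y = c(2, 4, 1, 5, 3, 8, 6, 9, 7, 10),
    value = c(5, 12, 7, 30, 18, 2, 25, 9, 14, 21),
    group = rep(c("A", "B"), each = 5),
    row.names = paste0("obs", 1:10))

# Step 1: adjust into a NEW column so the original 'value' stays usable.
# Assumption: a z-score adjustment centers and scales the data.
df$value.adj <- as.numeric(scale(df$value))

# Step 2: split rows with a rows.use-style logical selector.
rows_use <- df$group == "A"
Target_data <- df[rows_use, , drop = FALSE]
Others_data <- df[!rows_use, , drop = FALSE]

# Step 3: reorder target rows. With increasing order, rows holding the
# highest values come last, so they would be drawn on top.
Target_data <- Target_data[order(Target_data$value.adj), , drop = FALSE]
```

Setting data.out = TRUE in the real function returns the actual Target_data and Others_data, which is the reliable way to check what was plotted.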
A ggplot scatterplot where colored dots and/or shapes represent individual rows of the given data_frame.
Alternatively, if data.out=TRUE, a list containing four slots is output:
the plot (named 'plot'),
a data.frame containing the underlying data for target rows (named 'Target_data'),
a data.frame containing the underlying data for non-target rows (named 'Others_data'),
and a list providing mappings of final column names in 'Target_data' to given plot aesthetics (named 'cols_used') because modification of newly made columns is required for many features.
Alternatively, if do.hover is set to TRUE, the plot is converted from ggplot to plotly and additional information about each data point, determined by the hover.data input, is displayed upon hovering the cursor over the plot.
size and opacity can be used to adjust the size and transparency of the data points. size can be given a number, or a column name of data_frame.
Colors used can be adjusted with color.panel and/or colors for discrete data, or min.value, max.value, min.color, and max.color for continuous data.
Shapes used can be adjusted with shape.panel.
Color and shape labels can be changed using rename.color.groups and rename.shape.groups.
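To illustrate how color.panel and colors interact for discrete data, here is a minimal sketch. The three-color palette, the data frame, and the column name grp are hypothetical. The sketch assumes that colors acts as a vector of indexes into color.panel, as its argument description suggests. The dittoViz call is guarded so the block runs where the package is not installed.

```r
# Hypothetical three-color palette standing in for 'color.panel'.
panel <- c("#E69F00", "#56B4E9", "#009E73")

# 'colors' holds indexes into the panel; this swaps the first and third.
# Assumption: the colors actually used are panel[colors], in that order.
colors <- c(3, 2, 1)
used <- panel[colors]

# Hypothetical discrete data to color by.
df <- data.frame(
    x = 1:6,
    y = c(2, 4, 1, 5, 3, 6),
    grp = rep(c("A", "B", "C"), times = 2))

# Guarded so the sketch runs even when dittoViz is not installed.
if (requireNamespace("dittoViz", quietly = TRUE)) {
    p <- dittoViz::scatterPlot(
        df, x.by = "x", y.by = "y", color.by = "grp",
        color.panel = panel,
        colors = colors,
        size = 3)
}
```

Reordering through colors leaves the palette itself untouched, which is convenient when the same color.panel is shared across several plots.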
Titles and axes labels can be adjusted with main, sub, xlab, ylab, legend.color.title, and legend.shape.title arguments.
Legends can also be adjusted in other ways, using variables that all start with "legend." for easy tab completion lookup.
Daniel Bunis
scatterHex
for a hex-binned version that can be useful when points are very dense.
example("dittoExampleData", echo = FALSE)
# The minimal inputs for scatterPlot are the 'data_frame', and 2 column names,
# given to 'x.by' and 'y.by', indicating which data to use for the x and y
# axes, respectively.
scatterPlot(
example_df, x.by = "PC1", y.by = "PC2")
# 'color.by' and/or 'shape.by' can also be given column names in order to
# represent that column's data in the color or shape of the data points.
# 'shape.by' must be pointed to discrete data, but 'color.by' can be given
# discrete or numeric data.
scatterPlot(
example_df, x.by = "PC1", y.by = "PC2",
color.by = "groups",
shape.by = "SNP",
size = 3)
scatterPlot(
example_df, x.by = "PC1", y.by = "PC2",
color.by = "gene1",
size = 3)
# Data can be "split" or faceted by a discrete variable as well.
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "gene1",
split.by = "timepoint") # single split.by element
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "gene1",
split.by = c("groups","SNP")) # row and col split.by elements
# Modify the look with intuitive inputs
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
size = 5,
opacity = 0.3,
show.grid.lines = FALSE,
ylab = NULL, xlab = "PC2 by PC1",
main = "Plot Title",
sub = "subtitle",
legend.color.title = "Legend\nRetitle")
# You can restrict to only certain data points using the 'rows.use' input.
# The input can be given rownames, indexes, or a logical vector
# All "other" points will now only be shown as a gray background, or will not
# be shown at all if you also add 'show.others = FALSE'
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
sub = "show only first 40 observations, by index",
rows.use = 1:40)
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
sub = "show only 3 observations, by name",
rows.use = c("obs1", "obs2", "obs25"))
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
sub = "show groups A,B,D only, by logical, without others as background",
rows.use = example_df$groups!="C",
show.others = FALSE)
# Many extra features are easy to add as well:
# Each is started via an input starting with 'do.FEATURE*' or 'add.FEATURE*'
# And when tweaks for that feature are possible, those inputs will be
# named starting with 'FEATURE*'. For example, color.by groups can be labeled
# with 'do.label = TRUE' and the tweaks for this feature are given with inputs
# 'labels.size', 'labels.highlight', and 'labels.repel':
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
sub = "default labeling",
do.label = TRUE) # Turns on the labeling feature
scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
sub = "tweaked labeling",
do.label = TRUE, # Turns on the labeling feature
labels.size = 8, # Adjust the text size of labels
labels.highlight = FALSE, # Removes white background behind labels
labels.repel = FALSE) # Turns off anti-overlap location adjustments
# Faceting can also be used to show multiple continuous variables side-by-side
# by giving a vector of column names to 'color.by'.
# This can also be combined with 1 'split.by' variable, with direction then
# controlled via 'multivar.split.dir':
scatterPlot(example_df, x.by = "PC1", y.by = "PC2",
color.by = c("gene1", "gene2"))
scatterPlot(example_df, x.by = "PC1", y.by = "PC2",
color.by = c("gene1", "gene2"),
split.by = "groups")
scatterPlot(example_df, x.by = "PC1", y.by = "PC2",
color.by = c("gene1", "gene2"),
split.by = "groups",
multivar.split.dir = "row")
# Sometimes, it can be useful for external editing or troubleshooting purposes
# to see the underlying data that was directly used for plotting.
# 'data.out = TRUE' can be provided in order to obtain not just the plot ("plot"),
# but also the "Target_data" and "Others_data" data.frames and "cols_used"
# returned as a list.
out <- scatterPlot(example_df, x.by = "PC1", y.by = "PC2", color.by = "groups",
rows.use = 1:40,
data.out = TRUE)
out$plot
summary(out$Target_data)
summary(out$Others_data)
out$cols_used
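The examples above do not cover reference lines, so here is a minimal sketch of the add.xline and add.yline inputs and their linetype and color tweaks. The data frame and the cutoff values are hypothetical. The sketch assumes add.xline places vertical lines at the given x positions and add.yline places horizontal lines at the given y positions. The dittoViz call is guarded so the block runs where the package is not installed.

```r
# Hypothetical data; column names are illustrative only.
df <- data.frame(
    PC1 = c(-3, -2, -1, 0, 1, 2, 3, 4),
    PC2 = c(1, -1, 2, 0, -2, 3, 1, -3),
    row.names = paste0("obs", 1:8))

# Hypothetical cutoffs: one vertical line at the PC1 median,
# and two horizontal lines bracketing PC2 values near zero.
x_cut <- median(df$PC1)
y_cuts <- c(-1.5, 1.5)

# Guarded so the sketch runs even when dittoViz is not installed.
if (requireNamespace("dittoViz", quietly = TRUE)) {
    p <- dittoViz::scatterPlot(
        df, x.by = "PC1", y.by = "PC2",
        add.xline = x_cut,           # vertical line at x = x_cut
        xline.linetype = "dotted",
        add.yline = y_cuts,          # horizontal lines at each y cutoff
        yline.color = "gray40")
}
```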