| plotgg | R Documentation |
This function works on the output of affinity and uses
ggplot2::ggplot() to generate a heatmap for numeric columns of the
$all dataframe, excluding interval columns (median interval and
confidence intervals) and the confidence level (which is constant across
pairs in a single run).
plotgg(
data,
variable,
legendlimit,
col = NULL,
show.value = NULL,
value.digit = NULL,
text.size = NULL,
text.col = NULL,
plot.margin = NULL,
drop.empty = TRUE,
sig.only = FALSE,
...
)
data |
Output list returned by |
variable |
Name of a numeric column in |
legendlimit |
Either |
col |
Color specification for the fill scale. For |
show.value |
Logical; if |
value.digit |
Number of digits used when printing values; default is 2. |
text.size |
Size of printed values; default is 2.5. |
text.col |
Color of printed values on tiles (used when values are shown). |
plot.margin |
Plot margin passed to |
drop.empty |
Logical; if |
sig.only |
Logical or numeric. If |
... |
Additional arguments (currently unused). |
This function is a wrapper around ggplot2 with carefully chosen
defaults to generate an interpretable heatmap of pairwise associations.
The plot shows the lower triangle of an N × N matrix (diagonal
excluded), where both rows and columns represent the same set of entities.
The upper triangle is omitted because it is a mirror image of the lower
triangle.
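As a rough illustration of this layout, the base-R sketch below lists the cells that fall in the lower triangle of a square matrix. It is not taken from the package source; the entity names and the placeholder matrix are invented for the example.

```r
# Hypothetical entity names, for illustration only
entities <- c("A", "B", "C", "D")
n <- length(entities)

# Square N x N matrix of placeholder values
m <- matrix(seq_len(n * n), nrow = n, dimnames = list(entities, entities))

# lower.tri() excludes the diagonal by default,
# which leaves N * (N - 1) / 2 cells to draw as tiles
keep <- lower.tri(m)
pairs <- data.frame(
  row = rownames(m)[row(m)[keep]],
  col = colnames(m)[col(m)[keep]]
)

nrow(pairs)  # number of tiles for 4 entities
```

With four entities this gives six tiles, and no entity is ever paired with itself.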
By default (drop.empty = TRUE), entities whose values are entirely
NA for the selected variable are removed from both axes. This
avoids plotting empty rows and columns when an entity has no usable values
(e.g., due to degenerate distributions or missing data). Set
drop.empty = FALSE to retain all entities and reproduce the full grid,
including empty rows or columns.
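The rule can be pictured with a small base-R sketch. This is an assumption about how such filtering might look, not the package's implementation; the table, its values, and the helper logic are hypothetical.

```r
# Hypothetical long-format table of pairs; entity "C" has no usable values
pairs <- data.frame(
  entity_1  = c("A", "A", "A", "B", "B", "C"),
  entity_2  = c("B", "C", "D", "C", "D", "D"),
  alpha_mle = c(0.8, NA, -1.2, NA, 0.3, NA),
  stringsAsFactors = FALSE
)

entities <- unique(c(pairs$entity_1, pairs$entity_2))

# An entity counts as "empty" when every pair it takes part in is NA
is_empty <- vapply(entities, function(e) {
  involved <- pairs$entity_1 == e | pairs$entity_2 == e
  all(is.na(pairs$alpha_mle[involved]))
}, logical(1))

# Keep only entities with at least one usable value, on both axes
kept <- entities[!is_empty]
pairs_kept <- pairs[pairs$entity_1 %in% kept & pairs$entity_2 %in% kept, ]
```

Here "C" is dropped from both axes, so only the A-B, A-D and B-D tiles remain.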
If sig.only is enabled, values of the selected variable are
masked to NA wherever p_value exceeds the specified cutoff, so
only statistically significant tiles are shown. Use sig.only = TRUE
to apply the default cutoff (0.05), or supply a numeric cutoff (e.g.,
sig.only = 0.01). Requires a p_value column in data$all.
When variable = "p_value", p-values above the cutoff are masked to
NA.
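A minimal base-R sketch of this masking rule follows. The helpers resolve_cutoff() and mask_values() are hypothetical names introduced for illustration, and the numbers are made up; the package may implement the step differently.

```r
# Hypothetical values standing in for two columns of data$all
tab <- data.frame(
  alpha_mle = c(1.5, -0.4, 2.1, 0.2),
  p_value   = c(0.003, 0.40, 0.03, 0.051)
)

# Turn sig.only into a numeric cutoff: TRUE means 0.05, a number is used as is
resolve_cutoff <- function(sig.only) {
  if (isTRUE(sig.only)) return(0.05)
  if (is.numeric(sig.only)) return(sig.only)
  NA_real_  # FALSE: no masking requested
}

# Set values to NA wherever the p-value exceeds the cutoff
mask_values <- function(values, p, sig.only) {
  cutoff <- resolve_cutoff(sig.only)
  if (is.na(cutoff)) return(values)
  values[p > cutoff] <- NA
  values
}

mask_values(tab$alpha_mle, tab$p_value, TRUE)   # default cutoff 0.05
mask_values(tab$alpha_mle, tab$p_value, 0.01)   # stricter cutoff
```

With the default cutoff the first and third values survive; with 0.01 only the first does.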
Legend titles are mapped to human-readable labels (some shown on two lines),
rather than using raw column names from data$all.
The plot can be requested using column names from the $all dataframe
returned by affinity. Additional ggplot2 layers or theme
modifications can be added by appending them with +, as in standard
ggplot2 usage.
The legendlimit argument controls how the color scale is defined.
For alpha_mle, the default midpoint is 0 (null expectation), and the
color scale can be either data-driven ("datarange") or symmetrically
balanced around zero ("balanced"), using the maximum absolute value
observed. For indices bounded in [0,1] (p_value, jaccard,
sorensen, simpson), the balanced scale uses fixed limits
[0,1]. For p_value, the color mapping is reversed so smaller
p-values appear more intense. For count-based variables, no natural midpoint exists; the
color scale spans the observed range. For obs_cooccur_X and
exp_cooccur, a shared color scale is applied so the two plots are
visually comparable.
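The limit rules above can be summarized in a short base-R sketch. legend_limits() is a hypothetical helper written only for this illustration; it does not model the shared scale used for obs_cooccur_X and exp_cooccur, and it is not the package's code.

```r
# Hypothetical helper: color-scale limits for a given variable and legendlimit
legend_limits <- function(values, variable,
                          legendlimit = c("datarange", "balanced")) {
  legendlimit <- match.arg(legendlimit)
  rng <- range(values, na.rm = TRUE)
  bounded <- c("p_value", "jaccard", "sorensen", "simpson")

  # "datarange": limits follow the observed values
  if (legendlimit == "datarange") return(rng)

  # "balanced" for alpha_mle: symmetric around zero
  if (variable == "alpha_mle") {
    m <- max(abs(rng))
    return(c(-m, m))
  }

  # "balanced" for indices bounded in [0, 1]: fixed limits
  if (variable %in% bounded) return(c(0, 1))

  # Other variables (e.g. counts): fall back to the observed range.
  # The shared scale for obs_cooccur_X / exp_cooccur is not modeled here.
  rng
}

legend_limits(c(-0.5, 2, NA), "alpha_mle", "datarange")
legend_limits(c(-0.5, 2, NA), "alpha_mle", "balanced")
legend_limits(c(0.1, 0.6), "jaccard", "balanced")
```

For alpha_mle values between -0.5 and 2, the balanced limits become -2 and 2, so zero sits at the midpoint of the color scale.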
When show.value = TRUE, numeric values are printed on each tile using
ggplot2::geom_text(). If show.value = NULL (default), values are
printed automatically when the number of plotted entities is at most 20.
Rounding and text appearance are controlled by value.digit,
text.size, and text.col.
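The decision described above can be sketched in base R as follows. resolve_show_value() is a hypothetical helper named for this illustration, and the use of round() for the labels is an assumption about how value.digit is applied.

```r
# Hypothetical helper: decide whether values are printed on the tiles
resolve_show_value <- function(show.value = NULL, n_entities) {
  if (is.null(show.value)) {
    n_entities <= 20      # automatic rule when show.value is left as NULL
  } else {
    isTRUE(show.value)    # an explicit TRUE or FALSE always wins
  }
}

resolve_show_value(NULL, 13)   # small plot: values printed
resolve_show_value(NULL, 22)   # large plot: values suppressed
resolve_show_value(TRUE, 22)   # forced on despite the size

# Rounding with the default value.digit = 2
round(c(1.23456, -0.5), 2)
```

An explicit show.value therefore overrides the 20-entity threshold in either direction.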
A heatmap plot generated with ggplot2.
Kumar Mainali
affinity
data(finches)
head(finches)
library(ggplot2)
# the remainder of the script is enclosed in \donttest{}
# to stay within CRAN's 5-second limit on example run time
# --------------------------------------------------------------
# plotting various variables
# ---------------------------------------------
# compute alpha and other quantities for island-pair affinity (beta diversity)
# the square matrices are not used for plotting
myout <- affinity(data = finches, row.or.col = "col")
# myout
plotgg(data = myout, variable = "alpha_mle", legendlimit = "datarange")
# in the example above, the null expectation of alpha_mle (= 0) is white,
# negative values shade toward "#87beff" and positive values toward "#fd6a6c".
# the color spectrum is NOT fitted to the range of the data;
# it spans the same extent on both sides of zero,
# from -max(abs(valrange)) to max(abs(valrange)).
# the legend, however, can show either the extent of the data ("datarange")
# or the entire span over which the color is applied ("balanced").
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced")
# notice that the two plots above are identical: the color scale is the same
# and only the range shown in the legend differs.
plotgg(data = myout, variable = "sorensen", legendlimit = "balanced")
plotgg(data = myout, variable = "jaccard", legendlimit = "balanced")
# in the case of observed and expected cooccurrences, one color scale is applied for both plots
# so that the shades of color across plots can be visually compared
plotgg(data = myout, variable = "exp_cooccur", legendlimit = "datarange")
plotgg(data = myout, variable = "exp_cooccur", legendlimit = "balanced")
plotgg(data = myout, variable = "obs_cooccur_X", legendlimit = "balanced")
plotgg(data = myout, variable = "entity_1_count_mA", legendlimit = "datarange")
plotgg(data = myout, variable = "entity_2_count_mB", legendlimit = "datarange")
plotgg(data = myout, variable = "total_N", legendlimit = "datarange")
# for "entity_1_count_mA", "entity_2_count_mB" and "total_N",
# legendlimit = "balanced" is changed to "datarange"
plotgg(data = myout, variable = "entity_2_count_mB", legendlimit = "balanced")
# plot only statistically significant tiles (based on p_value)
# -----------------------------------------------------------
# sig.only = TRUE masks non-significant tiles (p_value > 0.05) to NA
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced", sig.only = TRUE)
# you can also supply a stricter p-value cutoff (e.g., 0.01)
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced", sig.only = 0.01)
# change color of the plot and text
# ---------------------------------------------
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced")
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced",
col = c('#99cc33', 'black', '#ff9933'), text.col = "white")
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced",
col = c('#99cc33', '#ff9933'), text.col = "white")
plotgg(data = myout, variable = "obs_cooccur_X", legendlimit = "balanced")
plotgg(data = myout, variable = "obs_cooccur_X", legendlimit = "balanced",
col = c('black', 'red'), text.col = "white")
# change the characteristics of text printed in the plot
# ------------------------------------------------------
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced")
# change the number of digits; the default is 2
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced", value.digit = 3)
# make the fonts bigger; the default is 2.5
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced", text.size = 3.5)
# hide values from the plot
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced", show.value = FALSE)
# increase or decrease margin
# ---------------------------------------------
myout <- affinity(data = finches, row.or.col = "row")
# myout
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced")
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced",
plot.margin = ggplot2::margin(1,1,5,2, "cm"))
# change angle of x-axis tick label; the default is 35 degrees
# ------------------------------------------------------------
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced")
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced") +
ggplot2::theme(axis.text.x = element_text(angle = 45))
# to change to 90 degrees, adjust vjust
# bad ->
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced") +
ggplot2::theme(axis.text.x = element_text(angle = 90))
# good ->
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced") +
ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
# additional elements in the plot
# ----------------------------------
# because the output is a ggplot object, you can append ggplot2 layers, themes and labels with +
# add plot title and change legend title
plotgg(data = myout, variable = "alpha_mle", legendlimit = "balanced") +
ggplot2::theme(axis.text.x = element_text(angle = 90, vjust = 0.5)) +
ggplot2::ggtitle("Affinity of island pairs measured with Alpha MLE") +
ggplot2::labs(fill = 'My Personal\nTitle')
# show/hide entities that are entirely empty (all-NA tiles)
# --------------------------------------------------------
# Here we create an artificial "empty" entity by setting one column to NA.
# This guarantees that all pairwise comparisons involving that entity have no usable data,
# so the corresponding tiles become NA for variables such as alpha_mle.
finches2 <- as.matrix(finches)
storage.mode(finches2) <- "numeric"
finches2[, 3] <- NA_real_ # make the third entity entirely missing (any column would do)
myout2 <- affinity(data = finches2, row.or.col = "col")
# Default behavior: drop.empty = TRUE (empty entity removed from the axes)
plotgg(data = myout2, variable = "alpha_mle", legendlimit = "balanced")
# Keep empty entities (legacy/full grid): shows the empty row/column
plotgg(data = myout2, variable = "alpha_mle", legendlimit = "balanced", drop.empty = FALSE)
# keep empty entities even after masking (shows rows/columns with all-NA tiles)
plotgg(data = myout2, variable = "alpha_mle", legendlimit = "balanced",
sig.only = TRUE, drop.empty = FALSE)
# automatic suppression of numeric values on tiles
# -------------------------------------------------
# By default, numeric values are printed on tiles only when the number of
# plotted entities is reasonably small (<= 20). This avoids severe visual
# clutter when the heatmap becomes large.
finches_big <- finches
# duplicate columns to artificially inflate the number of entities
finches_big <- cbind(finches_big, finches_big[, 1:5])
colnames(finches_big)[(ncol(finches) + 1):ncol(finches_big)] <-
paste0(colnames(finches)[1:5], "_dup")
myout_big <- affinity(data = finches_big, row.or.col = "col")
# Numeric values are NOT printed because the number of entities exceeds 20
plotgg(data = myout_big, variable = "alpha_mle", legendlimit = "balanced")
# To force printing numeric values despite the large number of entities:
plotgg(data = myout_big, variable = "alpha_mle", legendlimit = "balanced", show.value = TRUE)
#end of \donttest{}