plot_heatmap: Visualize a Distance or Similarity Matrix as a Heatmap with...

plot_heatmap    R Documentation

Visualize a Distance or Similarity Matrix as a Heatmap with Clustering

Description

This function creates a heatmap from a square distance matrix. To visualize a similarity matrix, convert it to a distance matrix before calling the function. The function supports hierarchical clustering, group annotations, subsampling of observations (random or stratified by group), and various customization options.
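As one possible way to prepare a similarity matrix, the sketch below turns a correlation-based similarity into a distance with the common transform d = 1 - s. The object names (sim_mat, dist_from_sim) are illustrative, and the choice of transform is an assumption rather than something the package prescribes; other conversions may suit your data better.

```r
# Similarity between the first 10 iris observations:
# correlation of their four measurements (values in [-1, 1], diagonal = 1)
x <- as.matrix(iris[1:10, 1:4])
sim_mat <- stats::cor(t(x))

# One common conversion (an assumption, not prescribed by dbrobust): d = 1 - s
dist_from_sim <- 1 - sim_mat
diag(dist_from_sim) <- 0   # remove floating-point noise on the diagonal

# The result is square and symmetric, the shape a distance matrix must have
stopifnot(nrow(dist_from_sim) == ncol(dist_from_sim),
          isSymmetric(dist_from_sim))
```

The resulting matrix could then be passed as dist_mat.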

Usage

plot_heatmap(
  dist_mat,
  max_n = 50,
  group = NULL,
  stratified_sampling = FALSE,
  main_title = NULL,
  palette = "YlOrRd",
  clustering_method = "complete",
  cluster_rows = TRUE,
  cluster_cols = TRUE,
  fontsize_row = 10,
  fontsize_col = 10,
  show_rownames = TRUE,
  show_colnames = TRUE,
  border_color = "grey60",
  annotation_legend = TRUE,
  seed = 123
)

Arguments

dist_mat

A square distance matrix (numeric matrix) or a dist object.

max_n

Integer. Maximum number of observations (rows/columns) to display. If the matrix exceeds this size, a subset of max_n observations is selected. Default is 50.

group

Optional vector or factor providing group labels for rows/columns, used for color annotation.

stratified_sampling

Logical. If TRUE and group is provided, sampling is stratified by group. Each group will contribute at least one observation if possible. Default is FALSE.

main_title

Optional character string specifying the main title of the heatmap.

palette

Character string specifying the RColorBrewer palette for heatmap cells. Default is "YlOrRd".
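To preview what a palette name stands for, the sketch below expands the nine "YlOrRd" colors into a smooth 100-step ramp. How plot_heatmap() expands the palette internally is not documented here, so the step count and the variable names (base_cols, heat_cols) are assumptions made for illustration.

```r
# Nine base colors of the "YlOrRd" palette (light yellow to dark red)
if (requireNamespace("RColorBrewer", quietly = TRUE)) {
  base_cols <- RColorBrewer::brewer.pal(n = 9, name = "YlOrRd")
} else {
  # Fallback when RColorBrewer is not installed: a similar base-R palette,
  # reversed so that it also runs from light to dark
  base_cols <- rev(grDevices::hcl.colors(9, palette = "YlOrRd"))
}

# Interpolate the base colors into a 100-step ramp (step count is arbitrary)
heat_cols <- grDevices::colorRampPalette(base_cols)(100)
```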

clustering_method

Character string specifying the hierarchical clustering method, as accepted by hclust (e.g., "complete", "average", "ward.D2"). Default is "complete".
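When choosing a method, it can help to compare candidates on the same distances before plotting. The sketch below is one way to do that with base R only; it uses the cophenetic correlation as a rough measure of fit, which is a choice made for this illustration and not part of plot_heatmap().

```r
# Distances between the first 20 iris observations
d <- stats::dist(iris[1:20, 1:4])

# Build one tree per candidate linkage method
methods <- c("complete", "average", "ward.D2")
trees <- lapply(methods, function(m) stats::hclust(d, method = m))
names(trees) <- methods

# Cophenetic correlation: how closely each tree reproduces the input distances
coph <- sapply(trees, function(tr) stats::cor(d, stats::cophenetic(tr)))
print(round(coph, 3))
```

A higher value means the dendrogram distorts the original distances less, although it is only one of several criteria.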

cluster_rows

Logical, whether to perform hierarchical clustering on rows. Default is TRUE.

cluster_cols

Logical, whether to perform hierarchical clustering on columns. Default is TRUE.

fontsize_row

Integer specifying the font size of row labels. Default is 10.

fontsize_col

Integer specifying the font size of column labels. Default is 10.

show_rownames

Logical, whether to display row names. Default is TRUE.

show_colnames

Logical, whether to display column names. Default is TRUE.

border_color

Color of the cell borders in the heatmap. Default is "grey60".

annotation_legend

Logical, whether to display the legend for group annotations. Default is TRUE.

seed

Integer. Random seed used when sampling rows/columns, which happens when max_n is smaller than the total number of observations. Default is 123.

Details

The function works as follows:

  • Converts dist objects to matrices automatically.

  • Samples rows/columns if the matrix is larger than max_n. Sampling can be random or stratified by group.

  • In stratified sampling mode, each group contributes at least one observation if possible.

  • Supports row annotations for groups and automatically assigns colors.

  • Uses pheatmap for plotting with customizable clustering, labels, fonts, and colors.

This function is used internally by visualize_distances() but can be called directly for advanced usage.
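The conversion and sampling steps listed above can be sketched in base R as follows. The helper stratified_pick() is hypothetical and is not the package's internal code; in particular, how the real function distributes the remaining slots across groups is an assumption here.

```r
# Hypothetical helper, NOT the package's implementation: pick at most max_n
# indices, keeping at least one observation per group when max_n allows it.
stratified_pick <- function(group, max_n, seed = 123) {
  set.seed(seed)  # note: this resets the global RNG state
  n <- length(group)
  if (n <= max_n) return(seq_len(n))  # nothing to sample

  idx_by_group <- split(seq_len(n), group, drop = TRUE)

  # Step 1: one random observation from each group
  first <- vapply(idx_by_group,
                  function(i) i[sample.int(length(i), 1)],
                  integer(1))
  # More groups than slots: not every group can be represented
  if (length(first) >= max_n) return(sort(unname(first[seq_len(max_n)])))

  # Step 2: fill the remaining slots at random from the unchosen observations
  rest <- setdiff(seq_len(n), first)
  extra <- rest[sample.int(length(rest), max_n - length(first))]
  sort(unname(c(first, extra)))
}

d_mat <- as.matrix(stats::dist(iris[, 1:4]))  # dist object -> square matrix
keep  <- stratified_pick(iris$Species, max_n = 10)
d_sub <- d_mat[keep, keep]                    # same subset for rows and columns
```

Applying the same index vector to rows and columns keeps the subset square and symmetric, so it is still a valid distance matrix.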

Value

Invisibly returns the pheatmap object. Assign the result to a variable to customize it further.

See Also

hclust for hierarchical clustering methods. pheatmap for additional heatmap customization options. brewer.pal for available color palettes.

Examples

# Example: Euclidean distance heatmap on iris
eucli_dist <- stats::dist(iris[, 1:4])
dbrobust::plot_heatmap(
  dist_mat = eucli_dist,
  max_n = 10,
  group = iris$Species,
  stratified_sampling = TRUE,
  main_title = "Euclidean Distance Heatmap",
  palette = "YlOrRd",
  clustering_method = "complete"
)

# Example: GGower distances on a small subset, with outliers annotated
data("Data_HC_contamination", package = "dbrobust")
Data_small <- Data_HC_contamination[1:50, ]
cont_vars <- c("V1", "V2", "V3", "V4")
cat_vars  <- c("V5", "V6", "V7")
bin_vars  <- c("V8", "V9")
w <- Data_small$w_loop
dist_sq_ggower <- dbrobust::robust_distances(
  data = Data_small,
  cont_vars = cont_vars,
  bin_vars  = bin_vars,
  cat_vars  = cat_vars,
  w = w,
  alpha = 0.10,
  method = "ggower"
)
group_vec <- rep("Normal", nrow(dist_sq_ggower))
group_vec[attr(dist_sq_ggower, "outlier_idx")] <- "Outlier"
group_factor <- factor(group_vec, levels = c("Normal", "Outlier"))
dbrobust::plot_heatmap(
  dist_mat = sqrt(dist_sq_ggower),
  max_n = 20,
  group = group_factor,
  main_title = "GGower Heatmap with Outliers",
  palette = "YlOrRd",
  clustering_method = "complete",
  annotation_legend = TRUE,
  stratified_sampling = TRUE,
  seed = 123
)


dbrobust documentation built on Nov. 5, 2025, 6:24 p.m.