In cbsteen/clustermap: clustermap: a Versatile Toolbox in R for Clustering and Heatmap Visualisation

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Clustermap is a collection of R functions for hierarchical clustering, determination of the number of clusters, and heatmap visualisation. The package produces high-quality annotated heatmaps, makes the user determine the visual design, and has a user-friendly interface. Even users with little R experience will be able to produce pretty figures at a fast pace, while more experienced users will appreciate the modular design that allows the package to be extended with new functionality.

Clustermap is designed with the philosophy that technical parameters and quantities calculated from the data should be hidden from sight whenever possible, while still being available to the advanced user. All technical parameters have default values that works well for most applications, but they can be changed easily when desired. Full access to such parameters and the modular design allows advanced users to integrate their own code with Clustermap.

Clustermap offers a variety of solutions and options through a simple, intuitive interface. Colors in heatmaps and annotation color bars are easily changed, requiring no prior knowledge of color handling in R. Heatmaps may be clustered row-wise, column-wise or both, and the user determines which of the cluster dendrograms should be plotted and where to plot them relative to the heatmap. The number of clusters may be found automatically with one of the two provided algorithms Gap or PART, or may be set manually by the user. Diagnostic plots such as silhouette plots and Gap plots are fully integrated in the package and can be produced in an instant to guide the manual choice of the number of clusters.

This manual will present some general features of the package clustermap and provide a more in-depth description of the functions implemented in the package. Finally, a number of examples of increasing sophistication and complexity will guide the user through many of the features of the package. These examples are designed to contain only the essential code and can be used as program patterns to be copied and pasted into one's own code.

1. General notes

Installation

To install clustermap from GitHub, you need the package devtools. Once you have installed devtools, you can install clustermap using the following code.

library(devtools)
install_github("cbsteen/clustermap")

Input format

Clustermap accepts any numerical matrix as input for the heatmap, and any vector of numerical values or character values for annotation bars. Missing values (NA's) are accepted in both cases and will be shown in a user-defined color (default is grey for both heatmap and annotation bars).

Output format

The main output from Clustermap is a plot depicting a clustered heatmap with or without annotation. Additional output may include diagnostic plots (e.g. to assess then number of clusters) as well as tables providing information about the cluster identify of individual rows and columns. By default, the output is shown in a graphics window on the screen; however, the output is easily re-directed to file by encapsulating the code with appropriate statements as shown below. Try help(Devices) in R to read more about graphics device options; note that the options available may depend on your computing platform and your version of R.

pdf("Figure.pdf") 
<code to plot a clustered heatmap>
dev.off()

The structure of a Clustermap plot

The main plot produced by Clustermap has the following general structure:

Core region: this is where the heatmap is shown
Inner margin: surrounds the heatmap on all sides and this is where the annotation bars are shown, providing additional information about rows or columns
Outer margin: surrounds the inner margin on all sides and this is where cluster trees, row/column labels and the legend for the annotation bars are shown

The size of the inner and outer margins is automatically set by Clustermap during the initialization, where the user declares what type of information the plot will contain. For example, if the user wants to show a heatmap with no additional information (not even cluster trees), then inner and outer margins will be set to (essentially) zero. If the user wants to show a heatmap with cluster trees on the left hand side of the heatmap, then the left outer margin will be set to 0.3 (meaning 30% of the width of the core region). Likewise, if the user wants to show a heatmap with annotation bars above the heatmap, then the top inner margin will be set to 0.1 (meaning 10% of the height of the core region). Margin sizes defined by Clustermap are not always appropriate for the problem at hand. For example, the user may want to show many annotation bars above the heatmap, in which case the default top inner margin could be too narrow. During the initialization of the plot, the user may add or subtract space from inner and outer margins using the arguments inner and outer. For example, the argument inner=c(0, 0, 0.3, 0) would extend the top inner margin by 30% of the height of the core region. To see what the current inner and outer margins are set to, you can look at the system variable .CLUSTERMAP$margin. Note that system variables are not defined before you have performed plot initialization.

The structure of the code

Clustermap plots are constructed by supplying a sequence of commands, each performing a particular computational task or adding a particular feature to the draw. This approach offers an alternative to "all-in-one" solutions performing most of the computations and the graphical rendering by a single call to a function with a huge number of arguments. The Clustermap approach allows the user to control (and if desired save results from) every step of the process, and individual elements can be left out or included as desired. In order to keep the code simple and intuitive, most of the parameters and variables that need to be transferred between function calls are stored in a global environment variable .CLUSTERMAP and are hidden to the user.

A certain ordering of the function calls must be respected in order to ensure that all required parameters are defined prior to a given function call. For example, a call to draw.tree() to plot a cluster dendrogram can only be done after a call to hcluster() to compute the clustering. There are multiple advantages of separating the clustering computation and the plotting of the cluster, some of which will be illustrated in the examples later.

As a general guideline, the functions in clustermap should be called in the following order:

1. draw.init                  # Always required
2. hcluster                   # If clustering is required
3. subclust                 # If subcluster identification is required
4. draw.hmap                  # If heatmap is required
5. draw.tree                  # If annotation is required
   draw.text
   set.color
   draw.cbar        

6. draw.cbar.key            # If a legend for the color bars is required
7. draw.hmap.key              # If a color map for the heatmap is required  
8. draw.silhouette        # Diagnostic plots
   draw.mean.silhouette
   get.silhouette
   draw.gap

The function draw.init erases any memory of past calls made to any of the functions in the package, so you get a fresh start. The function hcluster is used to cluster either columns or rows and you specify what type of clustering you would like (linkage, distance, etc). The function subclust may or may not be required in your analysis; what it does is to determine how many distinct clusters are actually present in the preceding clustering. And then comes a series of plot functions to show the heatmap, any color bars you may want to show on the left/right or bottom/top of the heatmap, the cluster trees, a color key (draw.hmap.key) that explains what values the heatmap colors represent, etc.

Dependencies

The packages graphics and gclus are utilized by clustermap. If these are not already installed on your computer, give the following command to R:

install.packages("graphics")
install.packages("gclus")

or use the Package Installer inside R, found under the menu item ‘Packages & Data’.

Colors

To change the colors used in the heatmap, provide your preferred color scale as an argument to draw.hmap. For example, the call

draw.hmap(X, colorscale="green-white-red")

plots a heatmap for the input data matrix X, using a color scale ranging from green (for negative values) via white (for value 0) to red (for positive values). The highest green intensity is achieved for values in X that are equal to all lower than -xmax, while the highest red intensity is achieved for values in X that are equal to or higher than xmax, where xmax is a positive constant. The default value of xmax is max(abs(X)). To overrule the default value, use the argument xmax in draw.hmap as in this example:

draw.hmap(X, colorscale="yellow-white-blue", xmax=1)

For the color specification, you may provide any three color names supported by R; here are a few examples:

blue-white-red
green-black-red
lightgrey-grey-darkgrey
magenta-cyan-pink

To obtain a textual overview of all colors available, give the command colors() or to see a visual overview over colors available, see http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf.

Reference

Clustermap: an extensible R package for clustering and generating annotated heatmaps. Ole Christian Lingjærde*, Chloé Beate Steen*, Miriam Ragle Aure, Vilde Drageset Haakensen (submitted)

* Corresponding authors, contributed equally to this work.

2. Functions in Clustermap

Summary of `clustermap` functions

`draw.init Call this first to set up correct margins etc

draw.hmap Plot heatmap using your preferred colors

draw.cbar Add color bar to one of the sides (to add extra info)

draw.tree Add column or row dendrogram to a heatmap

hcluster Perform hierarchical clustering of rows or columns

subclust Guided or automatic identification of subclusters

get.subclust Get subtype labels (for original row/col ordering of X)

set.color Set color coding for accompanying data

draw.hmap.key Plot color key for heatmap

draw.cbar.key Plot color key for color bar (annotation bars)

draw.text Add text to one of the sides (for row/column labels)

draw.silhouette Silhouette plot

draw.mean.silhouette Mean silhouette score plot

draw.gap Gap score plot

blocks Generate an example data set with 110 rows and 40 columns

draw.init()

draw.init is used to initiate a draw.

The purpose is to make room on the four sides of the heatmap for additional elements such as cluster dendrograms, color bars with additional info, and labels. Supply as arguments the sides where you want such elements, with the coding

side=1 below the heatmap side=2 left side of the heatmap side=3 above the heatmap side=4 right side of the heatmap

If you want a tree (=dendrogram) on the left and top, let tree=c(2,3). If you want text labels on the right, let text=4. If you want color bars below the plot, let cbar=1.

Examples:

draw.init()                # Dendrogram on the left and on the top

draw.init(tree=c())       # Make no room for additional elements

draw.init(tree=3, text=4)  # Dendrogram on the top, labels on the right

The last two arguments of draw.init (inner and outer) do not overrule the other parameters, but allows the user to add (or even subtract) a specific amount of space in the inner and outer margins on a specified side of the heatmap. They may be used alone, or in combination with other parameters. They should always be vectors of length 4 giving the desired margin extension for each of the four sides. An example of use would be to extend an inner margin by a certain amount because we plan to show many bar plots of accompanying data there that we want to give some space. Note that inner=c(0.4,0.4,0.4,0.4) means that each inner margin should be extended by a length corresponding to 0.4 times the total length (in that direction) of the heatmap itself.

subclust()

Use the function subclust to identify subclusters in a dendrogram. The effect is that later calls to draw.tree will show the subclusters in different colors. You can either provide the desired number of clusters using the argument k, or you can set k=NA to automatically estimate the number of cluster with one of two algorithms: gap (Tibshirani et al (2001), J Roy Statist Soc B, 63: 411-423) or PART (Nilsen et al (2012), Stat Appl Genet Mol Biol, 12: 637-652).

Using gap (Tibshirani et al., 2001), the result is an estimate of the optimal 'flat cut' of the dendrogram, while using PART (Nilsen et al., 2012), the result is an estimate of the optimal 'nonflat cut' of the dendrogram. In essence the PART algorithm is an extension of gap that applies gap recursively on subclusters previously identified. B and min.size are used by PART (only if k=NA) and determines the number of permutations (B=25 is low, but computation time increases with increasing B; consider B=200 or higher for high-quality estimates) and the least number of elements in a single cluster (min.size=5 is quite arbitrary and you may want to change this). Note that PART may call some individuals as outliers and these will be shown in a separate color but may not be close to each other and will stick out as "clusters" of size potentially smaller than min.size.

get.subclust()

subclust retrieves a vector of cluster labels after identification of subclusters. By default, labels are returned in the same order as shown in the cluster tree (order="tree"). To obtain labels in the original ordering in the input data X, use order="orig".

By default, the function returns a vector of cluster labels. If you supply row/column labels using the argument labels the function returns a data frame with two columns, the first being the cluster labels and the second being the supplied row/column labels.

draw.hmap()

draw.hmap() plota heatmaps of a numerical data matrix X. It is recommended that you print to a file (e.g. a pdf-file) instead to increase speed (and use fast=F). Use the argument colorscale to control how numerical values are represented as colors in the heatmap. The argument should be three (valid R) colors separated by hyphens. Some examples: "green-black-red", "yellow-black-blue", "black-white-black" (to ignore signs), and "white-grey-darkgrey".

draw.text()

draw.text() is used to add labels to the sides of the heatmap. The first argument should be a vector of the same length as the number of columns in the heatmap (when side=1 or side=3), and a vector of the same length as the number of rows in the heatmap (when side=2 or side=4). The size of the text can be changed with the cex argument. Text labels can be truncated to a certain maximal length (for example 4 characters) with the maxchar argument.

3. Examples using `clustermap`

Set up everything

The following code sets up a simple example data set that can be used to run the examples below. It creates a hypothetical gene expression matrix consisting of 110 genes as rows, and 40 patient samples as columns. The y1 vector contains information about cancer subtype of each sample. The y2 vector contains information about stage, and finally the y3 vector about immune score.

library(clustermap) # Make package functions available
tmp = blocks() # Contains the 110x40-matrix X and 40-vectors y1,y2,y3
X = tmp$X
y1 = tmp$y1 
y2 = tmp$y2
y3 = tmp$y3

How to create a heatmap

The following code creates a simple heatmap with no clustering, using the default color-scale ("blue-white-red") and adds the key for the color scale.

draw.init()
draw.hmap(X)
draw.hmap.key()

Heatmap and clustering of columns

The following example shows how to cluster columns in a heatmap using the hcluster() function, and how to add a tree (dendogram) resulting from the column-wise clustering on the top side of the heatmap (side=3).

draw.init(tree=3)
hcluster(X, clust="col", distance="pearson", linkage="complete")
draw.hmap(X) 
draw.tree(side=3)
draw.hmap.key()
frame()

Heatmap and clustering of rows

This example is the same as the previous one, except that this time the clustering is performed row-wise, and the tree is added on the left side of the heatmap (side=2).

draw.init(tree=2)
hcluster(X, clust="row", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.hmap.key()

Heatmap and clustering of rows and columns

Similar example as above, where the clustering is performed both row-wise and column-wise, and trees are added on the left side and on the top of the heatmap.

draw.init(tree=c(2,3))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()

Heatmap, clustering of rows and columns, and colorbar below heatmap

The following example adds a color bar at the bottom of heatmap, with a discrete color scales made up of 5 colors, using the y1 vector. The color bar is labeled "Subtype".

draw.init(tree=c(2,3), cbar=1, ckey=T, legend=T, text=4)
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
draw.cbar(z1, side=1)
draw.cbar.key()
draw.hmap.key()

The legend is large compared to the heatmap, so we can make it smaller by increasing the distance between the upper part of the heatmap and the upper part of the key, by setting the vsize argument to 0.3.

draw.init(tree=c(2,3), cbar=1, ckey=T, legend=T, text=4)
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
draw.cbar(z1, side=1)
draw.cbar.key(vusep=0.5)
draw.hmap.key()

How to add multiple color bars below a heatmap

The following example shows how to add multiple color bars below a heatmap, to provide information about cancer subtype (z1), stage (z2) and immune score (z3). This is done by providing multiple set.color objects, here called z1, z2 and z3 to the draw.cbar function.

draw.init(tree=c(2,3), cbar=1, legend=T, text=4)
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
z2 = set.color(y2, label="Stage", type="discrete", color=rainbow(8))
z3 = set.color(y3, label="Immune score", type="continuous", color="white-blue")
draw.cbar(z1, z2, z3, border="black", side=1)
draw.cbar.key()
draw.hmap.key()

Add row labels to a heatmap

The following code adds the rownames of the dataset X as row labels to the heatmap, using the function draw.text. The size the text is set using the argument cex, and the placement of the text using the argument side, where here 4 refers to the right side of the heatmap.

xsub = X[sample(40),]
draw.init(tree=c(2,3), cbar=3, legend=T, text=4)
hcluster(xsub, clust="row", distance="euclidean", linkage="complete")
hcluster(xsub, clust="col", distance="euclidean", linkage="complete")
draw.hmap(xsub)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
z2 = set.color(y2, label="Stage", type="discrete", color=palette())
z3 = set.color(y3, label="Immune score", type="continuous", color="black-yellow")
draw.cbar(z1, z2, z3, border="black", side=3)
draw.text(rownames(xsub), cex=0.7, side=4)
draw.cbar.key(hsep=0.15)
draw.hmap.key()

Add columns labels to a heatmap

Similarly to the example above, labels can be added to columns using the draw.text() function. The size the text is set using the argument cex, and the placement of the text using the argument side, where here 3 refers to the top of the heatmap.

pdf("example_1.pdf")
draw.init(tree=c(2,3), cbar=3, text=c(1,4))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
z2 = set.color(y2, label="Stage", type="discrete", color=palette())
z3 = set.color(y3, label="Immune score", type="continuous", color="black-yellow")
draw.cbar(z1, z2, z3, border="black", side=3)
draw.text(colnames(X), cex=0.7, side=1)
draw.hmap.key()
dev.off()

Heatmap, clustering of rows and columns, define subclusters

The following example illustrates how to cut a dendrogram into k=3 clusters and show in draw. The function subclust() is used to identify 3 clusters column-wise (clust="col"). To highlight the new tree with the colored clusters, the lwd argument in draw.tree is set to 2 to make the branches thicker.

draw.init(tree=c(2,3))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
subclust(3, clust="col")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3, lwd=2)
draw.hmap.key()

Heatmap, clustering of rows and columns, estimate subclusters

The example above used a set number of clusters (k=3). But the number of clusters can also be automatically estimated using for example the GAP algorithm, which is default in the subclust() function. The other option is to use the PART algorithm, not shown here. The estimated clusters can then be visualized. Both row-wise and column-wise subclustering is shown.

draw.init(tree=c(2,3))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
subclust(NA, clust="col")
subclust(NA, clust="row") 
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()

Heatmap, clustering of rows and columns, estimate subclusters

The previous example estimated the number of subclusters with the default algorithm gap. Here an example is shown where the number of subclusters is determined with the PART algorithm, set by the method argument of the subclust function. Both row-wise and column-wise subclustering is shown.

draw.init(tree=c(2,3))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
subclust(NA, clust="col", method="part")
subclust(NA, clust="row", method="part")
draw.hmap(X)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()

Heatmap, clustering of rows and columns, estimate subclusters, perform tests

After estimating subclusters, the user can test the association between additional info parameters and subclusters. This is done using the draw.cbar() function, which adds additional information to heatmap, and by specifying the pvalue.method, here "fisher", a p-value showing the association between the subclusters and the color bar added is displayed on the heatmap.

draw.init(tree=c(2,3), cbar=3, legend= T, text=4)
hcluster(X, clust="col", distance="euclidean", linkage="complete")
hcluster(X, clust="row", distance="euclidean", linkage="complete")
subclust(NA, clust="col", method="part")
subclust(NA, clust="row", method="part")
draw.hmap(X)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", "darkblue", "lightblue", "green"))
z2 = set.color(y2, label="Stage", type="discrete", color=palette())
z3 = set.color(y3, label="Immune score", type="continuous", color="black-yellow")
draw.cbar(z1, z2, z3, pvalue=T, pvalue.method="fisher", border="black", side=3)
draw.tree(side=2)
draw.tree(side=3)
draw.cbar.key()
draw.hmap.key()

Silhouette plot

Silhouette plots are used to assess the quality of the subclustering performed on the data. A silhouette plot displays how close each point in a cluster is to points in the closest neighboring clusters. Using the same clustering as in Example 12, we show the silhouette plot of the column-wise clustering. Three clusters seem reasonable in this case.

draw.init(cbar=3, text=4, legend=T)
hcluster(X, clust="col", distance="euclidean", linkage="complete")
hcluster(X, clust="row", distance="euclidean", linkage="complete")
subclust(NA, clust="col", method="part")
subclust(NA, clust="row", method="part")
draw.silhouette()

Gap statistics

The Gap statistics is a measure of clustering quality. Using the draw.gap() function, the user can plot the Gap statistics and identify the optimal number of clusters. The optimal number of clusters is highlighted in red.

pdf("Example14.pdf")
draw.init(cbar=3, text=4, legend=T)
hcluster(X, clust="col", distance="euclidean", linkage="complete")
hcluster(X, clust="row", distance="euclidean", linkage="complete")
subclust(NA, clust="col", method="part")
subclust(NA, clust="row", method="part")
draw.gap()
dev.off()

Clustered heatmap where the clustering is derived from another data set

It may be useful to apply the clustering estimated using one dataset, to another dataset of the same size. For instance, one may wish to follow patient samples over time, using the original subgrouping identified in the first samples. This is done by plotting a heatmap using another dataset with the draw.hmap() function, right after having performed the clustering on the original dataset with hcluster().

draw.init(tree=c(2,3))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
Xnew = tanh(X) # Xnew can be any numerical matrix of same size as X
draw.hmap(Xnew)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()

Show three clustered heatmaps on the same page

Multiple heatmaps can be displayed on the same page using the frame() function.

xmax = max(2*X, na.rm=T)
par(mfrow=c(2,2))
panel.init(mfrow=c(2,2))
draw.init(tree=c(2,3), ckey=F, outer=c(0.1, 0.1, 0.1, 0.1))
hcluster(X, clust="row", distance="euclidean", linkage="complete")
hcluster(X, clust="col", distance="euclidean", linkage="complete")
draw.hmap(X, xmax=xmax)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()
frame()
draw.init(tree=c(2,3), ckey=F, outer=c(0.1, 0.1, 0.1, 0.1))
hcluster(X*2, clust="row", distance="pearson", linkage="complete")
hcluster(X*2, clust="col", distance="pearson", linkage="complete")
draw.hmap(X*2, xmax=xmax)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()
frame()
draw.init(tree=c(2,3), ckey=F, outer=c(0.1, 0.1, 0.1, 0.1))
hcluster(X/2, clust="row", distance="euclidean", linkage="average")
hcluster(X/2, clust="col", distance="euclidean", linkage="average")
draw.hmap(X/2, xmax=xmax)
draw.tree(side=2)
draw.tree(side=3)
draw.hmap.key()
frame()

Cluster color image

Clustering is not restricted to gene expression data. Here, a more complex matrix from an image is clustered based on (R,G,B) values.

library(magick)
pict = bellagio()
panel.init(mfrow=c(1,3))
draw.init(ckey=F)
draw.hmap(pict$X)
frame()
frame()
draw.init(ckey=F)
draw.hmap(pict$Xrc)
frame()
frame()
draw.init(ckey=F)
Xcol = col2num(pict$Xrc, clust="col") # In each column replace color by (R,G,B)
Xrow = col2num(pict$Xrc, clust="row") # In each row replace color by (R,G,B)
hcluster(Xcol, clust="col", linkage="average") # Cluster columns using (R,G,B) values
hcluster(Xrow, clust="row", linkage="average") # Cluster rows using (R,G,B) values
draw.hmap(pict$Xrc, origin=1)
frame()
frame()

Figure in manuscript

xsub = tanh(X[sample(40),])
draw.init(tree=c(2,3), cbar=3, legend=T, text=4)
hcluster(xsub, clust="row", distance="euclidean", linkage="complete")
hcluster(xsub, clust="col", distance="euclidean", linkage="complete")
draw.hmap(xsub)
draw.tree(side=2)
draw.tree(side=3)
z1 = set.color(y1, label="Subtype", type="discrete", color=c("red", "pink", 
                                                             "darkblue", "lightblue",
                                                             "green"))
z2 = set.color(y2, label="Stage", type="discrete", color=palette())
z3 = set.color(y3, label="Immune score", type="continuous", color="black-yellow")
draw.cbar(z1, z2, z3, pvalue=T, pvalue.method="fisher", border="black", side=3)
draw.text(rownames(xsub), cex=0.7, side=4)
draw.cbar.key(hsep=0.15)
draw.hmap.key()

cbsteen/clustermap documentation built on May 24, 2019, 1:35 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

cbsteen/clustermap clustermap: a Versatile Toolbox in R for Clustering and Heatmap Visualisation

In cbsteen/clustermap: clustermap: a Versatile Toolbox in R for Clustering and Heatmap Visualisation

1. General notes

Installation

Input format

Output format

The structure of a Clustermap plot

The structure of the code

Dependencies

Colors

Reference

2. Functions in Clustermap

Summary of clustermap functions

draw.init()

subclust()

get.subclust()

draw.hmap()

draw.text()

3. Examples using clustermap

Set up everything

How to create a heatmap

Heatmap and clustering of columns

Heatmap and clustering of rows

Heatmap and clustering of rows and columns

Heatmap, clustering of rows and columns, and colorbar below heatmap

How to add multiple color bars below a heatmap

Add row labels to a heatmap

Add columns labels to a heatmap

Heatmap, clustering of rows and columns, define subclusters

Heatmap, clustering of rows and columns, estimate subclusters

Heatmap, clustering of rows and columns, estimate subclusters

Heatmap, clustering of rows and columns, estimate subclusters, perform tests

Silhouette plot

Gap statistics

Clustered heatmap where the clustering is derived from another data set

Show three clustered heatmaps on the same page

Cluster color image

Figure in manuscript

R Package Documentation

Browse R Packages

We want your feedback!

cbsteen/clustermap
clustermap: a Versatile Toolbox in R for Clustering and Heatmap Visualisation

Summary of `clustermap` functions

3. Examples using `clustermap`