An introduction to the gapmap package

This document explains the basic functions of the gapmap package for drawing a gapped cluster heatmap. The plots are generated with the ggplot2 package. Let's load the library first.

library(gapmap)

We will simulate a simple dataset.

set.seed(1234)
x <- rnorm(10, mean=rep(1:5, each=2), sd=0.4)
y <- rnorm(10, mean=rep(c(1,2), each=5), sd=0.4)
dataFrame <- data.frame(x=x, y=y, row.names=c(1:10))
# Calculate the distance matrix; dist() defaults to Euclidean distance.
distxy <- dist(dataFrame)
# Perform hierarchical clustering; hclust() defaults to complete linkage.
hc <- hclust(distxy)
dend <- as.dendrogram(hc)

To make a gapped cluster heatmap, pass a matrix object for the heatmap and dendrogram class objects for drawing the dendrograms and ordering the rows and columns.

grey_scale <- c("#333333", "#5C5C5C", "#757575", "#8A8A8A", "#9B9B9B", "#AAAAAA",
                "#B8B8B8", "#C5C5C5", "#D0D0D0", "#DBDBDB", "#E6E6E6")
gapmap(m = as.matrix(distxy), d_row = rev(dend), d_col = dend, col = grey_scale)
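
The call above passes rev(dend) for the rows and dend for the columns. As a minimal sketch using only base R (the simulated data is rebuilt so that the snippet runs on its own), the following checks two things that are easy to get wrong when preparing these arguments: that the dendrogram leaves correspond to the rows of the matrix, and that rev() on a dendrogram simply reverses the leaf order. The variable names col_order and row_order are introduced here for illustration only.

```r
# Rebuild the simulated data so that this snippet is self-contained.
set.seed(1234)
x <- rnorm(10, mean = rep(1:5, each = 2), sd = 0.4)
y <- rnorm(10, mean = rep(c(1, 2), each = 5), sd = 0.4)
dataFrame <- data.frame(x = x, y = y, row.names = 1:10)
distxy <- dist(dataFrame)
dend <- as.dendrogram(hclust(distxy))

m <- as.matrix(distxy)
col_order <- labels(dend)       # leaf labels, left to right
row_order <- labels(rev(dend))  # the same leaves in reversed order

# Every leaf should correspond to a row (and column) of the matrix.
stopifnot(setequal(col_order, rownames(m)))
# rev() on a dendrogram reverses the order of the leaves.
stopifnot(identical(row_order, rev(col_order)))
print(col_order)
```

If either check fails for your own data, the matrix and the dendrograms were probably built from different inputs.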

By default, the gapmap function runs in quantitative mode and uses exponential mapping. You can choose either of two modes: quantitative or threshold.

The following example uses linear mapping. This mapping generates more gaps, whereas the exponential mapping in the previous example emphasizes the large gaps.

gapmap(m = as.matrix(distxy), d_row = rev(dend), d_col = dend, mode = "quantitative", mapping = "linear", col = grey_scale)

The following example illustrates the difference between the two mapping schemes. For the exponential mapping, the scale log base is set to 0.5.

library(ggplot2)  # attach ggplot2 explicitly so that ggplot() is available
distances <- seq(0, 5, 0.1)
s <- 0.5
# Map each distance in [0, 5] onto a gap size in [0, 1] with both schemes.
linear_gap <- sapply(distances, function(d) map(d, 0, 5, 0, 1))
exp_gap <- sapply(distances, function(d) map.exp(d, 0, 5, 0, 1, scale = s))
gaps <- rbind(
  data.frame(distance = distances, gap = linear_gap, type = "linear"),
  data.frame(distance = distances, gap = exp_gap, type = "exponential")
)
ggplot(gaps, aes(x = gap, y = distance, group = type)) +
  geom_line(aes(color = type)) +
  theme_bw() +
  theme(legend.position = c(0.9, 0.1))

The following plot shows how the exponential mapping varies with the scale log base setting. Each curve is annotated with its scale value.

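scales <- seq(0.1, 3, 0.3)
distances <- seq(0, 5, 0.1)
# One curve per scale value: map every distance onto a gap size in [0, 1].
D <- data.frame()
for (s in scales) {
  curve_data <- data.frame(distance = distances)
  curve_data$gap <- sapply(distances, function(d) map.exp(d, 0, 5, 0, 1, scale = s))
  curve_data$scale <- s
  D <- rbind(D, curve_data)
}
# Place the label for each curve at a fixed gap value on the x-axis.
a <- 0  # lower bound of the distance range
b <- 5  # upper bound of the distance range
c <- 0  # lower bound of the gap range
d <- 1  # upper bound of the gap range
y <- 0.4  # gap value (x-axis position) of every label
labels <- data.frame()
for (j in seq_along(scales)) {
  x <- scales[j]
  # Distance (y-axis position) at which the curve for scale x reaches gap y.
  v <- a + ((y / (d - c))^x) * (b - a)
  labels <- rbind(labels, data.frame(scale = x, distance = v, gap = y))
}
ggplot() +
  geom_line(data = D, aes(x = gap, y = distance, group = scale), color = "#56B1F7") +
  scale_y_continuous(limits = c(0, 5)) +
  geom_text(data = labels, aes(x = gap, y = distance, label = scale), hjust = -0.2, vjust = 0) +
  geom_point(data = labels, aes(x = gap, y = distance)) +
  theme_bw() + theme(legend.position = "none")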
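
The label positions in the code above are obtained by solving gap = ((distance - a) / (b - a))^(1/scale) for distance, with the gap range running from 0 to 1. Assuming that map.exp follows this power form (an inference from the label formula, not a statement about the package source), the two mappings can be sketched in base R as shown below. The names lin_map_sketch and exp_map_sketch are hypothetical and are not part of gapmap.

```r
# Hypothetical re-implementations for illustration; not the gapmap source.

# Linear mapping of value from [start1, stop1] onto [start2, stop2].
lin_map_sketch <- function(value, start1, stop1, start2, stop2) {
  f <- (value - start1) / (stop1 - start1)
  start2 + f * (stop2 - start2)
}

# Assumed exponential mapping: the normalised value is raised to 1/scale
# before being stretched onto [start2, stop2].
exp_map_sketch <- function(value, start1, stop1, start2, stop2, scale = 0.5) {
  f <- (value - start1) / (stop1 - start1)
  start2 + f^(1 / scale) * (stop2 - start2)
}

distances <- c(0, 1, 2.5, 4, 5)
lin_gap <- lin_map_sketch(distances, 0, 5, 0, 1)               # 0, 0.2, 0.5, 0.8, 1
exp_gap <- exp_map_sketch(distances, 0, 5, 0, 1, scale = 0.5)  # 0, 0.04, 0.25, 0.64, 1
print(data.frame(distance = distances, linear = lin_gap, exponential = exp_gap))
```

Under this assumption and with scale = 0.5, a distance of 1 maps to a gap of 0.04 where the linear mapping gives 0.2, so small distances produce almost no gap and the large distances dominate.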

Besides the quantitative mode, there is a threshold mode that introduces gaps based on a threshold. In the following example, the dendrograms for rows and columns are cut at a threshold distance of 2, and gaps of the same size are introduced between the resulting clusters.

gapmap(m = as.matrix(distxy), d_row = rev(dend), d_col = dend, mode = "threshold", row_threshold = 2, col_threshold = 2, col = grey_scale)
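
To see which clusters a threshold of 2 separates, the same tree can be cut with cutree() from base R. This is a minimal sketch that rebuilds the simulated data so it runs on its own. Whether gapmap computes its clusters in exactly this way internally is an assumption; the snippet only illustrates what cutting the tree at a height of 2 means.

```r
# Rebuild the simulated data and the clustering from the beginning.
set.seed(1234)
x <- rnorm(10, mean = rep(1:5, each = 2), sd = 0.4)
y <- rnorm(10, mean = rep(c(1, 2), each = 5), sd = 0.4)
dataFrame <- data.frame(x = x, y = y, row.names = 1:10)
hc <- hclust(dist(dataFrame))

# Cut the tree at height 2, the threshold used in the gapmap() call above.
clusters <- cutree(hc, h = 2)
print(table(clusters))  # number of observations in each cluster

# Each merge above the threshold splits the tree once more, so the number
# of clusters is the number of such merges plus one.
n_clusters <- sum(hc$height > 2) + 1
print(n_clusters)
```

Sorting hc$height shows the candidate thresholds: any value between two consecutive merge heights yields the same set of clusters.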

In addition, this package works well with our dendrogram sorting package, dendsort. For details on dendsort, please see our paper.

library(dendsort)
gapmap(m = as.matrix(distxy), d_row = rev(dendsort(dend)), d_col = dendsort(dend), mode = "quantitative", col = grey_scale)


gapmap documentation built on April 19, 2021, 5:06 p.m.