knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(pineplot)
library(grid)
library(gridExtra)
library(magick)

The pineplot package can be used to generate pine plots, which are stacked triangular heat maps that are generated using ggplot2 and grid. Visualizing symmetric matrices in this way has the following benefits:

Example using gene expression data

As an example, we use RNA-seq data from Merkin and Brawand.

data(merkin_brawand_rnaseq)

The data contain gene counts for three species (mouse, chicken, and macaque), with several samples per species from each of three tissues (liver, kidney, and brain). We focus on the counts of 15 genes that have been reported in the literature as tissue-specific for liver or kidney.

colnames(merkin_brawand_rnaseq) # samples
rownames(merkin_brawand_rnaseq) # genes

First, we split the raw table into 9 matrices. There are 3 matrices for each species, each containing the gene expression data of either liver, kidney, or brain tissue.
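The split relies on the column names following a `species-tissue` pattern, which is what the code later in this vignette assumes. A minimal sketch on a toy count table; the data, the gene names, and the helper `split_by_label` are made up for illustration and are not part of pineplot:

```r
# Toy count table: 3 genes x 6 samples. Column names repeat for replicate
# samples, mimicking the "species-tissue" labels assumed in this vignette.
set.seed(1)
toy <- matrix(rpois(18, lambda = 20), nrow = 3,
              dimnames = list(paste0("gene", 1:3),
                              c("mouse-liver", "mouse-liver", "mouse-kidney",
                                "chicken-liver", "chicken-liver",
                                "chicken-kidney")))

# Hypothetical helper: return one sub-matrix per unique column label.
split_by_label <- function(counts) {
  labels <- unique(colnames(counts))
  out <- lapply(labels, function(lab) {
    # drop = FALSE keeps a matrix even when a label has a single sample
    counts[, colnames(counts) == lab, drop = FALSE]
  })
  names(out) <- labels
  out
}

parts <- split_by_label(toy)
names(parts)
sapply(parts, ncol)
```

The toy table has four labels, so it yields four sub-matrices. With three species and three tissues, the same approach would give the nine matrices described above.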

Each of these matrices can be transformed into a symmetric matrix by measuring the pairwise relationship between genes. We use mutual expression, which the pineplot package provides as an example function for relating variables of interest. Other measures, such as correlation, may be more appropriate depending on the type and amount of data being visualized. The code below calls symmetric_matrix_generator for every species and tissue, then divides all values by the global maximum so that the largest value is 1.
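As a sketch of the correlation alternative mentioned above, base R's `cor()` can produce a gene-by-gene symmetric matrix. This is not how `symmetric_matrix_generator` works; the toy data and the choice of absolute Spearman correlation are assumptions made for illustration only.

```r
# Toy expression matrix: 4 genes (rows) x 8 samples (columns).
set.seed(42)
expr <- matrix(rnorm(32), nrow = 4,
               dimnames = list(paste0("gene", 1:4), NULL))

# cor() correlates columns, so transpose to compare genes with each other.
gene_cor <- cor(t(expr), method = "spearman")

# Take absolute values so every entry lies in [0, 1].
gene_sim <- abs(gene_cor)

isSymmetric(gene_sim)
dim(gene_sim)
```

Whether absolute values are appropriate depends on the question being asked: they discard the sign of the relationship, which may or may not matter for a given data set.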

ms <- list()
for (tissue in c('brain', 'kidney', 'liver')){
  ms[[tissue]] <- list()
  # create one symmetric matrix per species for this tissue
  for (species in c('chicken', 'mouse', 'macaque')){
    label <- paste0(species, "-", tissue)
    # keep only the samples (columns) for this species-tissue pair
    m <- merkin_brawand_rnaseq[, colnames(merkin_brawand_rnaseq) == label]
    ms[[tissue]][[label]] <- symmetric_matrix_generator(m, 1)
  }
}
# scale all matrices by the global maximum so the largest value is 1
max_expression <- max(unlist(ms))
ms <- rapply(ms, function(x){x / max_expression}, how='replace')
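The scaling step uses one maximum across all tissues and species, so the heat maps share a colour scale and can be compared directly. A small sketch of the same `rapply` pattern on a toy nested list (the matrices and the names `toy_ms`, `global_max`, and `scaled` are invented for illustration):

```r
# Toy nested list shaped like `ms`: tissue -> label -> symmetric matrix.
toy_ms <- list(
  liver  = list("mouse-liver"   = matrix(c(4, 2, 2, 8), nrow = 2),
                "chicken-liver" = matrix(c(1, 3, 3, 5), nrow = 2)),
  kidney = list("mouse-kidney"  = matrix(c(10, 6, 6, 2), nrow = 2))
)

# One maximum over every matrix in the list.
global_max <- max(unlist(toy_ms))

# how = "replace" keeps the nested list structure and rescales each matrix.
scaled <- rapply(toy_ms, function(x) x / global_max, how = "replace")

range(unlist(scaled))
```

Scaling each matrix by its own maximum instead would stretch every heat map to the full colour range and hide differences in magnitude between tissues.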

Customization

Because each heat map is a ggplot2 object, it can easily be annotated or restyled. A user may rely on the standard parameters set within the draw_heatmap function, or modify the objects after they have been generated by appending ggplot2 layers with "+". Here, the callback custom_layer.2 adds an annotation layer to each heat map: a dashed outline around the tissue-specific genes, colour coded as liver (blue) or kidney (black). A second callback, custom_layer, places an image of the species in the axis margins. Building the heat map this way gives complete control over the position and style of any image, text, or legend. Any ggplot2 option can be changed directly, rather than relying on the parameters offered by typical heat map functions.
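To illustrate the "+" mechanism in isolation, the sketch below builds a small tile heat map with plain ggplot2 and then adds a dashed outline to the finished object. It does not use pineplot; the toy matrix and the names `long`, `p`, and `outline` are assumptions for illustration.

```r
library(ggplot2)

# Toy 3 x 3 symmetric matrix reshaped to long format for geom_tile().
m <- matrix(c(1, .4, .2,
              .4, 1, .7,
              .2, .7, 1), nrow = 3)
long <- data.frame(x = as.vector(col(m)),
                   y = as.vector(row(m)),
                   value = as.vector(m))

p <- ggplot(long, aes(x = x, y = y, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "yellow", high = "red",
                       midpoint = 0.5, limits = c(0, 1))

# Add a dashed outline around the tiles at x, y in {2, 3} after the fact.
outline <- data.frame(x = c(1.5, 3.5, 3.5, 1.5),
                      y = c(1.5, 1.5, 3.5, 3.5))
p <- p + geom_polygon(data = outline, aes(x = x, y = y),
                      inherit.aes = FALSE,
                      colour = "royalblue1", linetype = "dashed", fill = NA)
```

Tiles are centred on integer coordinates and extend 0.5 in each direction, which is why the outline corners sit at half-integer positions.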

library(ggplot2)

custom_layer.2 <- function(ggplt.object, arg){
  # arg is the matrix label, e.g. "mouse-liver"; keep the tissue part
  tissue <- sub('.*-', '', arg)
  # stays NULL for tissues without an outline (brain)
  area.of.interest <- NULL
  if(tissue == "kidney"){
    area.of.interest <- data.frame(x = c(10.50, 11.50, 15.5, 15.50, 10.5, 10.50 ),
                                   y=c(10.50, 10.50, 14.50, 15.55, 15.55, 10.5))
    area.border.col <- "black"
  } else if(tissue == "liver"){
    area.of.interest <- data.frame(x=c(0.50, 1.50, 10.50,  10.50, 0.50, 0.50 ),
                                   y=c(0.55, 0.55, 9.50, 10.50, 10.55, 0.55))
    area.border.col <- "royalblue1"
  }

  # checking the local variable avoids picking up a global of the same name,
  # which exists() would also find
  if (!is.null(area.of.interest)){
    ggplt.object <- ggplt.object + geom_polygon(data=area.of.interest, 
                                                aes(x=x, y=y),
                                                colour=area.border.col,
                                                size=.8,
                                                linetype="dashed",
                                                fill=NA)
  }
  return(ggplt.object)
}
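Both callbacks in this vignette recover the species or tissue from the label they receive. The sketch below shows the two `sub()` patterns on a sample label, assuming (as the code here does) that labels have the form `species-tissue` with a single hyphen:

```r
label <- "macaque-kidney"

# "-.*" matches the first hyphen and everything after it.
species <- sub("-.*", "", label)

# ".*-" is greedy, so it matches everything up to the last hyphen.
tissue <- sub(".*-", "", label)

species  # "macaque"
tissue   # "kidney"
```

If a species or tissue name itself contained a hyphen, the two patterns would no longer split the label at the same position, so a different separator would be safer in that case.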

Annotation

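custom_layer <- function(heatmap, arg){
  # arg is the matrix label, e.g. "mouse-liver"; keep the species part
  species <- sub('-.*', '', arg)
  # load the species image and set its background colour to "none"
  img <- image_read(paste0('./images/', species, '.png'))
  img <- image_background(img, "none")
  # place the grob's top-left corner at (0, 0.8) in npc units
  img <- rasterGrob(img, x=.0, y=.8, width=.2, just=c('left', 'top'), interpolate=TRUE)
  img
}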
pineplots <- list()
for (i in 1:3){
  tissue <- c('brain', 'kidney', 'liver')[i]
  pp <- generate_pineplot(
    ms[[tissue]],
    customize_fn = custom_layer,
    low = "blue",
    mid = "yellow",
    high = "red",
    midpoint = .5,
    limits=c(0, 1),
    annotation_fn = custom_layer.2,
    legend.scale = 1.2
  )
  pineplots[[i]] <- pp
}

The grid.arrange function provided by gridExtra is used to lay the pine plots out side-by-side for easy comparison.

library(gridExtra)

grid.arrange(pineplots[[1]], pineplots[[2]], pineplots[[3]], nrow=1)


klovens/pineplot documentation built on Nov. 4, 2019, 3:53 p.m.