knitr::opts_chunk$set(echo = TRUE)

Setup

set.seed(1)

pkgs <- c(
  "ComplexHeatmap",
  "magrittr",
  "tidyr",
  "tibble",
  "testthat",
  "mockery",
  "functionalheatmap"
)

for (pkg in pkgs) {
  suppressPackageStartupMessages(
    library(pkg, character.only = TRUE)
  )
}

Introduction

This document is mostly a recollection aid written while working on a package called functionalheatmap.

While writing that package, some tests were required that were somewhat out of my comfort zone. They involved redefining an R function at test-time using the mockery package. I'll explain why it was necessary to do this after a brief intro to my unfinished (and, some might say, superfluous) package.

Basic use of functionalheatmap

I've been writing a wrapper around the ComplexHeatmap package that allows the production of heatmaps using incremental steps and requires less housekeeping than does the, otherwise superb, ComplexHeatmap package.

Here's some example data for use in a heatmap:

body_data <- matrix(
  rnorm(20),
  nrow = 5, dimnames = list(letters[1:5], LETTERS[1:4])
)

row_data <- data.frame(
  foo = c(FALSE, TRUE, FALSE, FALSE, TRUE),
  bar = 1:5,
  row.names = letters[1:5]
)
as.data.frame(body_data)
row_data

Given the above dataset, to produce a heatmap in ComplexHeatmap that had a set of row-annotations (my usual use-case) you'd do something like the following.

my_heatmap <- ComplexHeatmap::Heatmap(
  body_data,
  cluster_columns = FALSE
)

my_annots <- ComplexHeatmap::HeatmapAnnotation(
  row_data,
  which = "row", show_annotation_name = TRUE
)
my_heatmap + my_annots

A given row of your annotations has to match the corresponding row of the heatmap's body.

The package functionalheatmap provides some wrapper functions to ensure that the rows of the body and annotations match-up. This is helpful if you want to make multiple related heatmaps based on different subsets of the main body-data matrix or subsets of the annotations.

To plot a sub-heatmap for just those features where foo is false in ComplexHeatmap you would do the following:

# housekeeping variable
keep_rows <- which(!row_data$foo)

new_heatmap <- ComplexHeatmap::Heatmap(
  body_data[keep_rows, ],
  cluster_columns = FALSE
)

new_annots <- ComplexHeatmap::HeatmapAnnotation(
  row_data[keep_rows, ],
  which = "row", show_annotation_name = TRUE
)

new_heatmap + new_annots

In functionalheatmap the whole dataset is passed into a pipeline. You can filter the contents of either the heatmap body or the row-data and it will match everything up fine.

The input data looks slightly different than the matrix that is passed to ComplexHeatmap. We make tidy versions of the input data so that tidyverse verbs (filter / select etc) can be applied to them.

tidy_body_df <- body_data %>%
  tibble::as_tibble(rownames = "feature_id") %>%
  tidyr::gather(key = "sample_id", value = "fitted_value", -feature_id)

tidy_row_df <- tibble::rownames_to_column(row_data, "feature_id")
# we've pulled the matrix indices into a data-frame
head(tidy_body_df)

To plot the original heatmap:

list(
  body_data = tidy_body_df,
  row_data = tidy_row_df
) %>%
  setup_heatmap() %>%
  format_heatmap(
    cluster_columns = FALSE
  ) %>%
  annotate_heatmap(
    row_annotations = c("foo", "bar"),
    row_dots = list(show_annotation_name = TRUE)
  ) %>%
  plot_heatmap()

... and to plot the not-foos, do this:

list(
  body_data = tidy_body_df,
  row_data = dplyr::filter(tidy_row_df, !foo)
) %>%
  setup_heatmap() %>%
  format_heatmap(
    cluster_columns = FALSE
  ) %>%
  annotate_heatmap(
    row_annotations = c("foo", "bar"),
    row_dots = list(show_annotation_name = TRUE)
  ) %>%
  plot_heatmap()

Or indeed, write a pipeline to return the heatmap data-structure:

heatmap_pipeline <- function(x) {
  setup_heatmap(x) %>%
    format_heatmap(
      cluster_columns = FALSE
    ) %>%
    annotate_heatmap(
      row_annotations = c("foo", "bar"),
      row_dots = list(show_annotation_name = TRUE)
    )
}

... and then pump your easily filtered dataset into it

list(
  body_data = tidy_body_df,
  row_data = dplyr::filter(tidy_row_df, !foo)
) %>%
  heatmap_pipeline() %>%
  plot_heatmap()

Using polyply you can get away with having just a single heatmap dataset. I won't go into how that works, but the above code would look like:

# -- not ran -- #

# globally-useful heatmap dataset:
library(polyply)
heat_frames <- poly_frame(
  body_data = tidy_body_df,
  row_data = tidy_row_df
)

heat_frames %>%
  # filter based on `foo` in `row_data`
  activate(row_data) %>% filter(!foo) %>%
  # run the functionalheatmap pipeline and plot
  heatmap_pipeline() %>%
  plot_heatmap()

In short, functionalheatmap allows me to abstract away the functions involved in making and plotting a heatmap using ComplexHeatmap. This has helped deduplicate and de-temp-variable-enate some of my analysis scripts considerably.

Use of mockery::mock() in functionalheatmap tests

An aim of functionalheatmap was to be able to incrementally change the formatting of a heatmap[^1]. For example, if you had already passed some data through the heatmap_pipeline described above, but on calling plot_heatmap you decided the title needed changing, or the body-colours were all wrong you should be able to do this:

# -- not ran -- #
heatmap1 <- blep %>% blep %>% blep %>% heatmap_pipeline()

plot_heatmap(heatmap1)

# YUCK!

heatmap2 <- format_heatmap(heatmap1, na_col = "purple", some_other_formats ...)

plot_heatmap(heatmap2)

# LESS YUCK!

... and the formatting you provide should add to or supplant the existing formatting arguments.

All these formatting flags are stored in the data-structure that is passed around by the functions in functionalheatmap. We still use the Heatmap plotting code from ComplexHeatmap - we only pass Heatmap() the formatting flags in the final plot_heatmap step.

So you can set the formatting flags in one function and they exert their effect when used in another function.

At one point while writing this package I was unsure how to go about testing that formatting-flags that were set in format_heatmap were doing what they should. I don't even know how to unit (or otherwise) test the output of a plotting function. So all I wanted to know was that when a formatting flag is set in the format_heatmap call, that flag is set in the subsequent call to Heatmap() made by plot_heatmap.

So I found out about the package mockery.

The idea is pretty cool. I'd heard of mocks and stubs in passing but had never used them before. Basically you create a 'mock-object' that behaves in some way like another object (which we'll name the 'model' cos I was a genetics student once) and inject that mock-object into your code. Suppose your doing this within a unit-test, then at that point where the test-code should use the model-object you actually call the mock-object.

In R, functions are objects, so we can create mock functions. All those filthy side-effecting functions that touch databases or plot figures or modify stuff out-of-environment or write to files: they could be replaced with mock functions in your test code.

In the use-case described above:

Evidently, the 'filthy side-effecting' function here is Heatmap(). So we will make a mock Heatmap function to help test this use-case.

Using mockery to mock a function

We use mock to define a mock function:

mock_fn <- mock(2)

As set-up, mock_fn can be used once (since there's only one argument and we haven't played with it's cycle argument (not discussed)) and when it is called it will return the value 2.

Then we use testthat::with_mock to both

Here we want to replace the model-function Heatmap and we want to check what happens when Heatmap() is called within plot_heatmap. The call to with_mock should look like this:

# -- not ran -- #
testthat::with_mock(
  # replace the model-function `Heatmap` with the mock-function `mock_fn`
  Heatmap = mock_fn, {
    # define an expression wherein the model-function would normally be called
    plot_heatmap(some_input_data)
  },
  # you want to do the replacement in the ComplexHeatmap namespace
  .env = "ComplexHeatmap"
)

Let's make some input data so we can check everything works as expected:

my_heatmap_data <- list(body_data = tidy_body_df) %>%
  setup_heatmap() %>%
  format_heatmap(na_col = "purple")

If everything works, the na_col parameter in Heatmap() should be set to purple.

testthat::with_mock(
  Heatmap = mock_fn, {
    plot_heatmap(my_heatmap_data)
  },
  .env = "ComplexHeatmap"
)

The value 2 is returned because Heatmap() is the last thing evaluated in plot_heatmap(), and our mock-function was set-up to return 2.

So what do we test?

The mock-function we set up actually stores some information about the function call and the argument values used when it was called in place of Heatmap().

mock_args(mock_fn)
mock_calls(mock_fn)

So in an automated test, we can check that na_col was set to purple as follows:

heatmap_args <- mock_args(mock_fn)
testthat::expect_true(
  "na_col" %in% names(heatmap_args[[1]]) &&
    heatmap_args[[1]]$na_col == "orange"
)

Ack! That's what happens when your test fails. Sorry, na_col was supposed to be purple:

testthat::expect_true(
  "na_col" %in% names(heatmap_args[[1]]) &&
    heatmap_args[[1]]$na_col == "purple"
)

Silence is golden.

There's some neater expect_* functions for working with mocks within mockery (expect_called, expect_call, expect_args). They didn't quite hit the mark for the test I needed though.

So I found mockery really helpful. And now, thinking back, I feel pretty bad about all those tests I once wrote that hit the EBI database servers .....

[^1]: still working on that



russHyde/functionalheatmap documentation built on July 9, 2019, 10:30 p.m.