aheatmap: Annotated Heatmaps
In renozao/NMF: Algorithms and Framework for Nonnegative Matrix Factorization (NMF)

Description Usage Arguments Details PDF graphic devices Row/column ordering and display Author(s) Examples

The function aheatmap plots high-quality heatmaps, with a detailed legend and unlimited annotation tracks for both columns and rows. The annotations are coloured differently according to their type (factor or numeric covariate). Although it uses grid graphics, the generated plot is compatible with base layouts such as the ones defined with 'mfrow' or layout, enabling the easy drawing of multiple heatmaps on a single plot, at last!
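The compatibility with base layouts described above can be illustrated with a short sketch. It assumes the NMF package is installed; the matrix `x`, the factor `group` and the panel titles are made up for this illustration, and the plotting part is skipped when NMF is unavailable.

```r
set.seed(123)
# Hypothetical data: 20 rows, 10 columns
x <- matrix(abs(rnorm(200, mean = 4)), nrow = 20, ncol = 10,
            dimnames = list(paste("ROW", 1:20), paste("COL", 1:10)))
# Hypothetical column annotation: one factor value per column
group <- factor(rep(c("Class1", "Class2"), each = 5))

if (requireNamespace("NMF", quietly = TRUE)) {
  library(NMF)
  op <- par(mfrow = c(1, 2))  # base layout: one row, two panels
  aheatmap(x, annCol = group, main = "Euclidean")
  aheatmap(x, annCol = group, Rowv = "correlation", main = "Correlation")
  par(op)                     # restore the previous layout
}
```

Each call fills the next panel of the `mfrow` layout, as it would with base plotting functions.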

aheatmap(
  x,
  color = "-RdYlBu2:100",
  na.color = NA,
  type = c("rect", "circle", "roundrect"),
  breaks = NA,
  border_color = NA,
  cellwidth = NA,
  cellheight = NA,
  scale = "none",
  Rowv = TRUE,
  Colv = TRUE,
  revC = identical(Colv, "Rowv") || is_NA(Rowv) || (is.integer(Rowv) && length(Rowv) >
    1) || is(Rowv, "silhouette"),
  distfun = "euclidean",
  hclustfun = "complete",
  reorderfun = function(d, w) reorder(d, w),
  treeheight = 50,
  legend = TRUE,
  annCol = NA,
  annRow = NA,
  annColors = NA,
  annLegend = TRUE,
  cexAnn = NA,
  dataRow = NULL,
  dataCol = NULL,
  labRow = NULL,
  labCol = NULL,
  labAnn = annLegend,
  subsetRow = NULL,
  subsetCol = NULL,
  y = NULL,
  txt = NULL,
  layout = ".",
  fontsize = 10,
  cexRow = 0.9,
  cexCol = 0.9,
  filename = NA,
  width = NA,
  height = NA,
  main = NULL,
  sub = NULL,
  info = NULL,
  verbose = getOption("verbose"),
  trace = verbose > 1,
  add = NULL,
  gp = gpar()
)

`x`	numeric matrix of the values to be plotted. An `ExpressionSet` object can also be passed, in which case the expression values are plotted (`exprs(x)`).
`color`	colour specification for the heatmap. Defaults to palette '-RdYlBu2:100', i.e. the reversed palette 'RdYlBu2' (a slight modification of RColorBrewer's palette 'RdYlBu') with 100 colors. Possible values are: a character/integer vector of length greater than 1, which is used directly and assumed to contain valid R color specifications; a single color/integer (between 0 and 8)/other numeric value that gives the dominant color, where numeric values are converted into a palette by `rev(sequential_hcl(2, h = x, l = c(50, 95)))` and other values are concatenated with the grey colour '#F1F1F1'; an RColorBrewer palette name (see `display.brewer.all`); a `viridis` palette name: 'viridis', 'inferno', 'plasma', 'magma'; one of 'RdYlBu2', 'rainbow', 'heat', 'topo', 'terrain', 'cm'. When the colour palette is specified with a single value that is negative or preceded by a minus ('-'), the reversed palette is used. The number of breaks can also be specified after a colon (':'). For example, the default colour palette is specified as '-RdYlBu2:100'.
`na.color`	Specifies the colour to use for `NA` values. Setting to `NA` (default) produces uncoloured cells (white). It can also be a list of 2 elements, with the first element specifying the color and the second a given value or a range of values (as a 2-length vector) to be forced to NA.
`type`	type of cell shape (still an experimental feature).
`breaks`	a sequence of numbers that covers the range of values in `x` and is one element longer than the color vector. Used for mapping values to colors, which is useful when specific values must map to specific colors. If NA, the breaks are calculated automatically. If `breaks` is a single value, then the colour palette is forced to be centered on this value.
`border_color`	color of cell borders on heatmap, use NA if no border should be drawn. This argument allows for a finer control of borders for the following elements: the matrix grid (`'grid'`), the matrix surrounding border (`'matrix'`), the annotation cells (`'annCol'`, `'annRow'` or `'ann'` for columns, rows or both, respectively), the annotation legend (`'annLegend'`) or the color scale legend (`'legend'`). Additionally, borders for all matrix cells (`'cell'`) and the edges between cell centers (`'edge'`) can be controlled but must be explicitly specified, either separately or with `border_color=''`, which draws borders around all elements. Using `border_color=TRUE` or some color specification will not draw them. The following special syntax is also supported: `'[<colorcode>:]<element>'` for coloring element '<element>', optionally specifying the color before `':'`. Multiple element names can be passed separated by commas (spaces are stripped). See examples in the aheatmap* demo and vignette.
`cellwidth`	individual cell width in points. If left as NA, then the values depend on the size of plotting window.
`cellheight`	individual cell height in points. If left as NA, then the values depend on the size of plotting window.
`scale`	character indicating how the values should be scaled in either the row direction or the column direction. Note that the scaling is performed after row/column clustering, so that it has no effect on the row/column ordering. Possible values are: `"row"`: center and standardize each row separately to row Z-scores; `"stdrow"`: center and standardize each row separately to row Z-scores, and force values onto the `[0,1]` interval; `"column"`: center and standardize each column separately to column Z-scores; `"stdcolumn"`: center and standardize each column separately to column Z-scores, and force values onto the `[0,1]` interval; `"r1"`: scale each row to sum up to one; `"c1"`: scale each column to sum up to one; `"none"`: no scaling.
`Rowv`	clustering specification(s) for the rows. It allows specifying the distance/clustering/ordering/display parameters to be used for the rows only. See section Row/column ordering and display for details on all supported values.
`Colv`	clustering specification(s) for the columns. It accepts the same values as argument `Rowv` (modulo the expected length for vector specifications), and allows specifying the distance/clustering/ordering/display parameters to be used for the columns only. `Colv` may also be set to `"Rowv"`, in which case the dendrogram or ordering specifications applied to the rows are also applied to the columns. Note that this is allowed only for square matrices, and that the row ordering is in this case by default reversed (`revC=TRUE`) to obtain the diagonal in the standard way (from top-left to bottom-right). See section Row/column ordering and display for details on all supported values.
`revC`	a logical that specifies whether the row order defined by `Rowv` should be reversed. This is mainly used to get the rows displayed from top to bottom, which is not the case by default. Its default value is computed at runtime to suit common situations where natural ordering is a more sensible choice: no ordering or a fixed ordering of the rows (`Rowv=NA` or an integer vector of indexes of length > 1), and when a symmetric ordering is requested, so that the diagonal is shown as expected. An argument in favor of the "odd" default display (bottom to top) is that the row dendrogram is plotted from bottom to top, and reversing its order may take a short but non-negligible time.
`distfun`	default distance measure used in clustering rows and columns. Possible values are: all the distance methods supported by `dist` (e.g. "euclidean" or "maximum"); all correlation methods supported by `cor`, such as `"pearson"` or `"spearman"`, in which case the pairwise distances between rows/columns are computed as `d <- dist(1 - cor(..., method = distfun))` (one may as well use the string "correlation", which is an alias for "pearson"); an object of class `dist` such as returned by `dist` or `as.dist`.
`hclustfun`	default clustering method used to cluster rows and columns. Possible values are: a method name (a character string) supported by `hclust` (e.g. `'average'`); an object of class `hclust` such as returned by `hclust`; a dendrogram.
`reorderfun`	default dendrogram reordering function, used to reorder the dendrogram, when either `Rowv` or `Colv` is a numeric weight vector, or provides or computes a dendrogram. It must take 2 parameters: a dendrogram, and a weight vector.
`treeheight`	how much space (in points) should be used to display dendrograms. If specified as a single value, it is used for both dendrograms. A length-2 vector specifies separate values for the row and column dendrogram respectively. Default value: 50 points.
`legend`	boolean value that determines if a colour ramp for the heatmap's colour palette should be drawn or not. Default is `TRUE`.
`annCol`	specifications of column annotation tracks displayed as coloured rows on top of the heatmaps. The annotation tracks are drawn from bottom to top. A single annotation track can be specified as a single vector; multiple tracks are specified as a list, a data frame, or an `ExpressionSet` object, in which case the phenotypic data is used (`pData(eset)`). Character or integer vectors are converted and displayed as factors. Unnamed tracks are internally renamed into `Xi`, with i being incremented for each unnamed track, across both column and row annotation tracks. For each track, if no corresponding colour is specified in argument `annColors`, a palette or a ramp is automatically computed and named after the track's name.
`annRow`	specifications of row annotation tracks displayed as coloured columns on the left of the heatmaps. The annotation tracks are drawn from left to right. The same conversion, renaming and colouring rules as for argument `annCol` apply.
`annColors`	list for specifying annotation track colors manually. It is possible to define the colors for only some of the annotations. Check examples for details.
`annLegend`	specifies if the legend for the annotation tracks should be drawn or not. Default is `TRUE`, which draws legend for both row and column annotations. It can also be one of `'both'` (equivalent to `TRUE`), `'none'` (equivalent to `FALSE`), `'row'` or `'column'`.
`cexAnn`	scaling coefficient for the size of the annotation tracks. Values > 1 (resp. < 1) will increase (resp. decrease) the size of each annotation track. This applies to the height (resp. width) of the column (resp. row) annotation tracks. Separate row and column sizes can be specified as a vector `c(row_size, col_size)`, where an NA value means using the default for the corresponding track.
`dataRow`	`data.frame` where row annotation variables are looked up. When `x` is an `ExpressionSet` object, this defaults to the feature annotation returned by `fData(x)`.
`dataCol`	`data.frame` where column annotation variables are looked up. When `x` is an `ExpressionSet` object, this defaults to the phenotypic sample annotation returned by `pData(x)`.
`labRow`	labels for the rows.
`labCol`	labels for the columns. See description for argument `labRow` for a list of the possible values.
`labAnn`	toggles labelling of annotation tracks. It accepts the same values as argument `annLegend`, and specifies which annotation labels should be drawn.
`subsetRow`	Specification for subsetting the rows before drawing the heatmap. Possible values are: an integer vector of length > 1 specifying the indexes of the rows to keep; a character vector of length > 1 specifying the names of the rows to keep (these are the original rownames, not the names specified in `labRow`); a logical vector of length > 1, whose elements are recycled if the vector does not have as many elements as there are rows in `x`. Note that if `Rowv` is a dendrogram or hclust object, it is first converted into an ordering vector and cannot be displayed, and a warning is thrown.
`subsetCol`	Specification for subsetting the columns before drawing the heatmap. It accepts the same kinds of values as `subsetRow`. See details above.
`y`	an optional matrix that specifies values that are used to compute circle radius when `type = "circle"`, so that color and circle size can be independent. If `NULL`, then radius will be related to the values in the data matrix `x`.
`txt`	character matrix of the same size as `x`, that contains text to display in each cell. `NA` values are allowed and are not displayed. See demo for an example.
`layout`	layout specification that indicates the relative position of the heatmap's components. Two layouts can be defined: one horizontal, which relates to components associated with rows, and one vertical, which relates to components associated with columns. Each layout is specified as a character string, composed of characters that encode the order of each component: dendrogram (d), annotation tracks (a), data matrix (m), labels (l) and legend (L). See `aheatmap_layout` for more details on layout specifications.
`fontsize`	base fontsize for the plot
`cexRow`	fontsize for the rownames, specified as a fraction of argument `fontsize`.
`cexCol`	fontsize for the colnames, specified as a fraction of argument `fontsize`.
`filename`	path of the file where the picture is saved. The following formats are currently supported: png, pdf, tiff, bmp, jpeg. Even if the plot does not fit into the plotting window, the file size is calculated so that the plot would fit there, unless specified otherwise.
`width`	manual option for determining the output file width in inches.
`height`	manual option for determining the output file height in inches.
`main`	Main title as a character string or a grob.
`sub`	Subtitle as a character string or a grob.
`info`	(experimental) Extra information as a character vector or a grob. If `info=TRUE`, information about the clustering methods is displayed at the bottom of the plot.
`verbose`	if `TRUE` then verbose messages are displayed and the borders of some viewports are highlighted. It is intended for debugging purposes.
`trace`	logical that indicates if the different grid viewports should be traced with a blue border (debugging purpose).
`add`	logical that indicates if the plot should be drawn on a fresh new grid page or on the currently opened plot. Using `add = NULL` (default) enables mixing grid and base graphics and, notably, arranging multiple heatmaps on the same plot following a layout set up via the standard graphical parameter `mfrow` or the function `layout`.
`gp`	graphical parameters for the text used in plot. Parameters passed to `grid.text`, see `gpar`.
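
The default values of `distfun`, `hclustfun` and `reorderfun` describe a three-step clustering pipeline. As a rough illustration only (base R, not the package's internal code), the sketch below walks through those steps for the rows of a small random matrix, using the correlation-based distance formula quoted under `distfun`. The matrix `x`, the transposition in `cor(t(x))` and the use of row means as weights are assumptions made for this example.

```r
set.seed(1)
x <- matrix(rnorm(60, mean = 4), nrow = 10, ncol = 6)

# distfun = "pearson": distance derived from 1 - correlation between rows.
# cor() correlates columns, so rows are moved into columns first (assumption).
d <- dist(1 - cor(t(x), method = "pearson"))

# hclustfun = "complete": hierarchical clustering with complete linkage
hc <- hclust(d, method = "complete")

# reorderfun = function(d, w) reorder(d, w): reorder the dendrogram by weights
reorderfun <- function(d, w) reorder(d, w)
dend <- reorderfun(as.dendrogram(hc), rowMeans(x))

# Row order implied by the reordered dendrogram
row_order <- order.dendrogram(dend)
```

Swapping `"pearson"` for `"spearman"`, or `"complete"` for `"average"`, mirrors what changing `distfun` or `hclustfun` is documented to do.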

The development of this function started as a fork of the function pheatmap from the pheatmap package, and provides several enhancements such as:

argument names match those used in the base function heatmap;
an unlimited number of annotations for both columns and rows, with a simplified and more flexible interface;
easy specification of clustering methods and colors;
returns clustering data, as well as a grid grob object.

Please read the associated vignette for more information and sample code.

When plotting on a PDF graphic device (started with pdf), one may get a blank first page, due to internals of standard functions from the grid package that are called by aheatmap. The NMF package ships a custom patch that fixes this issue. However, in order to comply with CRAN policies, the patch is not applied by default and the user must explicitly enable it. This can be achieved at runtime by setting the NMF-specific option 'grid.patch' via nmf.options(grid.patch=TRUE), or at load time if the environment variable 'R_PACKAGE_NMF_GRID_PATCH' is defined and its value is something that is not equivalent to FALSE (i.e. not '', 'false' nor 0).
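
The two ways of enabling the patch could look like the sketch below. It assumes the NMF package is installed; the output file and the random matrix are placeholders, and the plotting part is skipped when NMF is unavailable.

```r
# Load-time route: define the variable before NMF is loaded.
# (It has no effect if the package is already loaded in this session.)
Sys.setenv(R_PACKAGE_NMF_GRID_PATCH = "true")

if (requireNamespace("NMF", quietly = TRUE)) {
  library(NMF)

  # Runtime route: switch the option on explicitly
  nmf.options(grid.patch = TRUE)

  out <- tempfile(fileext = ".pdf")  # placeholder output file
  pdf(out)
  aheatmap(matrix(runif(60), nrow = 10))
  invisible(dev.off())
}
```

Either route alone should be enough according to the description above; both are shown here only for completeness.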

Possible values for Rowv (and, modulo the expected vector length, Colv) are:

TRUE or NULL (to be consistent with heatmap): compute a dendrogram from hierarchical clustering using the distance and clustering methods distfun and hclustfun.
NA: disable any ordering. In this case, and if not otherwise specified with argument revC=FALSE, the heatmap shows the input matrix with the rows in their original order, from the first row at the top to the last row at the bottom. Note that this differs from the behaviour of heatmap, but seemed to be a more sensible choice when visualizing a matrix without reordering.
an integer vector of length the number of rows of the input matrix (nrow(x)), that specifies the row order. As in the case Rowv=NA, the ordered matrix is shown first row on top, last row at the bottom.
a character vector or a list specifying values to use instead of arguments distfun, hclustfun and reorderfun when clustering the rows (see the respective argument descriptions for a list of accepted values). If Rowv has no names, then the first element is used for distfun, the second (if present) is used for hclustfun, and the third (if present) is used for reorderfun.
a numeric vector of weights, of length the number of rows of the input matrix, used to reorder the internally computed dendrogram d by reorderfun(d, Rowv).
FALSE: the dendrogram is computed using methods distfun, hclustfun, and reorderfun but is not shown.
a single integer that specifies how many subtrees (i.e. clusters) should be highlighted, e.g., aheatmap(x, Rowv = 3L). If positive, then the dendrogram's branches upstream of each cluster are faded out using dashed lines. If negative, then the dendrogram's branches within each cluster are faded out using dashed lines, keeping the root upstream branches as is.
a single double that specifies how much space is used by the computed dendrogram. That is, this value is used in place of treeheight.
a single character string starting with a '#' or a list with its first element as such a string, e.g., aheatmap(x, Rowv = '#3') or aheatmap(x, Colv = list('#3', text = LETTERS[1:3])).
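
Several of the specifications listed above can be tried side by side. The following is a minimal sketch, assuming the NMF package is installed; the matrix `x` and the names of the list entries are invented for the illustration, and the plotting loop is skipped when NMF is unavailable.

```r
set.seed(42)
x <- matrix(rnorm(200, mean = 4), nrow = 20, ncol = 10)

# One entry per kind of Rowv specification described in this section
row_specs <- list(
  clustered   = TRUE,                     # dendrogram from distfun/hclustfun
  unordered   = NA,                       # rows kept in their original order
  fixed_order = rev(seq_len(nrow(x))),    # explicit integer ordering
  methods     = c("pearson", "average"),  # distfun first, then hclustfun
  hidden_tree = FALSE,                    # clustered, dendrogram not drawn
  three_trees = 3L                        # highlight 3 subtrees
)

if (requireNamespace("NMF", quietly = TRUE)) {
  library(NMF)
  for (nm in names(row_specs)) {
    aheatmap(x, Rowv = row_specs[[nm]], main = nm)
  }
}
```

The same values can be passed to Colv, with vector lengths matching the number of columns instead of rows.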

Original version of pheatmap: Raivo Kolde

Enhancement into aheatmap: Renaud Gaujoux

## See the demo 'aheatmap' for more examples:
## Not run: 
demo('aheatmap')

## End(Not run)

# Generate random data
n <- 50; p <- 20
x <- abs(rmatrix(n, p, rnorm, mean=4, sd=1))
x[1:10, seq(1, 10, 2)] <- x[1:10, seq(1, 10, 2)] + 3
x[11:20, seq(2, 10, 2)] <- x[11:20, seq(2, 10, 2)] + 2
rownames(x) <- paste("ROW", 1:n)
colnames(x) <- paste("COL", 1:p)

## Default heatmap
aheatmap(x)

## Distance methods
aheatmap(x, Rowv = "correlation")
aheatmap(x, Rowv = "man") # partially matched to 'manhattan'
aheatmap(x, Rowv = "man", Colv="binary")

# Generate column annotations
annotation = data.frame(Var1 = factor(1:p %% 2 == 0, labels = c("Class1", "Class2")), Var2 = 1:10)
aheatmap(x, annCol = annotation)