heatmap.mark: Enhanced Heat Map, further modified


Description

heatmap.mark adds a few controls on top of the extensions that heatmap.2 provides to the standard R heatmap function. Specifically, it adds the ability to suppress the label of the color key, and it changes the defaults for scale, trace, col, and density.info to match the more common usage in RNAseq analysis. In addition, it allows the hardcoded layout to be suppressed with plotNew = FALSE, so that multiple heatmaps can be combined in a single figure, though caution is warranted in arranging your own layout.

Usage and details below are borrowed from heatmap.2; for more complete examples, see its help page.

Usage

heatmap.mark (x,

           # dendrogram control
           Rowv = TRUE,
           Colv=if(symm)"Rowv" else TRUE,
           distfun = dist,
           hclustfun = hclust,
           dendrogram = c("both","row","column","none"),
           symm = FALSE,

           # data scaling
           scale = c("row","none", "column"),
           na.rm=TRUE,

           # image plot
           revC = identical(Colv, "Rowv"),
           add.expr,

           # mapping data to colors
           breaks,
           symbreaks=min(x < 0, na.rm=TRUE) || scale!="none",

           # colors
           col="rnaSeqColors",

           # block separation
           colsep,
           rowsep,
           sepcolor="white",
           sepwidth=c(0.05,0.05),

           # cell labeling
           cellnote,
           notecex=1.0,
           notecol="cyan",
           na.color=par("bg"),

           # level trace
           trace=c("none","column","row","both"),
           tracecol="cyan",
           hline=median(breaks),
           vline=median(breaks),
           linecol=tracecol,

           # Row/Column Labeling
           margins = c(5, 5),
           ColSideColors,
           RowSideColors,
           cexRow = 0.2 + 1/log10(nr),
           cexCol = 0.2 + 1/log10(nc),
           labRow = NULL,
           labCol = NULL,
           srtRow = NULL,
           srtCol = NULL,
           adjRow = c(0,NA),
           adjCol = c(NA,0),
           offsetRow = 0.5,
           offsetCol = 0.5,

           # color key + density info
           key = TRUE,
           keysize = 1.5,
           density.info=c("none","histogram","density"),
           denscol=tracecol,
           symkey = min(x < 0, na.rm=TRUE) || symbreaks,
           densadj = 0.25,

           # plot labels
           main = NULL,
           xlab = NULL,
           ylab = NULL,

           # plot layout
           lmat = NULL,
           lhei = NULL,
           lwid = NULL,

           # extras for this function
           scaleLabel = NULL,
           plotNew = TRUE,
           ...
           )
           

Arguments

x

numeric matrix of the values to be plotted.

Rowv

determines if and how the row dendrogram should be reordered. By default, it is TRUE, which implies that the dendrogram is computed and reordered based on row means. If NULL or FALSE, then no dendrogram is computed and no reordering is done. If a dendrogram, then it is used "as-is", i.e., without any reordering. If a vector of integers, then the dendrogram is computed and reordered based on the order of the vector.

Colv

determines if and how the column dendrogram should be reordered. Has the same options as the Rowv argument above; additionally, when x is a square matrix, Colv = "Rowv" means that columns should be treated identically to the rows.

distfun

function used to compute the distance (dissimilarity) between both rows and columns. Defaults to dist.

hclustfun

function used to compute the hierarchical clustering when Rowv or Colv are not dendrograms. Defaults to hclust.

dendrogram

character string indicating whether to draw 'none', 'row', 'column' or 'both' dendrograms. Defaults to 'both'. However, if Rowv (or Colv) is FALSE or NULL and dendrogram is 'both', then a warning is issued and Rowv (or Colv) arguments are honoured.

symm

logical indicating if x should be treated symmetrically; can only be true when x is a square matrix.

scale

character indicating if the values should be centered and scaled in either the row direction or the column direction, or none. The default is "row" if symm false, and "none" otherwise.

na.rm

logical indicating whether NA's should be removed.

revC

logical indicating if the column order should be reversed for plotting, such that e.g., for the symmetric case, the symmetry axis is as usual.

add.expr

expression that will be evaluated after the call to image. Can be used to add components to the plot.

breaks

(optional) Either a numeric vector indicating the splitting points for binning x into colors, or an integer number of break points to be used, in which case the break points will be spaced equally between min(x) and max(x).

symbreaks

Boolean indicating whether breaks should be made symmetric about 0. Defaults to TRUE if the data includes negative values, and to FALSE otherwise.

col

colors used for the image. Defaults to "rnaSeqColors", as shown in the usage above (heatmap.2 itself defaults to heat.colors).

colsep, rowsep, sepcolor

(optional) vector of integers indicating which columns or rows should be separated from the preceding columns or rows by a narrow space of color sepcolor.

sepwidth

(optional) Vector of length 2 giving the width (colsep) or height (rowsep) of the separator box drawn by colsep and rowsep, as a fraction of the width (colsep) or height (rowsep) of a cell. Defaults to c(0.05, 0.05).

cellnote

(optional) matrix of character strings which will be placed within each color cell, e.g. p-value symbols.

notecex

(optional) numeric scaling factor for cellnote items.

notecol

(optional) character string specifying the color for cellnote text. Defaults to "cyan", as shown in the usage above.

na.color

Color to use for missing value (NA). Defaults to the plot background color.

trace

character string indicating whether a solid "trace" line should be drawn across 'row's or down 'column's, 'both' or 'none'. The distance of the line from the center of each color-cell is proportional to the size of the measurement. Defaults to 'none'.

tracecol

character string giving the color for "trace" line. Defaults to "cyan".

hline, vline, linecol

Vector of values within cells where a horizontal or vertical dotted line should be drawn. The color of the line is controlled by linecol. Horizontal lines are only drawn if trace is 'row' or 'both'. Vertical lines are only drawn if trace is 'column' or 'both'. hline and vline default to the median of the breaks; linecol defaults to the value of tracecol.

margins

numeric vector of length 2 containing the margins (see par(mar= *)) for column and row names, respectively.

ColSideColors

(optional) character vector of length ncol(x) containing the color names for a horizontal side bar that may be used to annotate the columns of x.

RowSideColors

(optional) character vector of length nrow(x) containing the color names for a vertical side bar that may be used to annotate the rows of x.

cexRow, cexCol

positive numbers, used as cex.axis for the row or column axis labeling. The defaults currently only use the number of rows or columns, respectively.

labRow, labCol

character vectors with row and column labels to use; these default to rownames(x) or colnames(x), respectively.

srtRow, srtCol

angle of row/column labels, in degrees from horizontal

adjRow, adjCol

2-element vector giving the (left-right, top-bottom) justification of row/column labels (relative to the text orientation).

offsetRow, offsetCol

Number of character-width spaces to place between row/column labels and the edge of the plotting region.

key

logical indicating whether a color-key should be shown.

keysize

numeric value indicating the size of the key

density.info

character string indicating whether to superimpose a 'histogram', a 'density' plot, or no plot ('none') on the color-key.

denscol

character string giving the color for the density display specified by density.info, defaults to the same value as tracecol.

symkey

Boolean indicating whether the color key should be made symmetric about 0. Defaults to TRUE if the data includes negative values, and to FALSE otherwise.

densadj

Numeric scaling value for tuning the kernel width when a density plot is drawn on the color key. (See the adjust parameter for the density function for details.) Defaults to 0.25.

main, xlab, ylab

main, x- and y-axis titles; defaults to none.

lmat, lhei, lwid

visual layout: position matrix, row height, column width. See the Details section for more information.

scaleLabel

What label should be used for the colorkey? "NULL" suppresses the label

plotNew

Logical. Should this heatmap be drawn on a new plot? If FALSE, you need to provide your own layout that will encompass all plots you intend to put in the figure. Refer to the argument information for lmat, lhei, and lwid, as well as the Details and Examples sections, for more information on your options.

...

additional arguments passed on to image
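
As a rough illustration of the breaks and symbreaks arguments described above, the sketch below builds equally spaced break points for a toy matrix, first across the range of the data and then symmetric about 0. The matrix, the choice of 16 break points, and the variable names are assumptions made for this example, and the symmetric construction mirrors what heatmap.2 is understood to do rather than quoting the code of heatmap.mark.

```r
# Toy matrix standing in for x (hypothetical data)
set.seed(1)
x <- matrix(rnorm(24), nrow = 6)

# Arbitrary number of break points chosen for this sketch
nBreaks <- 16

# Equally spaced break points between min(x) and max(x)
plainBreaks <- seq(min(x, na.rm = TRUE), max(x, na.rm = TRUE),
                   length.out = nBreaks)

# Break points made symmetric about 0 (the symbreaks = TRUE case)
extreme <- max(abs(x), na.rm = TRUE)
symBreaks <- seq(-extreme, extreme, length.out = nBreaks)

# nBreaks break points define nBreaks - 1 color bins
bins <- cut(as.vector(x), breaks = symBreaks, include.lowest = TRUE)
```

With symmetric breaks, 0 falls in the middle of the color scale, which is usually what is wanted for row-scaled expression values.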

Details

If either Rowv or Colv are dendrograms they are honored (and not reordered). Otherwise, dendrograms are computed as dd <- as.dendrogram(hclustfun(distfun(X))) where X is either x or t(x).

If either is a vector (of “weights”) then the appropriate dendrogram is reordered according to the supplied values subject to the constraints imposed by the dendrogram, by reorder(dd, Rowv), in the row case. If either is missing, as by default, then the ordering of the corresponding dendrogram is by the mean value of the rows/columns, i.e., in the case of rows, Rowv <- rowMeans(x, na.rm=na.rm). If either is NULL, no reordering will be done for the corresponding side.
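
The dendrogram steps described above can be sketched with base R functions alone. The toy matrix and the variable names are assumptions made for this example; the calls follow the formulas given in the text rather than the internal code of heatmap.mark.

```r
# Toy matrix standing in for x (hypothetical data, 6 rows by 4 columns)
set.seed(1)
x <- matrix(rnorm(24), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("sample", 1:4)))

# Row dendrogram, using the default distfun (dist) and hclustfun (hclust)
ddRow <- as.dendrogram(hclust(dist(x)))

# Reorder by row means, subject to the constraints imposed by the tree
ddRow <- reorder(ddRow, rowMeans(x, na.rm = TRUE))

# Row permutation that matches the reordered dendrogram
rowInd <- order.dendrogram(ddRow)

# Column dendrogram: the same steps applied to t(x)
ddCol <- reorder(as.dendrogram(hclust(dist(t(x)))),
                 colMeans(x, na.rm = TRUE))
colInd <- order.dendrogram(ddCol)

# Matrix rearranged to line up with both dendrograms
xOrdered <- x[rowInd, colInd]
```

Supplying your own distfun or hclustfun only changes the first step; the reordering and the permutation vectors work the same way.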

If scale="row" the rows are scaled to have mean zero and standard deviation one. There is some empirical evidence from genomic plotting that this is useful.
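
A minimal sketch of the row scaling described above, using only base R. The toy matrix and variable names are assumptions for illustration, and this is not necessarily the exact code used inside heatmap.mark.

```r
# Toy matrix standing in for x (hypothetical data)
set.seed(1)
x <- matrix(rnorm(24, mean = 10, sd = 3), nrow = 6)

# Center each row on its mean (sweep subtracts by default)
rowMeansX <- rowMeans(x, na.rm = TRUE)
xCentered <- sweep(x, 1, rowMeansX)

# Divide each row by its standard deviation
rowSDs <- apply(xCentered, 1, sd, na.rm = TRUE)
xScaled <- sweep(xCentered, 1, rowSDs, "/")
```

After this step each row has mean zero and standard deviation one, so the colors show each row's pattern across columns rather than its absolute level.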

In heatmap.2, the default colors range from red to white (heat.colors) and are not pretty; heatmap.mark instead defaults to col = "rnaSeqColors". To select other colors, consider enhancements such as the RColorBrewer package, http://cran.r-project.org/src/contrib/PACKAGES.html#RColorBrewer.

By default, four components will be displayed in the plot. At the top left is the color key, top right is the column dendrogram, bottom left is the row dendrogram, and bottom right is the image plot. When RowSideColors or ColSideColors are provided, an additional row or column is inserted in the appropriate location. This layout can be overridden by specifying appropriate values for lmat, lwid, and lhei. lmat controls the relative position of each element, while lwid controls the column width, and lhei controls the row height. See the help page for layout for details on how to use these arguments.
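
The four-panel arrangement described above can be previewed with layout and layout.show before drawing any heatmap. The panel numbering below matches the baseLayout matrix used in the Examples section; the heights and widths are assumptions chosen to resemble the heatmap.2 defaults (keysize and 4) and may not match heatmap.mark exactly.

```r
# Panel numbers: 1 = image, 2 = row dendrogram,
# 3 = column dendrogram, 4 = color key
lmat <- rbind(c(4, 3),
              c(2, 1))

# Assumed relative row heights and column widths
lhei <- c(1.5, 4)
lwid <- c(1.5, 4)

# Null PDF device, so that nothing is written to disk in this sketch
pdf(NULL)
nPanels <- layout(lmat, widths = lwid, heights = lhei, respect = FALSE)
layout.show(nPanels)  # outlines and numbers each panel
dev.off()
```

Changing lhei and lwid here is a quick way to see how much room the key and dendrograms will take before passing the same values to the plotting function.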

If plotNew = FALSE, then heatmap.mark will not reset the current layout before plotting. Thus, if it is called on a brand new plot, each of the four elements (described above) will be drawn as a separate plot. Instead, before drawing the first heatmap you intend to include, use layout or a similar function to specify the order in which plots should be placed. See the Examples section for an illustration.
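
One way to build such a layout for several heatmaps is to offset the four-panel base matrix by 4 for each additional heatmap. The helper sideBySideLayout below is hypothetical (it is not part of rnaseqWrapper) and assumes that no RowSideColors or ColSideColors are used, so that each heatmap draws exactly four panels.

```r
# Hypothetical helper: layout matrix for n heatmaps placed side by side,
# assuming each heatmap draws exactly four panels (no side colors)
sideBySideLayout <- function(n) {
  baseLayout <- matrix(c(4, 3, 2, 1), nrow = 2, byrow = TRUE)
  panels <- lapply(seq_len(n) - 1, function(i) baseLayout + 4 * i)
  do.call(cbind, panels)
}

lmatAll <- sideBySideLayout(3)

# Null PDF device, so that nothing is written to disk in this sketch
pdf(NULL)
nPanels <- layout(lmatAll,
                  widths = rep(c(1, 2), times = 3),
                  heights = c(1, 2),
                  respect = FALSE)
layout.show(nPanels)
dev.off()
```

Each subsequent call with plotNew = FALSE would then fill the next four numbered panels in order.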

Value

Invisibly, a list with components

rowInd

row index permutation vector as returned by order.dendrogram.

colInd

column index permutation vector.

call

the matched call

rowMeans, rowSDs

mean and standard deviation of each row: only present if scale="row"

colMeans, colSDs

mean and standard deviation of each column: only present if scale="column"

carpet

reordered and scaled 'x' values used to generate the main 'carpet'

rowDendrogram

row dendrogram, if present

colDendrogram

column dendrogram, if present

breaks

values used for color break points

col

colors used

vline

center-line value used for column trace, present only if trace="both" or trace="column"

hline

center-line value used for row trace, present only if trace="both" or trace="row"

colorTable

A three-column data frame providing the lower and upper bound and color for each bin

Note

The original rows and columns are reordered in any case to match the dendrogram, e.g., the rows by order.dendrogram(Rowv) where Rowv is the (possibly reorder()ed) row dendrogram.

heatmap.2() uses layout and draws the image in the lower right corner of a 2x2 layout. Consequently, it cannot be used in a multi column/row layout, i.e., when par(mfrow= *) or par(mfcol= *) has been called.

heatmap.mark() allows this behavior to be over-ridden using plotNew = FALSE, though the user is cautioned that arranging the output manually may take substantial effort.

Author(s)

Mark Peterson, making small revisions to the fantastic code of Andy Liaw, original; and R. Gentleman, M. Maechler, W. Huber, G. Warnes, revisions.

See Also

image, hclust, heatmap.2

Examples

## Below are examples of the changes made from heatmap.2();
##  for more complete examples of all that this code can do,
##  see ?heatmap.2

######################################
## Read in and prepare data to plot ##
######################################

## Find where the data is stored (or use your own)
pathToData <- try(system.file("",package="rnaseqWrapper",mustWork=TRUE))

if(!inherits(pathToData, "try-error")){
## Make sure the data were found before proceeding

## Read in the data
## Note, the files here are compressed,
##  but yours do not need to be
countData <- mergeCountFiles(paste(pathToData,"/data/",sep=""),".genes.results.txt.gz")

## limit to count data for 50 rows
## note that these are not necessarily DE genes
toPlot <- countData[51:100,grep(".expected_count",names(countData))]

## Trim the names to make the plots a bit nicer:
names(toPlot) <- gsub(".expected_count","", names(toPlot))

#################
## Simple plot ##
#################

heatmap.mark(as.matrix(toPlot),cexCol = 0.75,labRow = FALSE)


#########################################
## More complex, add labels and legend ##
#########################################

myLabelColors <- rep(c("red","blue"),each = dim(toPlot)[2]/2)

heatmap.mark(as.matrix(toPlot),
             cexCol = 0.75,labRow = FALSE,
             scaleLabel = "",
             ColSideColors = myLabelColors)

par(xpd=TRUE) ## To allow legend on top of other stuff
legend(x="topleft",inset=c(-.02,.08), 
       bty="n", cex=.8, 
       legend= c("Female","Male"), 
       fill=unique(myLabelColors), 
       title="Sex") 
par(xpd=FALSE) ## To reset




##########################
## With multiple panels ##
##########################

## Set your own layout
## Note that each heatmap plots 4 objects when no color labels are included,
## so the offset for each additional one needs to be 4 + the options.
## If you use row or column labels, additional plots are drawn.
## In addition, you will likely want to play with the widths and 
## heights of each element.

baseLayout <- matrix(c(4,3,2,1), nrow = 2, byrow = TRUE)

layout(cbind(baseLayout,baseLayout + 4), 
       widths = c(1,2,1,2), heights = c(1,2), respect = FALSE)

heatmap.mark(as.matrix(toPlot),
             cexCol = 0.75,labRow = FALSE,
             scaleLabel = "",
             plotNew = FALSE)

heatmap.mark(as.matrix(toPlot),
             cexCol = 0.75,labRow = FALSE,
             scaleLabel = "",
             plotNew = FALSE)



}


 

Example output

Loading required package: ecodist
Loading required package: gplots

Attaching package: 'gplots'

The following object is masked from 'package:stats':

    lowess

Loading required package: gtools

rnaseqWrapper documentation built on May 2, 2019, 5:58 a.m.