Heatmaps of NMF Factors

Description

The NMF package ships an advanced heatmap engine implemented by the function aheatmap. Some convenience heatmap functions have been implemented for NMF models, which redefine default values for some of the arguments of aheatmap, hence tuning the output specifically for NMF models.

Usage

  basismap(object, ...)

  ## S4 method for signature 'NMF'
basismap(object, color = "YlOrRd:50",
    scale = "r1", Rowv = TRUE, Colv = NA,
    subsetRow = FALSE, annRow = NA, annCol = NA,
    tracks = "basis", main = "Basis components",
    info = FALSE, ...)

  coefmap(object, ...)

  ## S4 method for signature 'NMF'
coefmap(object, color = "YlOrRd:50",
    scale = "c1", Rowv = NA, Colv = TRUE, annRow = NA,
    annCol = NA, tracks = "basis",
    main = "Mixture coefficients", info = FALSE, ...)

  consensusmap(object, ...)

  ## S4 method for signature 'NMFfitX'
consensusmap(object, annRow = NA,
    annCol = NA,
    tracks = c("basis:", "consensus:", "silhouette:"),
    main = "Consensus matrix", info = FALSE, ...)

  ## S4 method for signature 'matrix'
consensusmap(object,
    color = "-RdYlBu",
    distfun = function(x) as.dist(1 - x),
    hclustfun = "average", Rowv = TRUE, Colv = "Rowv",
    main = if (is.null(nr) || nr > 1) "Consensus matrix" else "Connectiviy matrix",
    info = FALSE, ...)

  ## S4 method for signature 'NMFfitX'
coefmap(object, Colv = TRUE,
    annRow = NA, annCol = NA,
    tracks = c("basis", "consensus:"), ...)

Arguments

object

an object from which NMF factors or a consensus matrix are extracted.

...

extra arguments passed to aheatmap.

subsetRow

Argument that specifies how to filter the rows that will appear in the heatmap. When FALSE (default), all rows are used. Besides the values supported by argument subsetRow of aheatmap, other possible values are:

  • TRUE: only the rows that are basis-specific are used. The default selection method is from KimH2007. This is equivalent to subsetRow='kim'.

  • a single character string or numeric value that specifies the method used to select the basis-specific rows that should appear in the heatmap (cf. argument method of function extractFeatures).

    Note that extractFeatures is called with argument nodups=TRUE, so that features that are selected for multiple components only appear once.

tracks

Special additional annotation tracks to highlight associations between basis components and sample clusters:

basis

matches each row (resp. column) to the most contributing basis component in basismap (resp. coefmap). In basismap (resp. coefmap), adding a track ':basis' to annCol (resp. annRow) also highlights the column (resp. row) corresponding to each component, using the matching colours.
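The assignment behind the 'basis' track can be sketched in base R. The matrices W and H below are hypothetical toy data (features by components, and components by samples), and the snippet only illustrates the idea of a "most contributing component"; it is not the package's internal code.

```r
# Hypothetical 4 x 2 basis matrix: rows are features, columns are components
W <- matrix(c(0.9, 0.1,
              0.2, 0.8,
              0.7, 0.3,
              0.1, 0.6),
            ncol = 2, byrow = TRUE)
# Hypothetical 2 x 3 coefficient matrix: rows are components, columns are samples
H <- matrix(c(0.5, 0.1, 0.3,
              0.2, 0.9, 0.4),
            nrow = 2, byrow = TRUE)

# Dominant component for each feature (row of W), stored as a factor track
row_track <- factor(apply(W, 1, which.max), levels = seq_len(ncol(W)))
# Dominant component for each sample (column of H)
col_track <- factor(apply(H, 2, which.max), levels = seq_len(nrow(H)))

print(row_track)
print(col_track)
```

Factors such as row_track are the kind of vector that annRow and annCol accept as a single annotation track.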

info

if TRUE then the name of the algorithm that fitted the NMF model is displayed at the bottom of the plot, if available. Otherwise it is passed as is to aheatmap.

color

colour specification for the heatmap. In aheatmap, defaults to palette '-RdYlBu2:100', i.e. reversed palette 'RdYlBu2' (a slight modification of RColorBrewer's palette 'RdYlBu') with 100 colors; the methods documented here override this default (see Usage). Possible values are:

  • a character/integer vector of length greater than 1 that is directly used and assumed to contain valid R color specifications.

  • a single color/integer (between 0 and 8)/other numeric value that gives the dominant colors. Numeric values are converted into a palette by rev(sequential_hcl(2, h = x, l = c(50, 95))). Other values are concatenated with the grey colour '#F1F1F1'.

  • the name of one of RColorBrewer's palettes (see display.brewer.all), or one of 'RdYlBu2', 'rainbow', 'heat', 'topo', 'terrain', 'cm'.

When the colour palette is specified with a single value that is negative or preceded by a minus sign ('-'), the reversed palette is used. The number of breaks can also be specified after a colon (':'). For example, the default colour palette of aheatmap is specified as '-RdYlBu2:100'.
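To make the single-string syntax concrete, the following sketch splits a specification into its three parts. The helper parse_palette_spec is hypothetical and written only for illustration; it mirrors the syntax described above and is not the parser used by the package.

```r
# Hypothetical helper: split a palette specification such as "-RdYlBu2:100"
# into palette name, reversal flag and number of breaks.
parse_palette_spec <- function(spec) {
  stopifnot(is.character(spec), length(spec) == 1L)
  # a leading minus sign requests the reversed palette
  reversed <- startsWith(spec, "-")
  if (reversed) spec <- substring(spec, 2L)
  # an optional ":n" suffix gives the number of breaks
  parts <- strsplit(spec, ":", fixed = TRUE)[[1]]
  n <- if (length(parts) > 1L) as.integer(parts[2]) else NA_integer_
  list(name = parts[1], reversed = reversed, n = n)
}

p <- parse_palette_spec("-RdYlBu2:100")
str(p)
```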

scale

character indicating how the values should be scaled in either the row direction or the column direction. Note that the scaling is performed after row/column clustering, so that it has no effect on the row/column ordering. Possible values are:

  • "row": center and standardize each row separately to row Z-scores

  • "column": center and standardize each column separately to column Z-scores

  • "r1": scale each row to sum up to one

  • "c1": scale each column to sum up to one

  • "none": no scaling
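The "r1" and "c1" scalings can be reproduced in base R as follows. This is a minimal sketch on a hypothetical toy matrix x, meant to show what the scaled values look like rather than how the package computes them.

```r
# Hypothetical non-negative matrix: 3 rows, 2 columns
x <- matrix(c(1, 3,
              2, 2,
              4, 0),
            ncol = 2, byrow = TRUE)

# "r1": divide each row by its sum, so that every row sums to one
r1 <- sweep(x, 1, rowSums(x), "/")
# "c1": divide each column by its sum, so that every column sums to one
c1 <- sweep(x, 2, colSums(x), "/")

print(rowSums(r1))
print(colSums(c1))
```

Row-wise scaling suits basis matrices, where each feature's profile across components is of interest; column-wise scaling suits coefficient matrices, where each sample's mixture across components is of interest.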

Rowv

clustering specification(s) for the rows. It specifies the distance/clustering/ordering/display parameters to be used for the rows only. Possible values are:

  • TRUE or NULL (to be consistent with heatmap): compute a dendrogram from hierarchical clustering using the distance and clustering methods distfun and hclustfun.

  • NA: disable any ordering. In this case, and if not otherwise specified with argument revC=FALSE, the heatmap shows the input matrix with the rows in their original order, with the first row on top to the last row at the bottom. Note that this differs from the behaviour of heatmap, but seemed to be a more sensible choice when visualizing a matrix without reordering.

  • an integer vector of length the number of rows of the input matrix (nrow(x)), that specifies the row order. As in the case Rowv=NA, the ordered matrix is shown first row on top, last row at the bottom.

  • a character vector or a list specifying values to use instead of arguments distfun, hclustfun and reorderfun when clustering the rows (see the respective argument descriptions for a list of accepted values). If Rowv has no names, then the first element is used for distfun, the second (if present) is used for hclustfun, and the third (if present) is used for reorderfun.

  • a numeric vector of weights, of length the number of rows of the input matrix, used to reorder the internally computed dendrogram d by reorderfun(d, Rowv).

  • FALSE: the dendrogram is computed using methods distfun, hclustfun, and reorderfun but is not shown.

  • a single integer that specifies how many subtrees (i.e. clusters) from the computed dendrogram should have their root faded out. This can be used to better highlight the different clusters.

  • a single double that specifies how much space is used by the computed dendrogram. That is, this value is used in place of treeheight.
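The effect of passing a numeric vector of weights can be sketched with base R's reorder method for dendrograms. The data and weights below are hypothetical, and the sketch assumes that reorderfun behaves like stats::reorder.

```r
# Hypothetical data: rows r1/r2 are close together, as are rows r3/r4
x <- matrix(c(1.0, 2.0,
              1.1, 2.1,
              5.0, 6.0,
              5.2, 6.1),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("r", 1:4), NULL))

# Dendrogram from hierarchical clustering of the rows
d <- as.dendrogram(hclust(dist(x), method = "average"))

# One weight per row; branches are reordered by aggregated weight
# while the tree structure itself is left unchanged
w <- c(4, 3, 2, 1)
d2 <- reorder(d, w)

print(order.dendrogram(d))
print(order.dendrogram(d2))
```

Reordering only flips branches at each node, so rows that cluster together stay adjacent in the heatmap.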

Colv

clustering specification(s) for the columns. It accepts the same values as argument Rowv (modulo the expected length for vector specifications), and allows specifying the distance/clustering/ordering/display parameters to be used for the columns only. Colv may also be set to "Rowv", in which case the dendrogram or ordering specifications applied to the rows are also applied to the columns. Note that this is allowed only for square input matrices, and that the row ordering is in this case by default reversed (revC=TRUE) to obtain the diagonal in the standard way (from top-left to bottom-right). See argument Rowv for other possible values.

annRow

specifications of row annotation tracks displayed as coloured columns on the left of the heatmaps. The annotation tracks are drawn from left to right. The same conversion, renaming and colouring rules as for argument annCol apply.

annCol

specifications of column annotation tracks displayed as coloured rows on top of the heatmaps. The annotation tracks are drawn from bottom to top. A single annotation track can be specified as a single vector; multiple tracks are specified as a list, a data frame, or an ExpressionSet object, in which case the phenotypic data is used (pData(eset)). Character or integer vectors are converted and displayed as factors. Unnamed tracks are internally renamed into Xi, with i being incremented for each unnamed track, across both column and row annotation tracks. For each track, if no corresponding colour is specified in argument annColors, a palette or a ramp is automatically computed and named after the track's name.

main

Main title as a character string or a grob.

distfun

default distance measure used in clustering rows and columns. Possible values are:

  • all the distance methods supported by dist (e.g. "euclidean" or "maximum").

  • all correlation methods supported by cor, such as "pearson" or "spearman". The pairwise distances between rows/columns are then computed as d <- dist(1 - cor(..., method = distfun)).

    One may as well use the string "correlation" which is an alias for "pearson".

  • an object of class dist such as returned by dist or as.dist.
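The correlation-based distance described above can be written out in base R. This sketch uses a hypothetical random matrix and assumes that rows are compared by correlating the columns of t(x); the package's internal handling may differ in detail.

```r
set.seed(1)
# Hypothetical data: 5 rows, 6 columns
x <- matrix(runif(30), nrow = 5)

# Distances between rows: correlate the rows, then apply dist()
# to 1 - correlation, following the formula given above
d_rows <- dist(1 - cor(t(x), method = "pearson"))

# Distances between columns: same computation on the columns
d_cols <- dist(1 - cor(x, method = "pearson"))

print(d_rows)
```

Either object is of class dist and could therefore also be supplied directly as a distfun value.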

hclustfun

default clustering method used to cluster rows and columns. Possible values are:

  • a method name (a character string) supported by hclust (e.g. 'average').

  • an object of class hclust such as returned by hclust

  • a dendrogram

Details

IMPORTANT: although these functions essentially have the same set of arguments, the order of the arguments sometimes differs between them, as well as from aheatmap. We therefore strongly recommend using fully named arguments when calling these functions.

basismap redefines default values for the following arguments of aheatmap:

  • the color palette;

  • the scaling specification, which by default scales each row separately so that they sum up to one (scale='r1');

  • the column ordering which is disabled;

  • argument subsetRow, which additionally accepts feature extraction methods that are passed to extractFeatures. See the argument description here and therein.

  • the addition of a default named annotation track, that shows the dominant basis component for each row (i.e. each feature).

    This track is specified in argument tracks (see its argument description). By default, a matching column annotation track is also displayed, but may be disabled using tracks=':basis'.

  • a suitable title and extra information like the fitting algorithm, when object is a fitted NMF model.

coefmap redefines default values for the following arguments of aheatmap:

  • the color palette;

  • the scaling specification, which by default scales each column separately so that they sum up to one (scale='c1');

  • the row ordering which is disabled;

  • the addition of a default annotation track, that shows the most contributing basis component for each column (i.e. each sample).

    This track is specified in argument tracks (see its argument description). By default, a matching row annotation track is also displayed, but can be disabled using tracks='basis:'.

  • a suitable title and extra information like the fitting algorithm, when object is a fitted NMF model.

consensusmap redefines default values for the following arguments of aheatmap:

  • the colour palette;

  • the column ordering which is set equal to the row ordering, since a consensus matrix is symmetric;

  • the distance and linkage methods used to order the rows (and columns). The default is to use 1 minus the consensus matrix itself as distance, and average linkage.

  • the addition of two special named annotation tracks, 'basis:' and 'consensus:', that show, for each column (i.e. each sample), the dominant basis component in the best fit and the hierarchical clustering of the consensus matrix respectively (using 1-consensus as distance and average linkage).

    These tracks are specified in argument tracks, which behaves as in basismap.

  • a suitable title and extra information like the type of NMF model or the fitting algorithm, when object is a fitted NMF model.
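The default ordering of a consensus matrix (1 minus the consensus as distance, average linkage) can be sketched in base R. The matrix C below is a hypothetical consensus matrix for four samples, and the code illustrates the clustering step only, not the plotting.

```r
# Hypothetical symmetric consensus matrix for four samples
C <- matrix(c(1.0, 0.9, 0.1, 0.2,
              0.9, 1.0, 0.2, 0.1,
              0.1, 0.2, 1.0, 0.8,
              0.2, 0.1, 0.8, 1.0),
            nrow = 4,
            dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# Samples that often cluster together get a small distance
d <- as.dist(1 - C)

# Average-linkage hierarchical clustering of the samples
hc <- hclust(d, method = "average")

# Cut the tree into two consensus clusters
clusters <- cutree(hc, k = 2)
print(clusters)
```

Because the same ordering is applied to rows and columns, blocks of samples that are consistently clustered together appear along the diagonal of the heatmap.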

Methods

basismap

signature(object = "NMF"): Plots a heatmap of the basis matrix of the NMF model object. This method also works for fitted NMF models (i.e. NMFfit objects).

basismap

signature(object = "NMFfitX"): Plots a heatmap of the basis matrix of the best fit in object.

coefmap

signature(object = "NMF"): The default method for NMF objects has special default values for some arguments of aheatmap (see argument description).

coefmap

signature(object = "NMFfitX"): Plots a heatmap of the coefficient matrix of the best fit in object.

This method adds:

  • an extra special column annotation track for multi-run NMF fits, 'consensus:', that shows the consensus cluster associated to each sample.

  • a column sorting schema 'consensus' that can be passed to argument Colv and orders the columns using the hierarchical clustering of the consensus matrix with average linkage, as returned by consensushc(object). This is also the ordering that is used by default for the heatmap of the consensus matrix as plotted by consensusmap.

consensusmap

signature(object = "NMFfitX"): Plots a heatmap of the consensus matrix obtained when fitting an NMF model with multiple runs.

consensusmap

signature(object = "NMF"): Plots a heatmap of the connectivity matrix of an NMF model.

consensusmap

signature(object = "matrix"): Main method that redefines default values for arguments of aheatmap.

Examples

#----------
# heatmap-NMF
#----------
## More examples are provided in demo `heatmaps`
## Not run: 
demo(heatmaps)

## End(Not run)
##

# random data with underlying NMF model
v <- syntheticNMF(20, 3, 10)
# estimate a model
x <- nmf(v, 3)

#----------
# basismap
#----------
# show basis matrix
basismap(x)
## Not run: 
# without the default annotation tracks
basismap(x, tracks=NA)

## End(Not run)

#----------
# coefmap
#----------
# coefficient matrix
coefmap(x)
## Not run: 
# without the default annotation tracks
coefmap(x, tracks=NA)

## End(Not run)

#----------
# consensusmap
#----------
## Not run: 
# multiple runs on the data matrix v (not on the fitted model x)
res <- nmf(v, 3, nrun=3)
consensusmap(res)

## End(Not run)
