Visualize cluster assignments across multiple clusterings

Description

Align multiple clusterings of the same set of samples and provide a color-coded plot of their shared cluster assignments.

Usage

## S4 method for signature 'ClusterExperiment,character'
plotClusters(clusters,
  whichClusters = c("workflow", "all"), ...)

## S4 method for signature 'ClusterExperiment,numeric'
plotClusters(clusters, whichClusters,
  existingColors = c("ignore", "all"), resetNames = FALSE,
  resetColors = FALSE, resetOrderSamples = FALSE, sampleData = NULL, ...)

## S4 method for signature 'ClusterExperiment,missing'
plotClusters(clusters, whichClusters, ...)

## S4 method for signature 'matrix,missing'
plotClusters(clusters, whichClusters,
  orderSamples = NULL, sampleData = NULL, reuseColors = FALSE,
  matchToTop = FALSE, plot = TRUE, unassignedColor = "white",
  missingColor = "grey", minRequireColor = 0.3, startNewColors = FALSE,
  colPalette = bigPalette, input = c("clusters", "colors"),
  clNames = colnames(clusters), add = FALSE, xCoord = NULL, ylim = NULL,
  tick = FALSE, ylab = "", xlab = "", axisLine = 0, box = FALSE, ...)

Arguments

clusters

A matrix with each column corresponding to a clustering and each row to a sample, or a ClusterExperiment object. If a matrix, the clusterings are plotted in the order of the matrix columns, and this order greatly influences the plot.
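
As a minimal sketch of the matrix form, the following builds a samples-by-clusterings matrix from a few k-means runs on simulated data. The object names (dat, clMat) and the choice of kmeans are illustrative assumptions and not part of the package.

```r
# Simulate 60 samples with 5 features (illustrative data only)
set.seed(1)
dat <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)

# One clustering per column, one sample per row
ks <- 2:4
clMat <- sapply(ks, function(k) kmeans(dat, centers = k)$cluster)
colnames(clMat) <- paste0("k=", ks)

dim(clMat)   # number of samples, number of clusterings
head(clMat)
```

A matrix of this shape could then be passed as clusters; reordering its columns changes the order in which the clusterings are drawn and aligned.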

whichClusters

If numeric, a predefined order for the clusterings in the plot. If clusters is a ClusterExperiment object, whichClusters can be a character value identifying the clusterTypes to be used; alternatively, whichClusters can be either 'all' or 'workflow' to choose all clusterings or all workflow clusterings, respectively.

...

Arguments passed either to the matrix method of plotClusters or, ultimately, to plot (if add=FALSE).

existingColors

How to make use of the existing colors in the ClusterExperiment object. 'ignore' will ignore them and assign new colors. 'firstOnly' will use the existing colors of only the first clustering and assign new colors to the rest (not implemented yet). 'all' will use all of the existing colors.

resetNames

logical. Whether to reset the names of the clusters in clusterLegend to be the aligned integer-valued ids from plotClusters.

resetColors

logical. Whether to reset the colors in clusterLegend in the ClusterExperiment returned to be the colors from the plotClusters.

resetOrderSamples

logical. Whether to replace the existing orderSamples slot in the ClusterExperiment object with the new order found.

sampleData

If clusters is a matrix, sampleData gives a matrix of additional cluster/sampleData on the samples to be plotted with the clusterings given in clusters. Values in sampleData will be added to the end (bottom) of the plot. If clusters is a ClusterExperiment object, sampleData must be either an index or a character vector that references a column or column name, respectively, of the colData slot of the ClusterExperiment object.

orderSamples

A predefined order in which the samples will be plotted. If NULL, the order is found internally by aligning the clusters (assuming input="clusters").

reuseColors

Logical. Whether each row should consist of the same set of colors. By default (FALSE), each cluster that the algorithm does not match to a cluster in the previous rows gets a new color.

matchToTop

Logical as to whether all clusters should be aligned to the first row. By default (FALSE) each cluster is aligned to the ordered clusters of the row above it.

plot

Logical as to whether a plot should be produced.

unassignedColor

Samples with the value -1 in clusters are given this color (meant for samples not assigned to a cluster).

missingColor

Samples with the value -2 in clusters are given this color (meant for samples that were missing from the clustering, mainly when comparing clusterings run on different sets of samples).
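
To illustrate the -1 and -2 codes, here is a small sketch of a two-column clusters matrix in which the second clustering was run on only some of the samples. The labels, the subset, and the object names are made up for illustration.

```r
set.seed(2)
n <- 12
full <- sample(1:3, n, replace = TRUE)   # clustering run on all samples

# Hypothetical second clustering, run on only the first 8 samples
subsetIdx <- 1:8
partial <- rep(-2, n)                    # -2: sample missing from this clustering
partial[subsetIdx] <- sample(1:2, length(subsetIdx), replace = TRUE)
partial[c(2, 5)] <- -1                   # -1: sample included but left unassigned

clMat <- cbind(full = full, partial = partial)
table(clMat[, "partial"])
```

In a plot of such a matrix, the -1 entries would take unassignedColor and the -2 entries missingColor.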

minRequireColor

When aligning colors between rows of clusters, the minimum overlap required, given as a proportion between 0 and 1 (the default, 0.3, corresponds to 30 percent).
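
The idea of an overlap threshold can be sketched in base R as follows. This is only one plausible reading of "overlap" (the fraction of a cluster's samples shared with a cluster in the row above) and is not taken from the package's alignment code; the labels are invented.

```r
# Two clusterings of the same 10 samples (made-up labels)
above <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
below <- c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3)

# Cross-tabulate: rows are clusters in 'below', columns are clusters in 'above'
tab <- table(below, above)

# Fraction of each 'below' cluster that falls in each 'above' cluster
overlap <- prop.table(tab, margin = 1)

minRequireColor <- 0.3   # same value as the documented default
# TRUE where a 'below' cluster shares enough samples with an 'above' cluster
overlap >= minRequireColor
```

Under this reading, a cluster with no entry above the threshold would not inherit a color from the row above and would receive a new one.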

startNewColors

Logical. When aligning colors between rows of clusters, whether the colors should restart at the beginning of colPalette as long as they do not appear in the immediately preceding row (some of the colors at the end of bigPalette are a bit wonky, so this can be useful if you have a large clusters matrix).

colPalette

A vector of colors used for the different clusters. Must be at least as long as the maximum number of clusters found in any single clustering (column) given in clusters; otherwise an error is returned.

input

Indicates whether the input matrix is a matrix of integer cluster assignments or contains the colors. If input="colors", then the object clusters is a matrix of colors and there is no alignment (this option allows the user to manually adjust the colors and replot, for example).

clNames

names to go with the columns (clusterings) of the clusters matrix.

add

whether to add to existing plot.

xCoord

values on x-axis at which to plot the rows (samples).

ylim

vector of limits of y-axis.

tick

logical, whether to draw ticks on x-axis for each sample.

ylab

character string for the label of y-axis.

xlab

character string for the label of x-axis.

axisLine

the margin line on which the y-axis labels are drawn (passed to the line argument of the axis call).

box

logical, whether to draw box around the plot.

Details

All arguments of the matrix version can be passed to the ClusterExperiment version. As noted above, however, some arguments have different interpretations.

If whichClusters = "workflow", then the workflow clusterings will be plotted in the following order: final, mergeClusters, combineMany, clusterMany.

Value

If clusters is a ClusterExperiment object, then plotClusters returns an updated ClusterExperiment object, where the clusterLegend and/or orderSamples slots have been updated (depending on the arguments).

If clusters is a matrix, plotClusters returns (invisibly) the sample order and the other components that go into making the plot. Specifically, it returns a list with the following elements.

  • index a vector of length equal to nrow(clusters) giving the order of the rows (samples) to use to get the original clusters matrix into the order made by plotClusters.

  • colors matrix of color assignments for each element of original clusters matrix. Matrix is in the same order as original clusters matrix. The matrix colors[index,] is the matrix that can be given back to plotClusters to recreate the plot (see examples).

  • alignedClusterIds a matrix of integer valued cluster assignments that match the colors. This is useful if you want to have cluster identification numbers that are better aligned than that given in the original clusters. Again, the matrix is in same order as original input matrix.

  • clusterLegend a list of length equal to the number of columns of the input matrix. The elements of the list are matrices, each with three columns named "Original", "Aligned", and "Color" giving, respectively, the original cluster ids in clusters, the corresponding aligned cluster ids in alignedClusterIds, and the color.

Author(s)

Elizabeth Purdom and Marla Johnson (based on the tracking plot in ConsensusClusterPlus by Matt Wilkerson and Peter Waltman).

See Also

The ConsensusClusterPlus package.

Examples

#clustering using pam: try using different dimensions of pca and different k
data(simData)

cl <- clusterMany(simData, nPCADims=c(5, 10, 50), dimReduce="PCA",
                  clusterFunction="pam", ks=2:4, findBestK=c(TRUE,FALSE),
                  removeSil=c(TRUE,FALSE))

clusterLabels(cl)

#make names shorter for better plotting
x <- clusterLabels(cl)
x <- gsub("TRUE", "T", x)
x <- gsub("FALSE", "F", x)
x <- gsub("k=NA,", "", x)
x <- gsub("Features", "", x)
clusterLabels(cl) <- x

par(mar=c(2,10,1,1))
#this stores the choices made by plotClusters (sample order and colors) in cl
cl <- plotClusters(cl, axisLine=-1, resetOrderSamples=TRUE, resetColors=TRUE)

#see the new cluster colors
clusterLegend(cl)[1:2]

#We can also change the order of the clusterings. Notice how this
#dramatically changes the plot!
clOrder <- c(3:6, 1:2, 7:ncol(clusterMatrix(cl)))
cl <- plotClusters(cl, whichClusters=clOrder, resetColors=TRUE,
                   resetOrderSamples=TRUE, axisLine=-2)

#We can manually switch the red ("#E31A1C") and green ("#33A02C") in the
#first cluster:

#see what the default colors are and their names
showBigPalette(wh=1:5)

#change "#E31A1C" to "#33A02C"
newColorMat <- clusterLegend(cl)[[clOrder[1]]]
newColorMat[2:3, "color"] <- c("#33A02C", "#E31A1C")
clusterLegend(cl)[[clOrder[1]]]<-newColorMat

#replot with the manually edited colors by setting 'existingColors="all"'
par(mfrow=c(1,2))
plotClusters(cl, whichClusters=clOrder, orderSamples=orderSamples(cl),
             existingColors="all")
#for comparison, replot with newly assigned colors
plotClusters(cl, whichClusters=clOrder, resetColors=TRUE,
             resetOrderSamples=TRUE, axisLine=-2)
par(mfrow=c(1,1))

#set some of clusterings arbitrarily to "-1", meaning not clustered (white),
#and "-2" (another possible designation getting gray, usually for samples not
#included in original clustering)
clMatNew <- apply(clusterMatrix(cl), 2, function(x) {
  wh <- sample(1:nSamples(cl), size=10)
  x[wh] <- -1
  wh <- sample(1:nSamples(cl), size=10)
  x[wh] <- -2
  return(x)
})

#make a new object
cl2 <- clusterExperiment(assay(cl), clMatNew,
                         transformation=transformation(cl))
plotClusters(cl2)