plotClusters: Visualize cluster assignments across multiple clusterings
In epurdom/clusterCells: Compare Clusterings for Single-Cell Sequencing

plotClusters

R Documentation

Description

Align multiple clusterings of the same set of samples and provide a color-coded plot of their shared cluster assignments.

Usage

## S4 method for signature 'ClusterExperiment'
plotClusters(
  object,
  whichClusters,
  existingColors = c("ignore", "all", "firstOnly"),
  resetNames = FALSE,
  resetColors = FALSE,
  resetOrderSamples = FALSE,
  colData = NULL,
  clusterLabels = NULL,
  ...
)

## S4 method for signature 'matrix'
plotClusters(
  object,
  orderSamples = NULL,
  colData = NULL,
  reuseColors = FALSE,
  matchToTop = FALSE,
  plot = TRUE,
  unassignedColor = "white",
  missingColor = "grey",
  minRequireColor = 0.3,
  startNewColors = FALSE,
  colPalette = massivePalette,
  input = c("clusters", "colors"),
  clusterLabels = colnames(object),
  add = FALSE,
  xCoord = NULL,
  ylim = NULL,
  tick = FALSE,
  ylab = "",
  xlab = "",
  axisLine = 0,
  box = FALSE,
  ...
)

Arguments

`object`	A matrix with each column corresponding to a clustering and each row to a sample, or a `ClusterExperiment` object. If a matrix, the function plots the clusterings in the column order of this matrix, and that order greatly influences the plot.
`whichClusters`	argument that can be either numeric or character vector indicating the clusterings to be used. See details of `getClusterIndex`.
`existingColors`	how to make use of the existing colors in the `ClusterExperiment` object. 'ignore' will ignore them and assign new colors. 'firstOnly' will use the existing colors of only the first clustering, then align the remaining clusterings and assign new colors to those only. 'all' will use all of the existing colors.
`resetNames`	logical. Whether to reset the names of the clusters in `clusterLegend` to be the aligned integer-valued ids from `plotClusters`.
`resetColors`	logical. Whether to reset the colors in `clusterLegend` in the `ClusterExperiment` returned to be the colors from the `plotClusters`.
`resetOrderSamples`	logical. Whether to replace the existing `orderSamples` slot in the `ClusterExperiment` object with the new order found.
`colData`	If `object` is a matrix, `colData` gives a matrix of additional cluster/sample data on the samples to be plotted with the clusterings given in `object`. Values in `colData` will be added to the end (bottom) of the plot. NAs in the `colData` matrix will trigger an error. If `object` is a `ClusterExperiment` object, `colData` refers to columns of the `colData` slot of the `ClusterExperiment` object to be plotted with the clusters. In this case, `colData` can be TRUE (i.e. all columns will be plotted), or an index or a character vector that references a column or column name, respectively, of the `colData` slot of the `ClusterExperiment` object. If there are NAs in the `colData` columns, they will be encoded as 'unassigned' and receive the same color as 'unassigned' samples in the clustering.
`clusterLabels`	names to go with the columns (clusterings) in matrix `colorMat`. If `colData` argument is not `NULL`, the `clusterLabels` argument must include names for the sample data too. If the user gives only names for the clusterings, the code will try to anticipate that and use the column names of the sample data, but this is fragile. If set to `FALSE`, then no labels will be plotted.
`...`	arguments passed either to the `plotClusters` method for matrices or ultimately to `plot` (if `add=FALSE`).
`orderSamples`	A predefined order in which the samples will be plotted. Otherwise the order will be found internally by aligning the clusters (assuming `input="clusters"`).
`reuseColors`	Logical. Whether each row should consist of the same set of colors. By default (FALSE), each cluster that the algorithm cannot match to a cluster in the previous row gets a new color.
`matchToTop`	Logical as to whether all clusters should be aligned to the first row. By default (FALSE) each cluster is aligned to the ordered clusters of the row above it.
`plot`	Logical as to whether a plot should be produced.
`unassignedColor`	Samples with value “-1” in `object` will be given this color (meant for samples not assigned to a cluster).
`missingColor`	Samples with value “-2” in `object` will be given this color (meant for samples that were missing from the clustering, mainly when comparing clusterings run on different sets of samples).
`minRequireColor`	In aligning colors between rows of clusters, require this proportion of overlap (the default is 0.3).
`startNewColors`	logical, indicating whether, in aligning colors between rows of clusters, the colors should restart at the beginning of `colPalette` as long as those colors are not in the immediately preceding row (the colors at the end of `massivePalette` are all of `colors()` and many will be indistinguishable, so this option can be useful if you have a large cluster matrix).
`colPalette`	a vector of colors used for the different clusters. Must be at least as long as the maximum number of clusters found in any single clustering/column given in `object`; otherwise an error is returned.
`input`	indicates whether the input matrix is a matrix of integer cluster assignments or contains the colors. If `input="colors"`, then `object` is a matrix of colors and there is no alignment (this option allows the user to manually adjust the colors and replot, for example).
`add`	whether to add to existing plot.
`xCoord`	values on x-axis at which to plot the rows (samples).
`ylim`	vector of limits of y-axis.
`tick`	logical, whether to draw ticks on x-axis for each sample.
`ylab`	character string for the label of y-axis.
`xlab`	character string for the label of x-axis.
`axisLine`	the margin line on which the y-axis labels should be drawn (passed to `line = ...` in the axis call).
`box`	logical, whether to draw box around the plot.
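
The special codes described for `unassignedColor` and `missingColor` can be illustrated with a small base R sketch. This is not the package's implementation and performs no alignment between clusterings; the helper `idsToColors` and the three-color palette `pal` are hypothetical names introduced only for illustration. The result is a character matrix of colors, similar in shape to what `input="colors"` expects.

```r
# Hypothetical helper (not part of the package): map the integer ids of one
# clustering to colors, treating -1 as unassigned and -2 as missing.
idsToColors <- function(ids, pal,
                        unassignedColor = "white", missingColor = "grey") {
  out <- character(length(ids))
  out[ids == -1] <- unassignedColor
  out[ids == -2] <- missingColor
  assigned <- ids > 0
  # Positive ids take palette colors in the order of their sorted unique values
  uniqueIds <- sort(unique(ids[assigned]))
  if (length(uniqueIds) > length(pal)) {
    stop("palette is shorter than the number of clusters")
  }
  out[assigned] <- pal[match(ids[assigned], uniqueIds)]
  out
}

# Two clusterings (columns) of six samples (rows)
clusters <- cbind(A = c(1, 1, 2, 2, -1, -2),
                  B = c(1, 2, 2, 3, 3, -1))
pal <- c("#E31A1C", "#33A02C", "#1F78B4")

# One column of colors per clustering
colorMat <- apply(clusters, 2, idsToColors, pal = pal)
colorMat
```

The length check mirrors the documented requirement that `colPalette` be at least as long as the largest number of clusters in any single column.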

Details

All arguments of the matrix version can be passed to the ClusterExperiment version. As noted above, however, some arguments have different interpretations.

If whichClusters = "workflow", then the workflow clusterings will be plotted in the following order: final, mergeClusters, makeConsensus, clusterMany.

Value

If object is a ClusterExperiment object, then plotClusters returns an updated ClusterExperiment object, where the clusterLegend and/or orderSamples slots have been updated (depending on the arguments).

If object is a matrix, plotClusters returns (invisibly) the sample order and the other components used to make the plot. Specifically, it returns a list with the following elements.

`orderSamples`	a vector of length equal to nrow(object) giving the order of the samples (rows) to use to get the original clusters matrix into the order made by plotClusters.
`colors`	matrix of color assignments for each element of the original clusters matrix. The matrix is in the same order as the original clusters matrix. The matrix colors[orderSamples,] is the matrix that can be given back to plotClusters to recreate the plot (see examples).
`alignedClusterIds`	a matrix of integer-valued cluster assignments that match the colors. This is useful if you want cluster identification numbers that are better aligned than those given in the original clusters. Again, the rows/samples are in the same order as the original input matrix.
`clusterLegend`	list of length equal to the number of columns of the input matrix. The elements of the list are matrices, each with three columns named "Original", "Aligned", and "Color" giving, respectively, the original cluster ids in the input matrix, the corresponding aligned cluster ids, and the color.
`origClusters`	the original matrix of clusters given to plotClusters.
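
The round trip described for the `colors` element can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes the package is installed under the name `clusterExperiment` (the block is skipped otherwise), and the toy matrix `clMat` of random assignments is made up for illustration.

```r
# Assumption: plotClusters() is provided by an installed package named
# 'clusterExperiment'; nothing is run if it is not available.
if (requireNamespace("clusterExperiment", quietly = TRUE)) {
  library(clusterExperiment)

  # Toy input: three clusterings (columns) of 30 samples (rows)
  set.seed(1)
  clMat <- cbind(k2 = sample(1:2, 30, replace = TRUE),
                 k3 = sample(1:3, 30, replace = TRUE),
                 k4 = sample(1:4, 30, replace = TRUE))

  # Compute the alignment without drawing anything
  out <- plotClusters(clMat, plot = FALSE)
  names(out)

  # Replot from the returned colors, reordered by the returned sample order
  plotClusters(out$colors[out$orderSamples, ], input = "colors",
               clusterLabels = colnames(clMat))
}
```

Passing `clusterLabels` explicitly avoids relying on the returned color matrix carrying column names. Entries of `out$colors` can be edited by hand before the second call to adjust individual colors.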

Author(s)

Elizabeth Purdom and Marla Johnson (based on the tracking plot in ConsensusClusterPlus by Matt Wilkerson and Peter Waltman).

References

Wilkerson, M. D. and Hayes, D. N. (2010). "ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking." Bioinformatics, 26(12), pp. 1572-1573.

Examples

## Not run: 
#clustering using pam: try using different dimensions of pca and different k
data(simData)

cl <- clusterMany(simData, nReducedDims=c(5, 10, 50), reduceMethod="PCA",
                  clusterFunction="pam", ks=2:4, findBestK=c(TRUE,FALSE),
                  removeSil=c(TRUE,FALSE), makeMissingDiss=TRUE)

clusterLabels(cl)

#make names shorter for better plotting
x <- clusterLabels(cl)
x <- gsub("TRUE", "T", x)
x <- gsub("FALSE", "F", x)
x <- gsub("k=NA,", "", x)
x <- gsub("Features", "", x)
clusterLabels(cl) <- x

par(mar=c(2,10,1,1))
#plot the clusterings, and save the sample order and colors chosen by
#plotClusters in the returned object
cl <- plotClusters(cl, axisLine=-1, resetOrderSamples=TRUE, resetColors=TRUE)

#see the new cluster colors
clusterLegend(cl)[1:2]

#We can also change the order of the clusterings. Notice how this
#dramatically changes the plot!
clOrder <- c(3:6, 1:2, 7:ncol(clusterMatrix(cl)))
cl <- plotClusters(cl, whichClusters=clOrder, resetColors=TRUE,
                   resetOrderSamples=TRUE, axisLine=-2)

#We can manually switch the red ("#E31A1C") and green ("#33A02C") in the
#first cluster:

#see what the default colors are and their names
showPalette(wh=1:5)

#change "#E31A1C" to "#33A02C"
newColorMat <- clusterLegend(cl)[[clOrder[1]]]
newColorMat[2:3, "color"] <- c("#33A02C", "#E31A1C")
clusterLegend(cl)[[clOrder[1]]]<-newColorMat

#replot with the edited colors by setting existingColors="all" (left),
#and compare with the colors plotClusters would assign itself (right)
par(mfrow=c(1,2))
plotClusters(cl, whichClusters=clOrder, orderSamples=orderSamples(cl),
             existingColors="all")
plotClusters(cl, whichClusters=clOrder, resetColors=TRUE,
             resetOrderSamples=TRUE, axisLine=-2)
par(mfrow=c(1,1))

#set some of clusterings arbitrarily to "-1", meaning not clustered (white),
#and "-2" (another possible designation getting gray, usually for samples not
#included in original clustering)
clMatNew <- apply(clusterMatrix(cl), 2, function(x) {
  wh <- sample(1:nSamples(cl), size=10)
  x[wh] <- -1
  wh <- sample(1:nSamples(cl), size=10)
  x[wh] <- -2
  return(x)
})

#make a new object
cl2 <- ClusterExperiment(assay(cl), clMatNew,
                         transformation=transformation(cl))
plotClusters(cl2)

## End(Not run)

epurdom/clusterCells documentation built on April 28, 2024, 8:14 p.m.