bootplot: function to make a barplot of bootstrap estimated cluster...

Description

After clustering, the boothopach or bootmedoids function can be used to estimate the membership of each clustered element in each of the identified clusters (fuzzy clustering). The proportion of bootstrap resampled data sets in which an element is assigned to a given cluster is called the "reappearance proportion" for that element and that cluster. This function plots these proportions in a colored barplot.
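To make the definition of the reappearance proportion concrete, here is a minimal base-R sketch. The `assignments` matrix and its values are hypothetical, and the code only illustrates the definition; it is not how boothopach or bootmedoids compute their output.

```r
# Hypothetical cluster labels for 4 elements across B = 5 bootstrap resamples
# (rows = elements, columns = resamples); the values are made up.
assignments <- matrix(c(1, 1, 1, 2, 1,
                        1, 1, 1, 1, 1,
                        2, 2, 1, 2, 2,
                        2, 2, 2, 2, 2),
                      nrow = 4, byrow = TRUE,
                      dimnames = list(paste0("Var", 1:4), NULL))
k <- 2  # number of clusters

# Reappearance proportion: fraction of resamples in which
# element i (row) was assigned to cluster j (column)
reappearance <- sapply(seq_len(k), function(j) rowMeans(assignments == j))
colnames(reappearance) <- paste0("Cluster", seq_len(k))
reappearance
```

Each row of the resulting matrix sums to 1, which matches the layout described for bootobj below: one row per element and one column per cluster.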

Usage

bootplot(bootobj, hopachobj, ord = "bootp", main = NULL, labels = NULL,
         showclusters = TRUE, ...)

Arguments

bootobj

output of boothopach or bootmedoids applied to the genes - a matrix of bootstrap estimated cluster membership probabilities, with a row for each row in data and a column for each cluster.

hopachobj

output of the hopach function. If bootobj was generated using bootmedoids (i.e. hopach was not run), then the bootplot function can still be used by creating a hopachobj which is a list with at least the following two components: hopachobj$clustering$sizes (the number of elements in each cluster; its length should be ncol(bootobj)) and hopachobj$clustering$order (an ordering of the elements such that elements in the same cluster appear next to each other; elements may also be ordered within cluster). By changing the value of hopachobj$clustering$order, the order of the elements in the barplot can be altered.
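A minimal sketch of building such a list in base R follows. The toy `bootobj` values are made up, and deriving cluster labels from the largest proportion in each row is an assumption made for illustration; in practice the labels would normally come from whatever clustering produced the medoids.

```r
# Hypothetical bootobj: 6 elements, 2 clusters (values made up)
bootobj <- matrix(c(0.9, 0.1,
                    0.2, 0.8,
                    0.7, 0.3,
                    0.1, 0.9,
                    1.0, 0.0,
                    0.4, 0.6),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(paste0("Var", 1:6), c("C1", "C2")))

# Assumption: label each element by the cluster with the largest proportion
cluster_labels <- max.col(bootobj, ties.method = "first")

# Minimal list with the two components named in the description above
hopachobj <- list(clustering = list(
  # one count per column of bootobj, including clusters that end up empty
  sizes = as.vector(table(factor(cluster_labels,
                                 levels = seq_len(ncol(bootobj))))),
  # sorting by label makes elements of the same cluster contiguous
  order = order(cluster_labels)
))
str(hopachobj)
```

With the hopach package loaded, a list of this shape could then be passed as the second argument, e.g. bootplot(bootobj, hopachobj). Options such as ord="final" may need components that this sketch does not provide.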

ord

character string indicating how to order the elements (rows) in the barplot. If ord="none", then the elements are plotted in the same order as in bootobj, i.e. the same order as the original data matrix. If ord="final", the ordering of elements in the final level of the hopach hierarchical tree is used. If ord="cluster", the ordering from the level of the hopach tree corresponding to the main clusters is used. If ord="bootp", the elements are ordered first by main cluster and then by bootstrap reappearance proportion within cluster, so that elements with the highest membership in the cluster appear at the bottom. In the last three cases, the elements from each cluster will be contiguous. If ord="final", then the medoid element will appear in the middle of each cluster. If ord="cluster", the ordering depends on the value of the ord argument passed to the hopach function. For example, when ord="own" in hopach, the elements are ordered within cluster based on distance to the medoid, so that the medoid appears first (at the bottom) in the cluster.
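The ord="bootp" rule (main cluster first, then reappearance proportion within cluster) can be sketched in base R as below. This is an illustration of the described ordering, not the package's code: the matrix `props` is hypothetical, and taking the main cluster to be the column with the largest proportion is an assumption (bootplot takes cluster information from hopachobj).

```r
# Hypothetical reappearance proportions: 6 elements, 2 clusters
props <- matrix(c(0.9, 0.1,
                  0.2, 0.8,
                  0.7, 0.3,
                  0.1, 0.9,
                  1.0, 0.0,
                  0.4, 0.6),
                ncol = 2, byrow = TRUE,
                dimnames = list(paste0("Var", 1:6), c("C1", "C2")))

# Assumption: main cluster = column with the largest proportion
main_cluster <- max.col(props, ties.method = "first")

# Each element's proportion in its own main cluster
own_prop <- props[cbind(seq_len(nrow(props)), main_cluster)]

# Sort by cluster, then by decreasing proportion within cluster, so the
# strongest members come first (drawn at the bottom of a horizontal barplot)
ordering <- order(main_cluster, -own_prop)
rownames(props)[ordering]
```

Because the primary sort key is the cluster label, elements of the same cluster stay contiguous, as the argument description requires.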

main

character string to be used as the main title

labels

a vector of labels for the elements being clustered, to be used on the axes. If the number of elements is larger than 50, the labels are not shown.

showclusters

indicator of whether or not to show the cluster boundaries on the plot. If showclusters=TRUE, solid lines are drawn at the edges of the clusters.

...

additional arguments to the barplot plotting function

Details

Each cluster (column of bootobj) is represented by a color. The proportion of bootstrap resampled data sets in which an element appeared in that cluster determines the proportion of the bar for that element which is the corresponding color. As a key, the clusters are labeled on the right margin in text of the same color.
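The idea of one color per cluster and one stacked bar per element can be approximated with base graphics alone. The sketch below is not the bootplot implementation: the toy matrix `props`, the rainbow palette, and the placement of the key are all assumptions chosen for illustration.

```r
# Hypothetical reappearance proportions: 4 elements, 3 clusters
props <- matrix(c(0.8, 0.2, 0.0,
                  0.1, 0.7, 0.2,
                  0.0, 0.3, 0.7,
                  0.5, 0.5, 0.0),
                ncol = 3, byrow = TRUE,
                dimnames = list(paste0("Var", 1:4), paste0("Cluster", 1:3)))

cols <- rainbow(ncol(props))  # assumed palette: one color per cluster

# barplot() stacks the rows of its matrix argument within each bar,
# so transpose to get one bar per element, split by cluster
mids <- barplot(t(props), horiz = TRUE, col = cols, border = NA, space = 0,
                names.arg = rownames(props), las = 1,
                xlab = "Reappearance proportion")

# Key: cluster names in the right margin, each in its cluster's color
mtext(colnames(props), side = 4, at = seq_len(ncol(props)),
      col = cols, las = 1)
```

Here the colored share of each bar equals the corresponding entry of `props`, which is the relationship between proportions and bar segments that the paragraph above describes.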

Value

The function bootplot does not return a value. It is called for its side effect of generating a plot.

Note

Thank you to Sandrine Dudoit <sandrine@stat.berkeley.edu> for her input and to Jenny Bryan for the original clusplot code.

Author(s)

Katherine S. Pollard <kpollard@gladstone.ucsf.edu>

References

van der Laan, M.J. and Pollard, K.S. A new algorithm for hybrid hierarchical clustering with visualization and the bootstrap. Journal of Statistical Planning and Inference, 2003, 117, pp. 275-303.

http://www.stat.berkeley.edu/~laan/Research/Research_subpages/Papers/hopach.pdf

See Also

hopach, boothopach, bootmedoids, barplot

Examples

library(hopach)

# Simulate 25 variables in 3 experiments: 10 centered at 0 and 15 centered at 5
mydata <- rbind(
  cbind(rnorm(10, 0, 0.5), rnorm(10, 0, 0.5), rnorm(10, 0, 0.5)),
  cbind(rnorm(15, 5, 0.5), rnorm(15, 5, 0.5), rnorm(15, 5, 0.5))
)
dimnames(mydata) <- list(paste("Var", 1:25, sep = ""), paste("Exp", 1:3, sep = ""))
mydist <- distancematrix(mydata, d = "euclid")

# hopach clustering
clustresult <- hopach(mydata, dmat = mydist)

# bootstrap estimates of cluster membership
myobj <- boothopach(mydata, clustresult)

# plots: without cluster boundaries, then with custom element labels
bootplot(myobj, clustresult, showclusters = FALSE)
bootplot(myobj, clustresult, labels = paste("Sample", LETTERS[1:25], sep = " "))

hopach documentation built on Nov. 8, 2020, 4:54 p.m.