Given a matrix of pairwise distances based on a choice of distance metric, correlationordering computes the empirical correlation (over all pairs of elements) between the distance apart in the rows/columns of the matrix and the distance according to the metric. Correlation ordering will be high if elements close to each other in the matrix have small pairwise distances. If the rows/columns of the distance matrix are ordered according to a clustering of the elements, then correlation ordering should be large compared to a matrix with randomly ordered rows/columns.
dist |
matrix of all pairwise distances between a set of 'p' elements,
as produced, for example, by the |
echo |
indicator of whether the value of correlation ordering before and after rearranging the ordering should be printed. |
Correlation ordering is defined as the empirical correlation between distance in a list and distance according to some other metric. The value in row 'i' and column 'j' of dist is compared to 'j-i'. The function correlationordering computes the correlation ordering for a matrix dist, whereas the function improveordering swaps the ordering of elements in dist until doing so no longer improves correlation ordering. The algorithm for improveordering is not optimized, so the function can be quite slow for more than 50 elements. These functions are used by the hopach clustering function to sensibly order the clusters in the first level of the hierarchical tree, and can also be used to order elements within clusters when the number of elements is not too large.
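The definition above can be illustrated with a minimal base-R sketch. It is not the hopach source: the helper name corr_ordering_sketch is hypothetical, and it assumes the correlation is taken over the pairs with 'i' < 'j' (the upper triangle of the matrix).

```r
# Hypothetical helper, not the hopach implementation: correlation between
# the index separation (j - i) and the distance d[i, j] over all pairs i < j.
corr_ordering_sketch <- function(d) {
  d <- as.matrix(d)
  stopifnot(nrow(d) == ncol(d), nrow(d) >= 3)
  upper <- upper.tri(d)                 # TRUE where row index < column index
  gap <- col(d)[upper] - row(d)[upper]  # j - i for each pair
  cor(gap, d[upper])
}

# Points on a line, already sorted: distance grows with index separation.
x <- c(1, 2, 4, 7, 11)
d_sorted <- as.matrix(dist(x))
corr_ordering_sketch(d_sorted)

# The same points with rows/columns scrambled give a lower value.
perm <- c(3, 1, 5, 2, 4)
corr_ordering_sketch(d_sorted[perm, perm])
```

In this sketch the sorted matrix scores higher than the scrambled one, which is the behaviour the description attributes to a well-ordered distance matrix.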
For correlationordering, a number between -1 and 1, as returned by the cor function, equal to the correlation ordering for the matrix dist.
For improveordering, a vector of length 'p' containing the row indices for the new ordering of the rows/columns of dist, so that dist[ord,ord], where ord is the returned vector, has correlation ordering at least as high as that of dist.
The function improveordering can be very slow for more than about 50 elements. The method employed is a greedy, step-wise algorithm, which sequentially swaps all pairs of elements and accepts any swap that improves correlation ordering.
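The greedy pairwise-swap idea can be sketched in base R as follows. This is an illustration under stated assumptions, not the hopach code: the names score and greedy_swap_sketch are hypothetical, the score is taken over pairs with 'i' < 'j', and passes over all pairs are repeated until no swap helps.

```r
# Hypothetical score: correlation between (j - i) and d[i, j] over pairs i < j.
score <- function(d) {
  upper <- upper.tri(d)
  cor(col(d)[upper] - row(d)[upper], d[upper])
}

# Hypothetical greedy search: try every pairwise swap in turn, keep any swap
# that strictly improves the score, and stop after a pass with no improvement.
greedy_swap_sketch <- function(d) {
  d <- as.matrix(d)
  p <- nrow(d)
  ord <- seq_len(p)
  best <- score(d)
  improved <- TRUE
  while (improved) {
    improved <- FALSE
    for (i in seq_len(p - 1)) {
      for (j in (i + 1):p) {
        trial <- ord
        trial[c(i, j)] <- trial[c(j, i)]   # swap positions i and j
        s <- score(d[trial, trial])
        if (s > best + 1e-12) {            # accept only strict improvements
          ord <- trial
          best <- s
          improved <- TRUE
        }
      }
    }
  }
  ord
}

set.seed(1)
x <- runif(8)
d <- as.matrix(dist(x))
ord <- greedy_swap_sketch(d)
c(before = score(d), after = score(d[ord, ord]))
```

Each pass in this sketch rescans the whole matrix for every candidate swap, which suggests why a search of this kind becomes slow as the number of elements grows.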
Katherine S. Pollard <kpollard@gladstone.ucsf.edu> and Mark J. van der Laan <laan@stat.berkeley.edu>
van der Laan, M.J. and Pollard, K.S. A new algorithm for hybrid hierarchical clustering with visualization and the bootstrap. Journal of Statistical Planning and Inference, 2003, 117, pp. 275-303.
http://www.stat.berkeley.edu/~laan/Research/Research_subpages/Papers/hopach.pdf
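library(hopach)
# simulate 10 elements with 5 measurements each
mydata<-matrix(rnorm(50),nrow=10)
# pairwise Euclidean distances between the rows
mydist<-distancematrix(mydata,d="euclid")
image(as.matrix(mydist))
# correlation ordering under the original row order
correlationordering(mydist)
# greedily reorder, printing the value before and after
neword<-improveordering(mydist,echo=TRUE)
correlationordering(mydist[neword,neword])
image(as.matrix(mydist[neword,neword]))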