Compute a distance metric between two partitions of a set
Description
Given a set partitioned in two ways, compute a distance metric between the partitions.
Usage
partitionMetric(B, C, beta = 2)

Arguments
B 
B and C are vectors that represent partitions of a single set, with each element representing a member of the set. B[i] corresponds to C[i], and the two vectors must be the same length. The data types of B and C must be identical and convertible to a factor data type. See the examples below for more information.
C 
See B above. 
beta 
Beta is the nonlinear parameter used to compute the distance metric. See the publication referenced below for full details. 
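The way B and C encode partitions can be illustrated in base R. The sketch below is only an illustration: `canonical_blocks` is a hypothetical helper that is not part of this package. It relabels each element by the order in which its block first appears, so two vectors that induce the same partition receive identical codes whatever labels they use.

```r
# Hypothetical helper (not part of the package): relabel each element with
# the index of its block, numbering blocks by order of first appearance.
canonical_blocks <- function(x) match(x, unique(x))

B <- c(1, 8, 8)
C <- c(7, 3, 3)

canonical_blocks(B)
canonical_blocks(C)

# Both vectors put element 1 in one block and elements 2 and 3 in another,
# so they describe the same partition even though the labels differ.
identical(canonical_blocks(B), canonical_blocks(C))
```

Only the grouping of positions matters here; the label values themselves carry no meaning.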
Value
The return value is a nonnegative real number representing the distance between the two partitions of the set. Full details are in the paper referenced below.
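To give a feel for how a distance of this kind can depend on beta, here is a minimal sketch built from contingency tables. It assumes a form in which block sizes are raised to the power beta; for beta = 2 this form coincides with the Mirkin distance between partitions. The function `sketch_partition_distance` is hypothetical, it is not the package's implementation, and the value returned by partitionMetric may differ, for example in normalization. The referenced paper remains the authoritative definition.

```r
# Hypothetical illustration only; NOT the implementation of partitionMetric.
# Assumed form: sum |B_i|^beta + sum |C_j|^beta - 2 * sum |B_i n C_j|^beta
sketch_partition_distance <- function(B, C, beta = 2) {
  stopifnot(length(B) == length(C))
  nB  <- table(B)     # block sizes of the first partition
  nC  <- table(C)     # block sizes of the second partition
  nBC <- table(B, C)  # sizes of all pairwise block intersections
  sum(nB^beta) + sum(nC^beta) - 2 * sum(nBC^beta)
}

gender <- c('boy', 'girl', 'girl', 'boy')
height <- c('short', 'tall', 'medium', 'tall')

sketch_partition_distance(gender, gender)   # identical partitions
sketch_partition_distance(gender, height)
sketch_partition_distance(height, gender)   # same value as the line above
```

Under this assumed form, identical partitions are at distance zero and the result does not depend on the order of the arguments, which matches the metric properties checked in the examples.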
Author(s)
David Weisman, Dan Simovici
References
David Weisman and Dan Simovici, Several Remarks on the Metric Space of Genetic Codes. International Journal of Data Mining and Bioinformatics, 2012(6).
See Also
as.dist, hclust
Examples
## Define several partitions of a 4-element set
gender <- c('boy', 'girl', 'girl', 'boy')
height <- c('short', 'tall', 'medium', 'tall')
age <- c(7, 6, 5, 4)
## Compute some distances
(dGG <- partitionMetric (gender, gender))
(dGH <- partitionMetric (gender, height))
(dHG <- partitionMetric (height, gender))
(dGA <- partitionMetric (gender, age))
(dHA <- partitionMetric (height, age))
## These properties must hold for any metric
dGG == 0
dGH == dHG
dGA <= dGH + dHA
## Note that the partition names are irrelevant, and only need to be
## self-consistent within each B and C. It follows that these two set
## partitions are identical and have distance 0.
partitionMetric (c(1,8,8), c(7,3,3)) == 0
## Use the set partition metric to measure amino acid sequence differences
## between several alleles of the aryl hydrocarbon receptor.
data(AhRs)
dim(AhRs)
AhRs[,1:10]
distanceMatrix <-
  matrix(nrow=nrow(AhRs), ncol=nrow(AhRs), 0,
         dimnames=list(rownames(AhRs), rownames(AhRs)))
for (pair in combn(rownames(AhRs), 2, simplify=FALSE)) {
  d <- partitionMetric (AhRs[pair[1],], AhRs[pair[2],], beta=1.01)
  distanceMatrix[pair[1],pair[2]] <- distanceMatrix[pair[2],pair[1]] <- d
}
hc <- hclust(as.dist(distanceMatrix))
plot(hc,
     sub=sprintf('Cophenetic correlation between distances and tree is %0.2f',
                 cor(as.dist(distanceMatrix), cophenetic(hc))))
