dissimilarity | R Documentation |
Compute the dissimilarity between (ensembles of) relations.
relation_dissimilarity(x, y = NULL, method = "symdiff", ...)
x |
an ensemble of relations (see
|
y |
|
method |
a character string specifying one of the built-in
methods for computing dissimilarity, or a function to be taken as
a user-defined method. If a character string, its lower-cased
version is matched against the lower-cased names of the available
built-in methods using |
... |
further arguments to be passed to methods. |
Available built-in methods are as follows.
"symdiff"
symmetric difference distance.
This computes the cardinality of the symmetric difference of two
relations, i.e., the number of tuples contained in exactly one of
two relations. For preference relations, this coincides with the
Kemeny-Snell metric (Kemeny and Snell, 1962). For linear
orders, it gives Kendall's \tau metric (Diaconis, 1988).
Can also be referred to as "SD".
Only applicable to crisp relations.
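The definition above can be sketched in base R. This is a minimal illustration, not the package's implementation: the relations are assumed to be given as 0/1 incidence matrices on the same domain, and the helper name symdiff_distance and the matrices I1 and I2 are hypothetical.

```r
# Hypothetical helper: cardinality of the symmetric difference of two crisp
# relations given as 0/1 incidence matrices on the same domain.
symdiff_distance <- function(I1, I2) {
  stopifnot(identical(dim(I1), dim(I2)))
  sum(I1 != I2)  # tuples contained in exactly one of the two relations
}

objs <- c("a", "b", "c")
# Linear order a <= b <= c (reflexive "less than or equal" encoding).
I1 <- matrix(c(1, 1, 1,
               0, 1, 1,
               0, 0, 1), nrow = 3, byrow = TRUE, dimnames = list(objs, objs))
# Linear order a <= c <= b, i.e., b and c swapped.
I2 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 1, 1), nrow = 3, byrow = TRUE, dimnames = list(objs, objs))

symdiff_distance(I1, I2)  # 2: the tuples (b, c) and (c, b)
```

With this encoding, swapping one adjacent pair of a linear order changes exactly two tuples.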
"manhattan"
the Manhattan distance between the incidences.
"euclidean"
the Euclidean distance between the incidences.
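As a rough base-R sketch of these two methods (not the package's code), the distances can be computed entrywise on the incidence matrices, which may hold fuzzy memberships in [0, 1]. The helper names and the matrices I1 and I2 are hypothetical.

```r
# Hypothetical helpers operating on incidence (membership) matrices.
manhattan_distance <- function(I1, I2) sum(abs(I1 - I2))
euclidean_distance <- function(I1, I2) sqrt(sum((I1 - I2)^2))

# Two made-up fuzzy relations on a two-element domain.
I1 <- matrix(c(1.0, 0.5,
               0.2, 1.0), nrow = 2, byrow = TRUE)
I2 <- matrix(c(1.0, 0.1,
               0.6, 1.0), nrow = 2, byrow = TRUE)

manhattan_distance(I1, I2)  # 0.4 + 0.4 = 0.8
euclidean_distance(I1, I2)  # sqrt(0.16 + 0.16)
```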
"CS"
Cook-Seiford distance, a generalization of the
distance function of Cook and Seiford (1978). Let the generalized
rank of an object a in the (first) domain of an endorelation R be
defined as the number of objects b dominating a (i.e., for which
a R b and not b R a), plus half the number of objects b equivalent
to a (i.e., for which a R b and b R a). For preference
relations, this gives the usual Kendall ranks arranged according
to decreasing preference (and averaged for ties). Then the
generalized Cook-Seiford distance is defined as the l_1 distance
between the generalized ranks. For linear orders, this
gives Spearman's footrule metric (Diaconis, 1988).
Only applicable to crisp endorelations.
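The generalized ranks and the resulting l_1 distance can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: relations are 0/1 incidence matrices with entry [a, b] equal to 1 when a R b, the helper names are hypothetical, and an object is counted as equivalent to itself when the relation is reflexive (this adds the same constant to every rank and cancels in the distance when both relations are reflexive).

```r
# Hypothetical helper: generalized ranks from a 0/1 incidence matrix.
generalized_ranks <- function(I) {
  I <- I > 0
  dominating <- I & !t(I)  # a R b and not b R a
  equivalent <- I & t(I)   # a R b and b R a
  rowSums(dominating) + rowSums(equivalent) / 2
}

# l_1 distance between the generalized ranks of two endorelations.
cs_distance <- function(I1, I2) {
  sum(abs(generalized_ranks(I1) - generalized_ranks(I2)))
}

objs <- c("a", "b", "c")
I1 <- matrix(c(1, 1, 1,
               0, 1, 1,
               0, 0, 1), nrow = 3, byrow = TRUE, dimnames = list(objs, objs))  # a <= b <= c
I2 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 1, 1), nrow = 3, byrow = TRUE, dimnames = list(objs, objs))  # a <= c <= b

generalized_ranks(I1)  # a = 2.5, b = 1.5, c = 0.5
cs_distance(I1, I2)    # 2
```

For these two linear orders the result agrees with the footrule distance between the rankings (1, 2, 3) and (1, 3, 2).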
"CKS"
Cook-Kress-Seiford distance, a generalization of
the distance function of Cook, Kress and Seiford (1986). For each
pair of objects a and b in an endorelation R, we can have a R b and
not b R a or vice versa (cases of “strict preference”), a R b and
b R a (the case of “indifference”), or neither a R b nor b R a
(the case of “incomparability”). (Only the last two are
possible if a = b.) The distance by Cook, Kress and Seiford
puts indifference as the metric centroid between both preference
cases and incomparability (i.e., indifference is at distance one
from the other three, and each of the other three is at distance
two from the others). The generalized Cook-Kress-Seiford distance
is the paired comparison distance (i.e., a metric) based on these
distances between the four paired comparison cases. (Formula 3 in
the reference must be slightly modified for the generalization
from partial rankings to arbitrary endorelations.)
Only applicable to crisp endorelations.
"score"
score-based distance. This computes \Delta(s(x), s(y)) for
suitable score and distance functions s and \Delta, respectively.
These can be specified by additional arguments score and Delta.
If score is a character string, it is taken as the method for
relation_scores. Otherwise, if given, it must be a function giving
the score function itself. If Delta is a number p \ge 1, the usual
l_p distance is used. Otherwise, it must be a function giving the
distance function. The defaults correspond to using the default
relation scores and p = 1, which for linear orders gives
Spearman's footrule distance.
Only applicable to endorelations.
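The structure \Delta(s(x), s(y)) with an l_p distance can be sketched in base R. This is only an illustration: the score function used here (row sums of the incidence matrix) is a hypothetical stand-in and is not claimed to match the default of relation_scores, and the helper name score_distance is made up.

```r
# Hypothetical helper: l_p distance between scores of two endorelations
# given as incidence matrices. `score` is any function mapping an incidence
# matrix to a numeric vector with one entry per object.
score_distance <- function(I1, I2, score = rowSums, p = 1) {
  stopifnot(p >= 1)
  sx <- score(I1)
  sy <- score(I2)
  sum(abs(sx - sy)^p)^(1 / p)
}

I1 <- matrix(c(1, 1, 1,
               0, 1, 1,
               0, 0, 1), nrow = 3, byrow = TRUE)  # a <= b <= c
I2 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 1, 1), nrow = 3, byrow = TRUE)  # a <= c <= b

score_distance(I1, I2)         # scores (3, 2, 1) vs (3, 1, 2): l_1 distance 2
score_distance(I1, I2, p = 2)  # l_2 distance sqrt(2)
```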
"Jaccard"
Jaccard distance: 1 minus the ratio of the cardinalities of the intersection and the union of the relations.
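For crisp relations, the Jaccard distance can be sketched in base R as below. This is an illustration on hypothetical 0/1 incidence matrices, not the package's implementation; how fuzzy memberships are handled is not covered here, and the sketch assumes the union is non-empty.

```r
# Hypothetical helper: 1 - |intersection| / |union| for crisp relations
# given as 0/1 incidence matrices on the same domain.
jaccard_distance <- function(I1, I2) {
  A <- I1 > 0
  B <- I2 > 0
  1 - sum(A & B) / sum(A | B)
}

I1 <- matrix(c(1, 1, 1,
               0, 1, 1,
               0, 0, 1), nrow = 3, byrow = TRUE)  # a <= b <= c
I2 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 1, 1), nrow = 3, byrow = TRUE)  # a <= c <= b

jaccard_distance(I1, I2)  # 1 - 5/7
```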
"PC"
(generalized) paired comparison distance. This
generalizes the symdiff and CKS distances to use a general set of
discrepancies \delta_{kl} between the possible paired comparison
results with a,b/b,a incidences 0/0, 1/0, 0/1, and 1/1 numbered
from 1 to 4 (in a preference context with a \le encoding, these
correspond to incomparability, strict < and > preference, and
indifference), with \delta_{kl} the discrepancy between possible
results k and l. The distance is then obtained as the sum of the
discrepancies from the paired comparisons of distinct objects,
plus half the sum of discrepancies from the comparisons of
identical objects (for which the only possible results are
incomparability and indifference).
The distance is a metric provided that the \delta_{kl} satisfy the
metric conditions (non-negativity and zero iff k = l, symmetry and
sub-additivity).
The discrepancies can be specified via the additional argument
delta, either as a numeric vector of length 6 with the
non-redundant values \delta_{21}, \delta_{31}, \delta_{41},
\delta_{32}, \delta_{42}, \delta_{43}, or as a character string
partially matching one of the following built-in discrepancies
with corresponding parameter vector \delta:
"symdiff"
symmetric difference distance, with a discrepancy of two between
opposite strict preferences and between indifference and
incomparability, and one between all other distinct results:
\delta = (1, 1, 2, 2, 1, 1) (default).
Can also be referred to as "SD".
"CKS"
Cook-Kress-Seiford distance, see above:
\delta = (2, 2, 1, 2, 1, 1).
"EM"
the distance obtained from the generalization
of the Kemeny-Snell distance for complete rankings to partial
rankings introduced in Emond and Mason (2000). This uses a
discrepancy of two for opposite strict preferences, and one
for all other distinct results:
\delta = (1, 1, 1, 2, 1, 1).
"JMB"
the distance with parameters as suggested by
Jabeur, Martel and Ben Khélifa (2004):
\delta = (4/3, 4/3, 4/3, 5/3, 1, 1).
"discrete"
the discrete metric on the set of paired
comparison results:
\delta = (1, 1, 1, 1, 1, 1).
Only applicable to crisp endorelations.
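The paired comparison construction can be sketched in base R. This is a minimal illustration of the definition, not the package's code: relations are assumed to be 0/1 incidence matrices with entry [a, b] equal to 1 when a R b, and the helper name pc_distance and the example matrices are hypothetical. The six values of delta are read in the order \delta_{21}, \delta_{31}, \delta_{41}, \delta_{32}, \delta_{42}, \delta_{43}, which is the column-wise order of the lower triangle of a 4 x 4 matrix.

```r
# Hypothetical helper: paired comparison distance between two crisp
# endorelations given as 0/1 incidence matrices on the same domain.
pc_distance <- function(I1, I2, delta = c(1, 1, 2, 2, 1, 1)) {
  stopifnot(length(delta) == 6, identical(dim(I1), dim(I2)))
  # Symmetric 4 x 4 discrepancy matrix with zero diagonal.
  D <- matrix(0, 4, 4)
  D[lower.tri(D)] <- delta
  D <- D + t(D)
  # Result code for the pair (a, b): 1 = 0/0, 2 = 1/0, 3 = 0/1, 4 = 1/1.
  K1 <- 1 + I1 + 2 * t(I1)
  K2 <- 1 + I2 + 2 * t(I2)
  d <- matrix(D[cbind(c(K1), c(K2))], nrow = nrow(I1))
  # Distinct objects count once (upper triangle); identical objects count half.
  sum(d[upper.tri(d)]) + sum(diag(d)) / 2
}

I1 <- matrix(c(1, 1, 1,
               0, 1, 1,
               0, 0, 1), nrow = 3, byrow = TRUE)  # a <= b <= c
I2 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 1, 1), nrow = 3, byrow = TRUE)  # a <= c <= b
I3 <- matrix(c(1, 1, 1,
               0, 1, 0,
               0, 0, 1), nrow = 3, byrow = TRUE)  # b and c incomparable

cks <- c(2, 2, 1, 2, 1, 1)
pc_distance(I1, I2)       # opposite strict preferences for (b, c): 2
pc_distance(I1, I3)       # strict preference vs incomparability: 1
pc_distance(I1, I3, cks)  # same pair under the CKS discrepancies: 2
```

With the default discrepancies the result equals the number of differing incidence entries, which is how the "symdiff" method is described above.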
Methods "symdiff", "manhattan", "euclidean" and "Jaccard" take an
additional logical argument na.rm: if true (default: false), tuples
with missing memberships are excluded in the dissimilarity
computations.
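The effect of excluding tuples with missing memberships can be illustrated in base R for a Manhattan-type distance. This is a sketch under the assumption that exclusion simply drops the affected entries from the sum; the helper name manhattan_na and the matrices are hypothetical.

```r
# Hypothetical helper: Manhattan distance on incidence matrices, optionally
# dropping tuples whose membership is missing in either relation.
manhattan_na <- function(I1, I2, na.rm = FALSE) {
  sum(abs(I1 - I2), na.rm = na.rm)
}

I1 <- matrix(c(1, NA,
               0, 1), nrow = 2, byrow = TRUE)  # membership of (1, 2) is missing
I2 <- matrix(c(1, 1,
               1, 1), nrow = 2, byrow = TRUE)

manhattan_na(I1, I2)                # NA: the missing tuple propagates
manhattan_na(I1, I2, na.rm = TRUE)  # 1: only the tuple (2, 1) differs
```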
If y is NULL, an object of class dist containing the
dissimilarities between all pairs of elements of x. Otherwise, a
matrix with the dissimilarities between the elements of x and the
elements of y.
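The two shapes of the result can be mimicked in base R. This sketch only illustrates the return structure and is not the package's implementation: ensembles are stood in for by plain lists of 0/1 incidence matrices, and the helper names symdiff and dissimilarities are hypothetical.

```r
# Hypothetical pairwise dissimilarity on incidence matrices.
symdiff <- function(I1, I2) sum(I1 != I2)

# Hypothetical helper: a "dist" object for one list, a matrix for two lists.
dissimilarities <- function(xs, ys = NULL, d = symdiff) {
  cross <- function(us, vs) {
    m <- matrix(0, length(us), length(vs))
    for (i in seq_along(us)) {
      for (j in seq_along(vs)) m[i, j] <- d(us[[i]], vs[[j]])
    }
    m
  }
  if (is.null(ys)) stats::as.dist(cross(xs, xs)) else cross(xs, ys)
}

I1 <- matrix(c(1, 1, 1, 0, 1, 1, 0, 0, 1), nrow = 3, byrow = TRUE)
I2 <- matrix(c(1, 1, 1, 0, 1, 0, 0, 1, 1), nrow = 3, byrow = TRUE)
I3 <- matrix(c(1, 1, 1, 0, 1, 0, 0, 0, 1), nrow = 3, byrow = TRUE)
xs <- list(I1, I2, I3)

all_pairs <- dissimilarities(xs)            # class "dist", 3 pairs
against_first <- dissimilarities(xs, xs[1]) # 3 x 1 matrix
```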
W. D. Cook, M. Kress and L. M. Seiford (1986), Information and preference in partial orders: a bimatrix representation. Psychometrika 51/2, 197–207. doi:10.1007/BF02293980.
W. D. Cook and L. M. Seiford (1978), Priority ranking and consensus formation. Management Science, 24/16, 1721–1732. doi:10.1287/mnsc.24.16.1721.
P. Diaconis (1988), Group Representations in Probability and Statistics. Institute of Mathematical Statistics: Hayward, CA.
E. J. Emond and D. W. Mason (2000), A new technique for high level decision support. Technical Report ORD Project Report PR2000/13, Operational Research Division, Department of National Defence, Canada.
K. Jabeur, J.-M. Martel and S. Ben Khélifa (2004), A distance-based collective preorder integrating the relative importance of the groups members. Group Decision and Negotiation, 13, 327–349. doi:10.1023/B:GRUP.0000042894.00775.75.
J. G. Kemeny and J. L. Snell (1962), Mathematical Models in the Social Sciences, chapter “Preference Rankings: An Axiomatic Approach”. MIT Press: Cambridge.