This method shows the current pairwise configuration and sets the pairwise measure used to compute the semantic similarity between two concepts of a given ontology. To set the pairwise measure, use one of the short flags described in Details.
pairwiseConfig(object)
pairwiseConfig(object) <- value
## S4 method for signature 'Similarity'
pairwiseConfig(object)
## S4 replacement method for signature 'Similarity'
pairwiseConfig(object) <- value
object | instance of class Similarity
value | one or two of the short flags listed in Details
The following measures can be used to compute semantic similarities between two concepts.
'edge_rada_lca': Computes the similarity of two concepts based on the shortest path linking the two concepts.
sim(u,v) = 1 / sp(u,v)
'edge_wupalmer': Computes the similarity of two concepts based on the depth of the concepts and the depth of their most specific common ancestor (MSCA).
sim(u,v) = depth(MSCA[u,v]) / (depth(u) + depth(v))
'edge_resnik': Computes the similarity of two concepts based on the shortest path between the concepts and the maximum depth of the taxonomy.
sim(u,v) = (2 * max_depth - sp(u,v)) / (2 * max_depth)
max_depth is the maximum depth in the ontology
sp(u,v) is the shortest path length between u and v
'edge_leachod': Computes the similarity of two concepts based on the shortest path, as in Rada, but also considers the maximum depth of the ontology.
sim(u,v) = -log( (sp(u,v) + 1) / (2 * max_depth) )
'edge_slimani': Computes the similarity of two concepts based on the depth of their most specific common ancestor (MCA) and the depths of the concepts.
sim(u,v) = 2 * depth(MCA) / ((depth(u) + depth(v) + 1) * pf)
depth(MCA) is the maximum depth of the most specific common ancestor of the concepts
pf is a penalization factor used when concepts belong to the same hierarchy
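As a rough illustration of the edge-based formulas above, the following sketch evaluates three of them in plain R. The helper names and the toy values are hypothetical and are not part of the package; inside the package these quantities are derived from the ontology graph.

```r
# Hypothetical helpers mirroring the edge-based formulas above.
# sp: shortest path length between u and v
# max_depth: maximum depth in the ontology
edge_rada <- function(sp) 1 / sp
edge_wupalmer <- function(depth_msca, depth_u, depth_v) {
  depth_msca / (depth_u + depth_v)
}
edge_resnik <- function(sp, max_depth) {
  (2 * max_depth - sp) / (2 * max_depth)
}

# Toy values (assumed, not taken from a real ontology)
sp <- 2
max_depth <- 5
edge_rada(sp)                # 1 / 2
edge_wupalmer(2, 3, 4)       # 2 / (3 + 4)
edge_resnik(sp, max_depth)   # (10 - 2) / 10
```

Shorter paths give higher scores in all three helpers, which is the behaviour the descriptions above rely on.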
The following measures require the specification of an additional measure to compute the information content (IC) of nodes.
'lin': Computes the similarity between two concepts based on the information content of the two concepts and the information content of their most informative common ancestor (MICA).
sim(u, v) = (2 * IC(MICA)) / ( IC(u) + IC(v) )
IC(MICA) is the information content of the most informative common ancestor of u and v. MICA is the concept in the ancestors of both u and v that maximizes the Information Content measure.
'resnik': Computes the similarity between two concepts based on the information content of the most informative common ancestor of the compared concepts.
sim(u,v) = IC(MICA)
'schlicker': Computes the similarity between two concepts based on the information content of the most informative common ancestor of the compared concepts and its probability of occurrence.
sim(u,v) = ((2 * IC(MICA)) / (IC(u) + IC(v))) * (1 - Prob_MICA)
Prob_MICA is the probability of occurrence of the most informative common ancestor of the compared concepts
'jaccard': Computes the similarity between two concepts based on the information content of the most informative common ancestor.
sim(u, v) = IC(MICA) / (IC(u) + IC(v) - IC(MICA)) if the sum of the IC of the concepts is different from the IC of the MICA else sim(u, v) = 0.
'sim': This measure is based on lin similarity.
sim(u, v) = lin(u, v) - (1 - (1 / (1+ IC(MICA))))
'jc_norm': Computes the similarity between two concepts based on the IC of the concepts and of their most informative common ancestor.
sim(u,v) = 1 - (IC(u) + IC(v) - 2 * IC(MICA)) / 2
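The information content based formulas above can be tried out with a few lines of plain R. This is only a sketch: the function names and the IC values are hypothetical, and the package computes IC(u), IC(v) and IC(MICA) from the ontology rather than taking them as arguments.

```r
# Hypothetical helpers mirroring the IC-based formulas above.
lin_sim <- function(ic_u, ic_v, ic_mica) (2 * ic_mica) / (ic_u + ic_v)
resnik_sim <- function(ic_mica) ic_mica
jaccard_sim <- function(ic_u, ic_v, ic_mica) {
  denom <- ic_u + ic_v - ic_mica
  # The similarity is defined as 0 when the denominator vanishes
  if (denom == 0) 0 else ic_mica / denom
}
jc_norm_sim <- function(ic_u, ic_v, ic_mica) {
  1 - (ic_u + ic_v - 2 * ic_mica) / 2
}

# Toy IC values (assumed for illustration only)
ic_u <- 0.8
ic_v <- 0.6
ic_mica <- 0.4
lin_sim(ic_u, ic_v, ic_mica)      # 0.8 / 1.4
jaccard_sim(ic_u, ic_v, ic_mica)  # 0.4 / 1.0
jc_norm_sim(ic_u, ic_v, ic_mica)  # 1 - 0.6 / 2
```

When u and v are the same concept, the MICA is the concept itself, so IC(u) = IC(v) = IC(MICA) and lin, jaccard and jc_norm all evaluate to 1.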
Information content based measures require the configuration parameter for estimating concept specificity. Intrinsic estimation uses the topological properties of the taxonomic backbone of the semantic graph. There are different options:
'zhou': Intrinsic estimation of the specificity of the concepts based on their depth in the ontology.
IC(c) = k * (1 - log(D(c)) / log(|C|)) + (1 - k) * (log(depth(c)) / log(depth_max))
k is a factor to adjust the weight of the two terms of the equation
D(c) is the number of hyponyms of concept c
|C| is the number of concepts in the ontology
depth(c) is the maximum depth of concept c
depth_max is the maximum depth in the ontology
'resnik_1995': Intrinsic estimation of the specificity of concepts based on the number of ancestors of the concept.
IC(c) = |A(c)|
'seco': Intrinsic estimation of the specificity of the concepts based on the number of concepts they subsume.
IC(c) = 1 - ( log(D(c)) / log(|C|) )
D(c) is the number of hyponyms of concept c
|C| is the number of concepts in the ontology
'sanchez': Intrinsic estimation of the specificity of the concepts based on the number of leaves and the number of subsumers of the concepts.
IC(c) = -log(x / nb_leaves + 1) with x = |leaves(c)| / |A(c)|
nb_leaves is the number of leaves under the root node of the hierarchy
|leaves(c)| is the number of leaves under the concept c
|A(c)| is the number of concepts that subsume c
'anc_norm': Intrinsic estimation of the specificity of concepts based on the number of ancestors of a given concept normalized on the number of concepts in the ontology.
'depth_min_non_linear': Intrinsic estimation of the specificity of concepts based on their minimum depth.
'depth_max_non_linear': Intrinsic estimation of the specificity of concepts based on their maximum depth.
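To make the intrinsic estimation concrete, here is a small sketch of the 'seco' formula in plain R, reading it as 1 - log(D(c)) / log(|C|). The function name and the counts are hypothetical; the package obtains D(c) and |C| from the loaded ontology.

```r
# Hypothetical helper for the 'seco' intrinsic IC described above.
# n_hyponyms: D(c), the number of hyponyms of concept c (assumed >= 1 here)
# n_concepts: |C|, the number of concepts in the ontology
seco_ic <- function(n_hyponyms, n_concepts) {
  1 - log(n_hyponyms) / log(n_concepts)
}

# Toy ontology with 100 concepts (assumed for illustration)
seco_ic(1, 100)    # a very specific concept
seco_ic(10, 100)   # an intermediate concept
seco_ic(100, 100)  # a concept subsuming the whole ontology
```

The value decreases as a concept subsumes more of the ontology, so general concepts near the root carry little information and specific concepts carry more.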
The pairwise measure currently configured.
An instance of the Similarity class with the new pairwise option.
# Create a Similarity object and load the sample ontology
sim <- new('Similarity')
obo <- system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs')
ontology(sim) <- obo
# Show the current pairwise configuration
pairwiseConfig(sim)
# Set an edge-based pairwise measure on a fresh object
sim <- new('Similarity')
obo <- system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs')
ontology(sim) <- obo
pairwiseConfig(sim) <- 'edge_resnik'
# The following configuration uses an information content based measure
pairwiseConfig(sim) <- c('resnik', 'seco')