pairwiseConfig: 'pairwiseConfig'
In Onassis: OnASSIs Ontology Annotation and Semantic SImilarity software


This method shows the value of the pairwise configuration and configures the pairwise measure used to compute semantic similarity between two concepts of a given ontology. To set the pairwise measure, use one of the available short flags described in the details below.

pairwiseConfig(object)

pairwiseConfig(object) <- value

## S4 method for signature 'Similarity'
pairwiseConfig(object)

## S4 replacement method for signature 'Similarity'
pairwiseConfig(object) <- value

`object`	instance of class `Similarity-class`
`value`	See details

The following measures can be used to compute semantic similarities between two concepts.

'edge_rada_lca' : Computes the similarity of two concepts based on the shortest path linking the two concepts.

sim(u,v) = 1 /sp(u,v)
'edge_wupalmer': Computes the similarity of two concepts based on the depth of the concepts and the depth of their most specific common ancestor

sim(u,v) = depth(MSCA[u,v]) / (depth(u) + depth(v))
'edge_resnik': Computes the similarity of two concepts based on the shortest path between the concepts and the maximum depth of the taxonomy

sim(u,v) = (2 * max_depth - sp(u,v)) / (2 * max_depth)

max_depth is the maximum depth in the ontology

sp(u,v) is the shortest path length between u and v
'edge_leachod': Computes the similarity of two concepts based on the shortest path, as in Rada, but also considering the maximum depth of the ontology

sim(u,v) = -log( (sp(u,v) + 1) / (2 * max_depth) )
'edge_slimani': Computes the similarity of two concepts based on the depth of their most specific common ancestor and the maximum depth of the concepts

sim(u,v) = 2 * depth(MCA) / ((depth(u) + depth(v) + 1) * pf)

depth(MCA) is the maximum depth of the most specific common ancestor (MCA) of the concepts

pf is a penalization factor used when concepts belong to the same hierarchy
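
As an illustration of the edge-based formulas above, here is a minimal base R sketch that evaluates four of them on toy numbers. The input values are hypothetical and not taken from any real ontology, the code does not call Onassis, and for 'edge_leachod' it assumes the denominator is the whole product 2 * max_depth.

```r
# Toy inputs (hypothetical, for illustration only)
depth_u    <- 4   # depth of concept u
depth_v    <- 5   # depth of concept v
depth_msca <- 2   # depth of the most specific common ancestor of u and v
max_depth  <- 6   # maximum depth of the ontology
# Shortest path assumed to go through the common ancestor
sp <- (depth_u - depth_msca) + (depth_v - depth_msca)

edge_rada     <- 1 / sp
edge_wupalmer <- depth_msca / (depth_u + depth_v)
edge_resnik   <- (2 * max_depth - sp) / (2 * max_depth)
# Assumption: the denominator is (2 * max_depth) as a whole
edge_leachod  <- -log((sp + 1) / (2 * max_depth))

scores <- c(edge_rada_lca = edge_rada,
            edge_wupalmer = edge_wupalmer,
            edge_resnik   = edge_resnik,
            edge_leachod  = edge_leachod)
print(round(scores, 4))
```

Note that the measures are on different scales, so scores from different flags are not directly comparable.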

The following measures require the specification of an additional measure to compute the information content of nodes.

'lin': Computes the similarity between two concepts based on the information content of the two concepts and the information content of the most informative common ancestor of the two concepts

sim(u, v) = (2 * IC(MICA)) / ( IC(u) + IC(v) )

IC(MICA) is the information content of the most informative common ancestor of u and v. MICA is the concept in the ancestors of both u and v that maximizes the Information Content measure.
'resnik': Computes the similarity between two concepts based on the information content of the most informative common ancestors of the compared concepts

sim(u,v) = IC(MICA)
'schlicker': Computes the similarity between two concepts based on the information content of the most informative common ancestor of the compared concepts and its probability of occurrence

sim(u,v) = (2 * IC(MICA)) / ( IC(u) + IC(v)) * (1 - Prob_MICA)

Prob_MICA is the probability of occurrence of the most informative common ancestor of the compared concepts
'jaccard': Computes the similarity between two concepts based on the information content of the most informative common ancestor.

sim(u, v) = IC(MICA) / (IC(u) + IC(v) - IC(MICA)) if IC(u) + IC(v) is different from IC(MICA); otherwise sim(u, v) = 0.
'sim': This measure is based on lin similarity

sim(u, v) = lin(u, v) - (1 - (1 / (1+ IC(MICA))))
'jc_norm': Computes the similarity between two concepts based on the IC of the most informative common ancestor of the concepts

sim(u,v) = 1 - (IC(u) + IC(v) - 2 * IC(MICA)) / 2
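
The information content based formulas can be checked by hand once the IC values are known. The sketch below, in base R, evaluates 'lin', 'resnik', 'jaccard' and 'jc_norm' as written above. The function name `ic_similarity` and the IC values are hypothetical, and the code is independent of Onassis.

```r
# Hypothetical helper: evaluates four IC-based formulas from this page
# given IC(u), IC(v) and IC(MICA)
ic_similarity <- function(ic_u, ic_v, ic_mica) {
  denom <- ic_u + ic_v - ic_mica
  c(
    lin     = (2 * ic_mica) / (ic_u + ic_v),
    resnik  = ic_mica,
    # jaccard is defined as 0 when IC(u) + IC(v) equals IC(MICA)
    jaccard = if (denom != 0) ic_mica / denom else 0,
    jc_norm = 1 - (ic_u + ic_v - 2 * ic_mica) / 2
  )
}

# Toy IC values, assumed to lie in [0, 1]
res <- ic_similarity(ic_u = 0.8, ic_v = 0.6, ic_mica = 0.5)
print(round(res, 4))
```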

Information content based measures require the configuration parameter for estimating concept specificity. Intrinsic estimation uses the topological properties of the taxonomic backbone of the semantic graph. There are different options:

'zhou': Intrinsic estimation of the specificity of the concepts based on their depth in the ontology.

IC(c) = k * (1 - log(D(c)) / log(|C|)) + (1 - k) * (log(depth(c)) / log(depth_max))

k is a factor to adjust the weight of the two items of the equation

D(c) is the number of hyponyms of concept c

|C| is the number of concepts in the ontology

depth(c) is the maximum depth of concept c

depth_max is the maximum depth in the ontology
'resnik_1995': Intrinsic estimation of the specificity of concepts based on the number of ancestors of the concept.

IC(c) = |A(c)|
'seco': Intrinsic estimation of the specificity of the concepts based on the number of concepts they subsume.

IC(c) = 1 - ( log(D(c)) / log(|C|) )

D(c) is the number of hyponyms of concept c

|C| is the number of concepts in the ontology
'sanchez': Intrinsic estimation of the specificity of the concepts based on the number of leaves and the number of subsumers of the concepts

IC(c) = -log(x / nb_leaves + 1) with x = |leaves(c)| / |A(c)|

nb_leaves is the number of leaves corresponding to the root node of the hierarchy

leaves(c) is the number of leaves corresponding to the concept c

|A(c)| is the number of concepts that subsume c
'anc_norm': Intrinsic estimation of the specificity of concepts based on the number of ancestors of a given concept normalized on the number of concepts in the ontology.
'depth_min_non_linear': Intrinsic estimation of the specificity of concepts based on their minimum depth.
'depth_max_non_linear': Intrinsic estimation of the specificity of concepts based on their maximum depth.
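
To make the intrinsic estimators more concrete, the following base R sketch computes the 'seco' and 'resnik_1995' formulas on a five-concept toy taxonomy. The taxonomy and all names are hypothetical. Two assumptions are not stated on this page: D(c) is taken to count the concept itself (so that leaves have D(c) = 1 and the logarithm is defined), and A(c) is taken to exclude the concept itself.

```r
# Toy taxonomy as a child -> parent map (hypothetical); the root has no parent
parent <- c(root = NA, a = "root", b = "root", a1 = "a", a2 = "a")
concepts <- names(parent)

# Ancestors of a concept, excluding the concept itself (assumption)
ancestors <- function(concept) {
  out <- character(0)
  p <- parent[[concept]]
  while (!is.na(p)) {
    out <- c(out, p)
    p <- parent[[p]]
  }
  out
}

# D(c): concepts subsumed by c, counting c itself (assumption)
n_subsumed <- sapply(concepts, function(cc) {
  1 + sum(sapply(concepts, function(d) cc %in% ancestors(d)))
})

# 'seco': IC(c) = 1 - log(D(c)) / log(|C|)
ic_seco <- 1 - log(n_subsumed) / log(length(concepts))

# 'resnik_1995': IC(c) = |A(c)|
ic_resnik_1995 <- sapply(concepts, function(cc) length(ancestors(cc)))

print(round(ic_seco, 4))
print(ic_resnik_1995)
```

Under these assumptions the root gets a 'seco' value of 0 and every leaf gets 1, which matches the intuition that more specific concepts carry more information.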

pairwiseConfig(object) returns the pairwise measure.

pairwiseConfig(object) <- value returns an instance of the Similarity class with the new pairwise option.

library(Onassis)
# Create a Similarity object and load the sample ontology
sim <- new('Similarity')
obo <- system.file('extdata', 'sample.cs.obo', package='OnassisJavaLibs')
ontology(sim) <- obo
# Show the current pairwise configuration
pairwiseConfig(sim)
# Set an edge based measure
pairwiseConfig(sim) <- 'edge_resnik'
# The following configuration uses an information content based measure
pairwiseConfig(sim) <- c('resnik', 'seco')
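
The examples above suggest that `value` is either a single edge-based flag or a pair made of an information content based measure followed by an information content estimator. The sketch below encodes that reading in base R. The helper `is_plausible_pairwise_value` is hypothetical and not part of Onassis, and the rule it checks is an assumption inferred from the examples, not a documented contract.

```r
# Flags listed on this page, grouped by role
edge_flags <- c("edge_rada_lca", "edge_wupalmer", "edge_resnik",
                "edge_leachod", "edge_slimani")
ic_pairwise_flags <- c("lin", "resnik", "schlicker", "jaccard",
                       "sim", "jc_norm")
ic_flags <- c("zhou", "resnik_1995", "seco", "sanchez", "anc_norm",
              "depth_min_non_linear", "depth_max_non_linear")

# Hypothetical helper: TRUE when `value` looks like a usable configuration
# under the assumed rule (one edge flag, or an IC measure plus an IC estimator)
is_plausible_pairwise_value <- function(value) {
  if (!is.character(value)) return(FALSE)
  if (length(value) == 1) return(value %in% edge_flags)
  if (length(value) == 2) {
    return(value[1] %in% ic_pairwise_flags && value[2] %in% ic_flags)
  }
  FALSE
}

print(is_plausible_pairwise_value("edge_resnik"))
print(is_plausible_pairwise_value(c("resnik", "seco")))
print(is_plausible_pairwise_value(c("edge_resnik", "seco")))
```

A check of this kind can be run before calling the replacement method to catch misspelled flags early.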