Compute various asymmetric similarities between words


| Argument | Description |
|---|---|
| `x` | A single word, given as a character of |
| `y` | A single word, given as a character of |
| `method` | Specifies the formula to use for the asymmetric similarity computation |
| `t` | A numeric threshold that a dimension value of the vectors has to exceed for the dimension to be considered |
| `tvectors` | The semantic space in which the computation is to be done (a numeric matrix where every row is a word vector) |
| `breakdown` | if |

Asymmetric (or directional) similarities can be useful for examining *hypernymy* (category inclusion); for example, the relation between *dog* and *animal* should be asymmetrical. The general idea is that, if one word is a hyponym of another (i.e. it is semantically narrower), then a significant number of the dimensions that are salient in this word should also be salient in the semantically broader term (Lenci & Benotto, 2012).

In the formulas below, *w_x(f)* denotes the value of vector *x* on dimension *f*. Furthermore, *F_x* is the set of *active* dimensions of vector *x*. A dimension *f* is considered active if
*w_x(f) > t*, with *t* being a pre-defined, free parameter.
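As a small illustration of the definition above, the set of active dimensions can be obtained by comparing a vector against the threshold. This is a minimal sketch with a made-up toy vector; the names `w_x`, `F_x_default` and `F_x_strict` are hypothetical and not part of the package.

```r
# Toy word vector; the values are invented for illustration only
w_x <- c(0.4, -0.1, 0.0, 0.7)

# Active dimensions are those whose value strictly exceeds the threshold t
F_x_default <- which(w_x > 0)    # t = 0
F_x_strict  <- which(w_x > 0.5)  # t = 0.5, a stricter threshold

print(F_x_default)
print(F_x_strict)
```

Raising `t` shrinks the active set, so fewer dimensions enter the sums in the formulas that follow.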

The options for `method` are defined as follows (see Kotlerman et al., 2010):

`method = "weedsprec"`

$$\mathrm{weedsprec}(u,v) = \frac{\sum_{f \in F_u \cap F_v} w_u(f)}{\sum_{f \in F_u} w_u(f)}$$

`method = "cosweeds"`

$$\mathrm{cosweeds}(u,v) = \sqrt{\mathrm{weedsprec}(u,v) \times \mathrm{cosine}(u,v)}$$

`method = "clarkede"`

$$\mathrm{clarkede}(u,v) = \frac{\sum_{f \in F_u \cap F_v} \min(w_u(f), w_v(f))}{\sum_{f \in F_u} w_u(f)}$$

`method = "invcl"`

$$\mathrm{invcl}(u,v) = \sqrt{\mathrm{clarkede}(u,v) \times (1 - \mathrm{clarkede}(u,v))}$$

`method = "kintsch"`

Unlike the other methods, this one is not derived from the logic of hypernymy, but from asymmetrical similarities between words that arise from different amounts of knowledge about them. Here, asymmetric similarities between two words are computed by taking into account the vector length (i.e. the amount of information about those words). This is done by projecting one vector onto the other and normalizing the resulting vector by dividing its length by the length of the longer of the two vectors (for details, see Kintsch, 2014, in the References).
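To make the definitions above concrete, the following base R sketch spells out `weedsprec`, `clarkede` and `cosweeds` as given in the formulas, plus one possible reading of the projection idea behind `"kintsch"`. It is an illustration only, not the package's implementation: the helper names (`active`, `cosine`, `vec_len`, `kintsch_sketch`) and the toy vectors are made up, and since the description does not say which vector is projected onto which, projecting the first argument onto the second is an assumption.

```r
# Indices of active dimensions: values strictly above the threshold t
active <- function(w, t = 0) which(w > t)

# Share of u's active weight that lies on dimensions also active in v
weedsprec <- function(u, v, t = 0) {
  shared <- intersect(active(u, t), active(v, t))
  sum(u[shared]) / sum(u[active(u, t)])
}

# As weedsprec, but each shared dimension contributes min(w_u(f), w_v(f))
clarkede <- function(u, v, t = 0) {
  shared <- intersect(active(u, t), active(v, t))
  sum(pmin(u[shared], v[shared])) / sum(u[active(u, t)])
}

# Ordinary (symmetric) cosine similarity
cosine <- function(u, v) sum(u * v) / sqrt(sum(u^2) * sum(v^2))

# Square root of the product of weedsprec and cosine
cosweeds <- function(u, v, t = 0) sqrt(weedsprec(u, v, t) * cosine(u, v))

# Euclidean length of a vector
vec_len <- function(w) sqrt(sum(w^2))

# ASSUMPTION: x is projected onto y; the length of that projection is
# divided by the length of the longer of the two vectors
kintsch_sketch <- function(x, y) {
  proj <- (sum(x * y) / sum(y^2)) * y
  vec_len(proj) / max(vec_len(x), vec_len(y))
}

# Toy vectors, invented for illustration only
u <- c(0.8, 0.5, 0.0, 0.3)  # active on dimensions 1, 2, 4
v <- c(0.9, 0.4, 0.6, 0.2)  # active on all four dimensions

print(c(weedsprec(u, v), weedsprec(v, u)))
print(c(clarkede(u, v), clarkede(v, u)))
print(c(cosweeds(u, v), cosweeds(v, u)))
print(c(kintsch_sketch(c(3, 4), c(1, 0)), kintsch_sketch(c(1, 0), c(3, 4))))
```

In this toy case every active dimension of `u` is also active in `v`, but not the other way round, so the scores differ depending on the order of the arguments. That order dependence is what makes these measures directional.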

A numeric giving the asymmetric similarity between `x` and `y`

Fritz Günther

Kintsch, W. (2014). Similarity as a Function of Semantic Distance and Amount of Knowledge. *Psychological Review, 121,* 559-561.

Kotlerman, L., Dagan, I., Szpektor, I., & Zhitomirsky-Geffet, M. (2010). Directional distributional similarity for lexical inference. *Natural Language Engineering, 16,* 359-389.

Lenci, A., & Benotto, G. (2012). Identifying hypernyms in distributional semantic spaces. In *Proceedings of \*SEM* (pp. 75-79), Montreal, Canada.

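```
library(LSAfun)   # provides asym() and the wonderland example space
data(wonderland)

# Directional similarity of "alice" to "girl" and to "rabbit"
asym("alice", "girl", method = "cosweeds", t = 0, tvectors = wonderland)
asym("alice", "rabbit", method = "cosweeds", tvectors = wonderland)
```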
