compose: Two-Word Composition
In LSAfun: Applied Latent Semantic Analysis (LSA) Functions

Description

Computes the vector of a complex expression p consisting of two single words u and v, following the methods examined in Mitchell & Lapata (2008) (see Details).

Usage

## Default 
compose(x,y,method="Add", a=1,b=1,c=1,m,k,lambda=2,
      tvectors=tvectors, norm="none")

Arguments

`x`	a single word (character vector with `length(x) = 1`)
`y`	a single word (character vector with `length(y) = 1`)
`a`, `b`, `c`	weighting parameters, see Details
`m`	number of nearest words to the Predicate that are initially activated (see `Predication`)
`k`	size of the `k`-neighborhood; `k` ≤ `m` (see `Predication`)
`lambda`	dilation parameter for `method = "Dilation"`
`method`	the composition method to be used (see Details)
`norm`	whether to `normalize` the single word vectors before applying a composition function. Setting `norm = "none"` performs no normalization; setting `norm = "all"` normalizes every involved word vector. Setting `norm = "block"` is only valid for the `Predication` method.
`tvectors`	the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)
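As a rough illustration of the `tvectors` layout described above, the following base R sketch builds a tiny semantic space with one row per word. The matrix `toy_space`, its numbers, and the helper `unit_length` are made up for this example and are not part of LSAfun; the sketch also assumes that normalizing a word vector means scaling it to unit Euclidean length.

```r
# Toy semantic space: 3 words in 4 dimensions (made-up numbers).
# Every row is a word vector; the row names are the words.
toy_space <- matrix(
  c(0.2, 0.5, 0.1, 0.7,
    0.9, 0.1, 0.4, 0.3,
    0.3, 0.3, 0.8, 0.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Look up a word vector by its row name.
u <- toy_space["mad", ]

# Assumption: normalization scales a vector to unit Euclidean length.
# unit_length is a hypothetical helper, not an LSAfun function.
unit_length <- function(vec) vec / sqrt(sum(vec^2))
u_norm <- unit_length(u)
```

Any numeric matrix with words as row names has this shape, which is also the shape of the `wonderland` space used in the Examples.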

Details

Let p be the vector with entries p_i for the two-word phrase consisting of u with entries u_i and v with entries v_i. The composition methods described by Mitchell & Lapata (2008, 2010) are as follows:

Additive Model (method = "Add")

p_i = u_i + v_i

Weighted Additive Model (method = "WeightAdd")

p_i = a*u_i + b*v_i

Multiplicative Model (method = "Multiply")

p_i = u_i * v_i

Combined Model (method = "Combined")

p_i = a*u_i + b*v_i + c*u_i*v_i
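The four element-wise models above can be reproduced on plain numeric vectors. This is a minimal base R sketch of the formulas only, not the package's implementation; the vectors and weights are arbitrary, and the name `c_weight` stands in for the parameter `c` so that it does not shadow R's `c()` function.

```r
# Two toy word vectors (arbitrary numbers for illustration).
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Weighting parameters; c_weight plays the role of c in the formulas.
a <- 1
b <- 2
c_weight <- 3

p_add      <- u + v                                # Additive
p_weighted <- a * u + b * v                        # Weighted Additive
p_multiply <- u * v                                # Multiplicative
p_combined <- a * u + b * v + c_weight * u * v     # Combined
```

With these values, `p_add` is `c(5, 7, 9)` and `p_combined` is `c(21, 42, 69)`.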

Predication (method = "Predication") (see Predication)

If method="Predication" is used, x will be taken as Predicate and y will be taken as Argument of the phrase (see Examples).

Circular Convolution (method = "CConv")

p_i = \sum_{j} u_j * v_{i-j},

where the subscripts of v are interpreted modulo n, with n being the number of dimensions of the word vectors u and v.
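The circular convolution formula can be sketched directly in base R. The function `circular_convolution` below is a hypothetical helper written for this illustration, not the package's code, and it assumes that the subscripts in the formula are counted from 0. As a cross-check, the same result is computed with base R's `fft()`, since circular convolution corresponds to element-wise multiplication in the Fourier domain.

```r
# Direct implementation of p_i = sum_j u_j * v_{(i - j) mod n},
# assuming 0-based subscripts in the formula (R vectors are 1-based,
# hence the "+ 1" when indexing).
circular_convolution <- function(u, v) {
  stopifnot(length(u) == length(v))
  n <- length(u)
  p <- numeric(n)
  for (i in 0:(n - 1)) {
    for (j in 0:(n - 1)) {
      p[i + 1] <- p[i + 1] + u[j + 1] * v[((i - j) %% n) + 1]
    }
  }
  p
}

u <- c(1, 2, 3)
v <- c(4, 5, 6)

p_direct <- circular_convolution(u, v)

# Cross-check via FFT; R's inverse fft is unnormalized, so divide by n.
p_fft <- Re(fft(fft(u) * fft(v), inverse = TRUE)) / length(u)
```

For these toy vectors both approaches give `c(31, 31, 28)`.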

Dilation (method = "Dilation")

p = (u*u)*v + (\lambda - 1)*(u*v)*u,

with (u*u) being the dot product of u with itself and (u*v) being the dot product of u and v.
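The dilation formula can be written out in a few lines of base R. The helper `dilate` is a hypothetical name used only for this sketch and simply transcribes the formula above; it is not taken from the package source.

```r
# p = (u.u) * v + (lambda - 1) * (u.v) * u, with "." the dot product.
dilate <- function(u, v, lambda = 2) {
  sum(u * u) * v + (lambda - 1) * sum(u * v) * u
}

u <- c(1, 0)
v <- c(1, 1)

p_dilated <- dilate(u, v, lambda = 3)

# With lambda = 1 the second term vanishes and v is only rescaled by (u.u).
p_plain <- dilate(u, v, lambda = 1)
```

Here `p_dilated` is `c(3, 1)`: the component of v along u is stretched, while the remaining component is left as it is.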

The Add, Multiply, and CConv methods are symmetrical composition methods, i.e. compose(x="word1",y="word2") will give the same results as compose(x="word2",y="word1").

On the other hand, WeightAdd, Combined, Predication and Dilation are asymmetrical, i.e. compose(x="word1",y="word2") will generally give different results than compose(x="word2",y="word1"). For WeightAdd and Combined, this holds only if a and b differ, since their formulas are symmetrical in u and v for a = b.
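The symmetry properties can be checked on toy vectors by swapping the arguments. The sketch below uses only base R and the formulas from the Details section; `weighted_add` is a hypothetical helper, not a package function. Reading the weighted additive formula, swapping u and v changes the result only when the weights a and b differ.

```r
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Hypothetical helper transcribing the weighted additive formula.
weighted_add <- function(u, v, a = 1, b = 1) a * u + b * v

# Add and Multiply do not depend on the order of the arguments.
add_is_symmetric      <- isTRUE(all.equal(u + v, v + u))
multiply_is_symmetric <- isTRUE(all.equal(u * v, v * u))

# Weighted addition: order matters for a != b, but not for a == b.
same_with_equal_weights <- isTRUE(all.equal(
  weighted_add(u, v, a = 2, b = 2), weighted_add(v, u, a = 2, b = 2)
))
same_with_unequal_weights <- isTRUE(all.equal(
  weighted_add(u, v, a = 1, b = 2), weighted_add(v, u, a = 1, b = 2)
))
```

Only `same_with_unequal_weights` comes out as `FALSE`.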

Value

The phrase vector as a numeric vector

Author(s)

Fritz Guenther

References

Kintsch, W. (2001). Predication. Cognitive science, 25, 173-202.

Mitchell, J., & Lapata, M. (2008). Vector-based Models of Semantic Composition. In Proceedings of ACL-08: HLT (pp. 236-244). Columbus, Ohio.

Mitchell, J., & Lapata, M. (2010). Composition in Distributional Models of Semantics. Cognitive Science, 34, 1388-1429.

Examples

library(LSAfun)
data(wonderland)

compose(x="mad",y="hatter",method="Add",tvectors=wonderland)

compose(x="mad",y="hatter",method="Combined",a=1,b=2,c=3,
tvectors=wonderland)

compose(x="mad",y="hatter",method="Predication",m=20,k=3,
tvectors=wonderland)

compose(x="mad",y="hatter",method="Dilation",lambda=3,
tvectors=wonderland)

LSAfun documentation built on April 4, 2025, 1:44 a.m.