compose: Two-Word Composition In codymarquart/LSAfun2: Applied Latent Semantic Analysis (LSA) Functions (no plotting/rgl)

Description

Computes the vector of a complex expression p consisting of two single words u and v, following the methods examined in Mitchell & Lapata (2008) (see Details).

Usage

```
## Default
compose(x, y, method="Add", a=1, b=1, c=1, m, k, lambda=2,
        tvectors=tvectors, breakdown=TRUE, norm="none")
```

Arguments

| Argument | Description |
|---|---|
| `x` | a single word (character vector with `length(x) = 1`) |
| `y` | a single word (character vector with `length(y) = 1`) |
| `a,b,c` | weighting parameters, see Details |
| `m` | number of nearest words to the Predicate that are initially activated (see `Predication`) |
| `k` | size of the `k`-neighborhood; `k` ≤ `m` (see `Predication`) |
| `lambda` | dilation parameter for `method = "Dilation"` |
| `method` | the composition method to be used (see Details) |
| `norm` | whether to `normalize` the single word vectors before applying a composition function. Setting `norm = "none"` performs no normalization; setting `norm = "all"` normalizes every word vector involved. Setting `norm = "block"` is only valid for the `Predication` method |
| `tvectors` | the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector) |
| `breakdown` | if `TRUE`, the function `breakdown` is applied to the input |
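To make the expected shape of `tvectors` concrete, here is a minimal base-R sketch of a semantic space with one word vector per row. The matrix `toy_space`, its values, and the helper `unit` are hypothetical and only for illustration; in particular, scaling to unit length is an assumption about what normalizing a word vector amounts to, not something taken from the package source.

```r
# Hypothetical toy semantic space: 3 words, 4 dimensions (values are made up)
toy_space <- matrix(
  c(1, 0, 2, 1,
    0, 3, 1, 2,
    2, 1, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Each word vector is one row, looked up by its row name
u <- toy_space["mad", ]
v <- toy_space["hatter", ]

# Scaling to unit Euclidean length (assumed form of normalization)
unit <- function(vec) vec / sqrt(sum(vec^2))
u_n <- unit(u)
v_n <- unit(v)
```

A matrix of this shape, with words as row names, is the kind of object the `tvectors` argument describes.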

Details

Let p be the vector with entries p_i for the two-word phrase consisting of u with entries u_i and v with entries v_i. The different composition methods as described by Mitchell & Lapata (2008, 2010) are as follows:

• Additive Model (`method = "Add"`)

p_i = u_i + v_i

• Weighted Additive Model (`method = "WeightAdd"`)

p_i = a*u_i + b*v_i

• Multiplicative Model (`method = "Multiply"`)

p_i = u_i * v_i

• Combined Model (`method = "Combined"`)

p_i = a*u_i + b*v_i + c*u_i*v_i

• Predication (`method = "Predication"`)
(see `Predication`)

If `method = "Predication"` is used, `x` is taken as the Predicate and `y` as the Argument of the phrase (see Examples).

• Circular Convolution (`method = "CConv"`)

p_i = ∑_j u_j * v_{i-j},

where the subscripts of v are interpreted modulo n, with n being the length of the word vectors u and v

• Dilation (`method = "Dilation"`)

p = (u·u)*v + (λ - 1)*(u·v)*u,

with (u·u) being the dot product of u with itself and (u·v) being the dot product of u and v.
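The formulas above can be sketched in base R on two small made-up vectors. This is an illustration of the arithmetic only, not the package's implementation. The names `c_w` and `cconv` are hypothetical, and the circular convolution uses a zero-based reading of the modulo indexing, which is an assumption about the convention.

```r
# Made-up word vectors and parameters, for illustration only
u <- c(1, 2, 3)
v <- c(4, 5, 6)
a <- 1; b <- 2; c_w <- 3; lambda <- 2

p_add      <- u + v                        # Additive
p_weighted <- a * u + b * v                # Weighted Additive
p_multiply <- u * v                        # Multiplicative (elementwise)
p_combined <- a * u + b * v + c_w * u * v  # Combined

# Circular convolution: p_i = sum_j u_j * v_{(i - j) mod n}.
# Indices are shifted by one because R vectors are 1-based.
cconv <- function(u, v) {
  n <- length(u)
  j <- seq_len(n)
  sapply(seq_len(n), function(i) sum(u[j] * v[((i - j) %% n) + 1]))
}
p_cconv <- cconv(u, v)

# Dilation: p = (u.u) v + (lambda - 1) (u.v) u, using dot products
p_dilation <- sum(u * u) * v + (lambda - 1) * sum(u * v) * u
```

Note that the entries of a circular convolution sum to `sum(u) * sum(v)`, which gives a quick sanity check on the indexing.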

The `Add`, `Multiply`, and `CConv` methods are symmetrical composition methods, i.e. `compose(x="word1",y="word2")` will give the same results as `compose(x="word2",y="word1")`.
On the other hand, `WeightAdd`, `Combined`, `Predication` and `Dilation` are asymmetrical, i.e. `compose(x="word1",y="word2")` will in general give different results than `compose(x="word2",y="word1")`.
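The symmetry distinction can be checked directly on the formulas. The following base-R sketch uses hypothetical helper functions (`add`, `weighted`, `dilate`) that mirror the formulas in this section rather than calling the package. It also shows that the weighted additive model only becomes order-dependent when `a` and `b` differ.

```r
# Made-up vectors; helpers mirror the formulas, not the package code
u <- c(1, 2, 3)
v <- c(4, 5, 6)

add      <- function(x, y) x + y
weighted <- function(x, y, a = 1, b = 2) a * x + b * y
dilate   <- function(x, y, lambda = 2) {
  sum(x * x) * y + (lambda - 1) * sum(x * y) * x
}

# TRUE if swapping the arguments leaves the result unchanged
sym_add      <- isTRUE(all.equal(add(u, v), add(v, u)))
sym_weighted <- isTRUE(all.equal(weighted(u, v), weighted(v, u)))
sym_dilate   <- isTRUE(all.equal(dilate(u, v), dilate(v, u)))

# With equal weights the weighted additive model reduces to a symmetric one
sym_equal_w  <- isTRUE(all.equal(weighted(u, v, 1, 1), weighted(v, u, 1, 1)))
```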

Value

The phrase vector as a numeric vector

Author(s)

Fritz Günther

References

Kintsch, W. (2001). Predication. Cognitive science, 25, 173-202.

Mitchell, J., & Lapata, M. (2008). Vector-based Models of Semantic Composition. In Proceedings of ACL-08: HLT (pp. 236-244). Columbus, Ohio.

Mitchell, J., & Lapata, M. (2010). Composition in Distributional Models of Semantics. Cognitive Science, 34, 1388-1429.

See Also

`Predication`

Examples

```
data(wonderland)

compose(x="mad",y="hatter",method="Add",tvectors=wonderland)

compose(x="mad",y="hatter",method="Combined",a=1,b=2,c=3,
        tvectors=wonderland)

# x = "mad" is the Predicate, y = "hatter" is the Argument
compose(x="mad",y="hatter",method="Predication",m=20,k=3,
        tvectors=wonderland)

compose(x="mad",y="hatter",method="Dilation",lambda=3,
        tvectors=wonderland)
```