compose: Two-Word Composition

compose R Documentation

Two-Word Composition

Description

Computes the vector of a complex expression p consisting of two single words u and v, following the methods examined in Mitchell & Lapata (2008) (see Details).

Usage

## Default 
compose(x,y,method="Add", a=1,b=1,c=1,m,k,lambda=2,
      tvectors=tvectors, norm="none")

Arguments

x

a single word (character vector with length(x) = 1)

y

a single word (character vector with length(y) = 1)

a,b,c

weighting parameters, see Details

m

number of nearest words to the Predicate that are initially activated (see Predication)

k

size of the k-neighborhood; k ≤ m (see Predication)

lambda

dilation parameter for method = "Dilation"

method

the composition method to be used (see Details)

norm

whether to normalize the single word vectors before applying a composition function. Setting norm = "none" performs no normalization; setting norm = "all" normalizes every word vector involved. Setting norm = "block" is only valid for the Predication method.

tvectors

the semantic space in which the computation is to be done (a numeric matrix where every row is a word vector)
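As a rough illustration of the tvectors and norm arguments, the following sketch builds a toy semantic space and normalizes one word vector. The matrix values, the name toy_space, and the helper normalize are made up for illustration; in particular, scaling to Euclidean length 1 is an assumption about what normalization means here, not something this page states.

```r
# Toy semantic space: 3 words in 4 dimensions (values are made up).
# Every row is a word vector; the row names are the words.
toy_space <- matrix(
  c(1, 2, 0, 2,
    0, 3, 4, 0,
    2, 0, 1, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mad", "hatter", "tea"), NULL)
)

# Look up a single word vector by its row name.
u <- toy_space["mad", ]

# Assumption: "normalizing" means scaling to Euclidean length 1.
normalize <- function(vec) vec / sqrt(sum(vec^2))

u_norm <- normalize(u)
u_length <- sqrt(sum(u_norm^2))  # Euclidean length after normalization
```

A real semantic space such as wonderland has the same layout, only with many more rows and columns.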

Details

Let p be the vector with entries p_i for the two-word phrase consisting of u with entries u_i and v with entries v_i. The different composition methods as described by Mitchell & Lapata (2008, 2010) are as follows:

  • Additive Model (method = "Add")

    p_i = u_i + v_i

  • Weighted Additive Model (method = "WeightAdd")

    p_i = a*u_i + b*v_i

  • Multiplicative Model (method = "Multiply")

    p_i = u_i * v_i

  • Combined Model (method = "Combined")

    p_i = a*u_i + b*v_i + c*u_i*v_i

  • Predication (method = "Predication") (see Predication)

    If method = "Predication" is used, x is taken as the Predicate and y as the Argument of the phrase (see Examples).

  • Circular Convolution (method = "CConv")

    p_i = \sum_{j} u_j * v_{i-j},

    where the subscripts of v are interpreted modulo n, with n = length(u) (= length(v))

  • Dilation (method = "Dilation")

    p = (u*u)*v + (\lambda - 1)*(u*v)*u,

    with (u*u) being the dot product of u and u (and (u*v) being the dot product of u and v).
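The formulas above can be tried out on small vectors in base R. This is a minimal sketch, not the package's implementation: the vectors, the weight name c_w (used instead of c to avoid masking base R's c()), and the helpers circ_conv and dilate are hypothetical, and the circular convolution assumes zero-based subscripts in the formula.

```r
# Toy vectors standing in for the word vectors u and v (made-up values).
u <- c(1, 2, 3)
v <- c(4, 5, 6)
a <- 1; b <- 2; c_w <- 3; lambda <- 2

p_add      <- u + v                          # Additive
p_weighted <- a * u + b * v                  # Weighted Additive
p_multiply <- u * v                          # Multiplicative
p_combined <- a * u + b * v + c_w * u * v    # Combined

# Circular convolution: p_i = sum_j u_j * v_{(i - j) mod n},
# with i and j running from 0 to n - 1 (assumed indexing convention).
circ_conv <- function(u, v) {
  n <- length(u)
  stopifnot(length(v) == n)
  idx <- 0:(n - 1)
  sapply(idx, function(i) sum(u[idx + 1] * v[((i - idx) %% n) + 1]))
}
p_cconv <- circ_conv(u, v)

# Dilation: p = (u . u) v + (lambda - 1) (u . v) u
dilate <- function(u, v, lambda) {
  sum(u * u) * v + (lambda - 1) * sum(u * v) * u
}
p_dilation <- dilate(u, v, lambda)
```

With a one-based reading of the subscripts, the circular convolution would come out as a cyclic shift of the same values.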

The Add, Multiply, and CConv methods are symmetrical composition methods, i.e. compose(x="word1",y="word2") will give the same result as compose(x="word2",y="word1").

On the other hand, WeightAdd, Combined, Predication and Dilation are in general asymmetrical, i.e. compose(x="word1",y="word2") will give a different result than compose(x="word2",y="word1"). By their formulas, WeightAdd and Combined become symmetrical in the special case a = b.
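The difference between symmetrical and asymmetrical methods can be checked directly from the formulas. The sketch below uses made-up vectors and a hypothetical helper weight_add that mirrors the Weighted Additive formula; it does not call compose itself.

```r
# Toy vectors standing in for two word vectors (made-up values).
u <- c(1, 2, 3)
v <- c(4, 5, 6)

# Mirrors p_i = a*u_i + b*v_i from the Weighted Additive Model.
weight_add <- function(u, v, a, b) a * u + b * v

# Element-wise multiplication commutes, so the order does not matter.
sym_multiply <- identical(u * v, v * u)

# With unequal weights, swapping u and v changes the result.
sym_weighted_unequal <- isTRUE(all.equal(
  weight_add(u, v, a = 1, b = 2),
  weight_add(v, u, a = 1, b = 2)
))

# With equal weights, the weighted sum is symmetrical again.
sym_weighted_equal <- isTRUE(all.equal(
  weight_add(u, v, a = 2, b = 2),
  weight_add(v, u, a = 2, b = 2)
))
```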

Value

The phrase vector as a numeric vector

Author(s)

Fritz Guenther

References

Kintsch, W. (2001). Predication. Cognitive Science, 25, 173-202.

Mitchell, J., & Lapata, M. (2008). Vector-based Models of Semantic Composition. In Proceedings of ACL-08: HLT (pp. 236-244). Columbus, Ohio.

Mitchell, J., & Lapata, M. (2010). Composition in Distributional Models of Semantics. Cognitive Science, 34, 1388-1429.

See Also

Predication

Examples

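library(LSAfun)
data(wonderland)

# Additive model: element-wise sum of the two word vectors
compose(x="mad",y="hatter",method="Add",tvectors=wonderland)

# Combined model with weighting parameters a, b and c
compose(x="mad",y="hatter",method="Combined",a=1,b=2,c=3,
tvectors=wonderland)

# Predication: "mad" is the Predicate, "hatter" the Argument
compose(x="mad",y="hatter",method="Predication",m=20,k=3,
tvectors=wonderland)

# Dilation with dilation parameter lambda = 3
compose(x="mad",y="hatter",method="Dilation",lambda=3,
tvectors=wonderland)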

LSAfun documentation built on Nov. 18, 2023, 1:10 a.m.