wam-function: Calculate word association measure according to various...

Description

Word association measures indicate the tendency of two words to co-occur more (or less) often than chance would predict.

t-score = ( prob(X=chemistry, Y=physics) - prob(X=chemistry) prob(Y=physics) ) / sqrt( (1/T) prob(X=chemistry, Y=physics) )
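
The t-score formula above can be sketched directly from the four counts described under Arguments. This is only an illustration, not the package's implementation: the helper name t_score_sketch is hypothetical, the probabilities are estimated as k/N, n/N and K/N, and T (which the formula does not define) is assumed to be the corpus size N. The counts are made up.

```r
# Hypothetical helper, not part of the package: t-score from raw counts.
# Assumptions: prob(X, Y) = k / N, prob(X) = n / N, prob(Y) = K / N, T = N.
t_score_sketch <- function(N, n, K, k) {
  p_xy <- k / N   # estimated joint probability of the two words
  p_x  <- n / N   # estimated probability of word 1
  p_y  <- K / N   # estimated probability of word 2
  (p_xy - p_x * p_y) / sqrt(p_xy / N)
}

# Two made-up pairs sharing the same word 1; the arguments are vectorised.
t_val <- t_score_sketch(N = 1e5, n = 150, K = c(200, 5000), k = c(30, 8))
print(t_val)
```

With these counts the first pair co-occurs far more often than independence would predict (30 observed against 0.3 expected), so its score is the larger of the two.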

Usage

wam.jaccard(N, n, K, k)

wam.MI(N, n, K, k)

wam.frequency(N, n, K, k)

wam.loglikelihood(N, n, K, k, p.value = FALSE, two.sided = TRUE)

wam.collostruction(N, n, K, k)

wam.specificities(N, n, K, k, method = "log")

wam.z(N, n, K, k, yates.correction = FALSE)

wam.t(N, n, K, k)

wam.chisq(
  N,
  n,
  K,
  k,
  yates.correction = TRUE,
  p.value = TRUE,
  two.sided = FALSE
)

wam.ar(N, n, K, k)

wam.g(N, n, K, k)

Arguments

N

numeric vector, the total number of occurrences in the corpus

n

numeric vector, the frequency of word 1

K

numeric vector, the frequency of word 2

k

numeric vector, the frequency of the co-occurrence of word 1 and word 2

p.value

length-1 logical vector (TRUE or FALSE)

two.sided

length-1 logical vector (TRUE or FALSE)

method

character: one of "log", "logscale", "base", "gap"

yates.correction

length-1 logical vector (TRUE or FALSE)
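
To make the roles of N, n, K and k concrete, the sketch below rebuilds the 2 x 2 contingency table implied by one set of counts. The helper name contingency_sketch and the counts are hypothetical; the package itself may organise the data differently.

```r
# Hypothetical helper, not part of the package: the 2 x 2 table implied by
# one (N, n, K, k) quadruple, as the arguments are described above.
contingency_sketch <- function(N, n, K, k) {
  # The counts must be mutually consistent for the table to make sense.
  stopifnot(k <= n, k <= K, n + K - k <= N)
  matrix(
    c(k,     n - k,
      K - k, N - n - K + k),
    nrow = 2, byrow = TRUE,
    dimnames = list(word1 = c("present", "absent"),
                    word2 = c("present", "absent"))
  )
}

tab <- contingency_sketch(N = 100000, n = 150, K = 200, k = 30)
print(tab)
```

The row and column totals of the table recover n and K, and the four cells sum to N.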

Value

a numeric vector of association strength

Author(s)

Bernard Desgraupes

Sylvain Loiseau

References

Dunning, T. 1993. « Accurate methods for the statistics of surprise and coincidence ». In: Computational linguistics. 19/1. MIT Press, pp. 61-74

Stefanowitsch A. & Gries St. Th. 2003 "Collostructions: Investigating the interaction of words and constructions", International Journal of Corpus Linguistics, 8/2, 209-234. http://www.anglistik.uni-muenchen.de/personen/professoren/schmid/schmid_publ/collostructional-analysis.pdf

Lafon, P. (1980). « Sur la variabilité de la fréquence des formes dans un corpus ». In: Mots. 1. pp. 127–165. http://www.persee.fr/web/revues/home/prescript/article/mots_0243-6450_1980_num_1_1_1008

Church, K., Gale, W., Hanks, P. & Hindle, D., 1991. « Using Statistics in Lexical Analysis ». In: Zernik, U. (ed.), Lexical Acquisition. Hillsdale, Lawrence Erlbaum Ass., pp. 115–164.

Kenneth Ward Church, Patrick Hanks (1990) "Word Association Norms, Mutual Information, and Lexicography" Computational Linguistics, 16/1, pages 22-29. http://www.aclweb.org/anthology/P89-1010.pdf

See Also

See wam for another, higher-level interface.

Examples
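
A minimal sketch of how the functions listed under Usage might be called. It assumes the wam package is installed and exports wam.t and wam.jaccard with the signatures shown above; the counts are made up for illustration, and no output is shown because it depends on the package's implementation.

```r
# Made-up counts for two word pairs in a corpus of 100,000 occurrences.
N <- c(100000, 100000)  # total number of occurrences in the corpus
n <- c(150, 150)        # frequency of word 1
K <- c(200, 5000)       # frequency of word 2
k <- c(30, 8)           # co-occurrence frequency of word 1 and word 2

# Only call the package when it is available (an assumption of this sketch).
if (requireNamespace("wam", quietly = TRUE)) {
  print(wam::wam.t(N, n, K, k))
  print(wam::wam.jaccard(N, n, K, k))
}
```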


sylvainloiseau/wam documentation built on Feb. 12, 2020, 12:30 a.m.