utilities: Utility functions for converting between "contingency" and...
In sylvainloiseau/wam: Word Association Measures


The four arguments used in word attraction measures can be given in two forms in the literature:

- either as the four values (C11, C12, C21, C22) of a contingency table:

	word1	¬word1
word2	C11	C12
¬word2	C21	C22

where:

  - C11 is the number of occurrences in context A and in context B (e.g. lexeme = 'x' and construction = 'be x')
  - C12 is the number of occurrences in context A and not in context B (e.g. lexeme = 'x' and construction != 'be x')
  - C21 is the number of occurrences not in context A and in context B (e.g. lexeme != 'x' and construction = 'be x')
  - C22 is the number of occurrences not in context A and not in context B (e.g. lexeme != 'x' and construction != 'be x')

- or using marginal totals:

	word1	¬word1	Total
word2	k		n
¬word2
Total	K		N

where:

  - N is the total number of occurrences in the corpus
  - n is the number of occurrences in the subcorpus
  - K is the total frequency of the form in the corpus
  - k is the frequency of the form in the subcorpus

These utility functions help convert between these two forms.
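
The arithmetic behind such a conversion can be sketched in base R. This is only an illustration, not the package's implementation: the helper names `to_marginal` and `to_contingency` are hypothetical, and the sketch assumes the margins are read off the two tables exactly as printed above (k in the same cell as C11, n as the total of the word2 row, K as the total of the word1 column). The package's own functions may assign n and K the other way round.

```r
# Hypothetical helpers illustrating the arithmetic; not part of wam.
# Assumption: k = C11, n = total of the word2 row, K = total of the word1 column.
to_marginal <- function(C11, C12, C21, C22) {
  data.frame(
    N = C11 + C12 + C21 + C22,  # grand total
    n = C11 + C12,              # row total (word2)
    K = C11 + C21,              # column total (word1)
    k = C11                     # joint frequency
  )
}

# Inverse mapping: recover the four cells from the four totals.
to_contingency <- function(N, n, K, k) {
  data.frame(
    C11 = k,
    C12 = n - k,
    C21 = K - k,
    C22 = N - n - K + k
  )
}

# Made-up counts, used only to check that the round trip is lossless.
m <- to_marginal(C11 = 10, C12 = 20, C21 = 30, C22 = 940)
back <- to_contingency(m$N, m$n, m$K, m$k)
```

Under these assumptions the two mappings are exact inverses, so no information is lost in either direction.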

marginal(contingency)

contingency(marginal)

cont2vec(cont)

`contingency`	a data frame containing columns named C11, C12, C21 and C22
`marginal`	a data frame with columns named N, n, K, k.
`cont`	a 2 * 2 contingency table

a data frame with columns named N, n, K, k.

a data frame with four columns named after the four arguments.

a data frame with four columns named C11, C12, C21, C22
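
As a rough illustration of what flattening a 2 * 2 table into such a data frame involves, the following base R sketch builds a one-row data frame from a matrix. The counts are made up, the object names `tab` and `flat` are hypothetical, and the cell order (rows first, matching the contingency table shown above) is an assumption; `cont2vec` itself may proceed differently.

```r
# Illustrative 2 x 2 table with made-up counts, laid out as in the table above:
# rows are word2 / not word2, columns are word1 / not word1.
tab <- matrix(
  c(10, 20,
    30, 940),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("word2", "not_word2"), c("word1", "not_word1"))
)

# Assumed cell order: C11 = [1, 1], C12 = [1, 2], C21 = [2, 1], C22 = [2, 2].
flat <- data.frame(
  C11 = tab[1, 1],
  C12 = tab[1, 2],
  C21 = tab[2, 1],
  C22 = tab[2, 2]
)
```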

Sylvain Loiseau

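# From marginal totals to contingency values
data(robespierre)
# Keep the rows for the form "peuple" in the part "D4"
peuple_D4 <- robespierre[robespierre$types == "peuple" & robespierre$parts == "D4", ]
peuple_D4
res <- contingency(peuple_D4)

# From a 2 * 2 contingency table to marginal totals
data(happen)
happen
happen.vec <- cont2vec(happen)
happen.vec
happen.mar <- marginal(happen.vec)
happen.mar
# Pass the columns of happen.mar as named arguments to wam.collostruction
res <- do.call(wam.collostruction, as.list(happen.mar))
res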