SurfaceColloc | R Documentation |
This data set demonstrates how co-occurrence and marginal frequencies can be provided for collocation analysis with am.score.
It contains surface co-occurrence counts for 7 English nouns as nodes and 7 selected collocates. The counts are based on a collocational span of two tokens to the left and right of the node (L2/R2) in the WP500 corpus.
The marginal frequencies of the nodes are the overall corpus frequencies of the nouns, so the expected co-occurrence frequency must be scaled by the total span size of 4 tokens (two to the left plus two to the right).
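The span adjustment can be illustrated with a small calculation. This is only a sketch: it assumes the usual independence estimate f1 * f2 / N is multiplied by the span size, and all numbers are invented toy values, not counts from SurfaceColloc or WP500.

```r
# Toy values for illustration only (NOT taken from SurfaceColloc or WP500)
f1   <- 1000   # hypothetical corpus frequency of the node noun
f2   <- 5000   # hypothetical corpus frequency of the collocate
N    <- 1e6    # hypothetical corpus size in tokens
span <- 4      # L2/R2 span: two tokens on each side of the node

# Expected frequency if every node occurrence offered a single slot
E.unadjusted <- f1 * f2 / N

# Each node occurrence opens `span` slots, so the expectation is scaled up
E.adjusted <- span * f1 * f2 / N

print(c(unadjusted = E.unadjusted, adjusted = E.adjusted))
```

Without the scaling, the expected frequency would be underestimated by a factor of 4, and association scores based on it would be inflated.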
SurfaceColloc
A list with the following components:
cooc
: A data frame with 34 rows and the following columns:
  w1
  : node word (noun)
  w2
  : collocate
  f
  : co-occurrence frequency within L2/R2 span
f1
: Labelled integer vector of length 7 specifying the marginal frequencies of the node nouns.
f2
: Labelled integer vector of length 7 specifying the marginal frequencies of the collocates.
N
: Sample size, i.e. the total number of tokens in the WP500 corpus.
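Because f1 and f2 are labelled vectors, the marginal frequencies for each co-occurrence pair can be looked up by word. The sketch below builds a hypothetical list named toy with the same layout as described above; the words and counts are invented for illustration and do not come from the data set.

```r
# Hypothetical list mirroring the documented layout (invented words and counts)
toy <- list(
  cooc = data.frame(
    w1 = c("dog", "dog", "cat"),
    w2 = c("bark", "black", "black"),
    f  = c(12L, 5L, 7L),
    stringsAsFactors = FALSE
  ),
  f1 = c(dog = 300L, cat = 250L),    # marginals of the nodes, labelled by word
  f2 = c(bark = 40L, black = 900L),  # marginals of the collocates, labelled by word
  N  = 100000L
)

# Attach the marginals to each pair by indexing the labelled vectors with the
# word strings; as.character() guards against indexing by factor codes.
pairs <- toy$cooc
pairs$f1 <- unname(toy$f1[as.character(pairs$w1)])
pairs$f2 <- unname(toy$f2[as.character(pairs$w2)])

print(pairs)
```

Indexing by label rather than by position means the rows of cooc do not have to follow the order of f1 and f2.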
Stephanie Evert (https://purl.org/stephanie.evert)
am.score
head(SurfaceColloc$cooc, 10)
SurfaceColloc$f1
SurfaceColloc$f2
SurfaceColloc$N