SurfaceColloc: A small data set of surface collocations from the English...
In corpora: Statistics and Data Sets for Corpus Frequency Data

SurfaceColloc

R Documentation

A small data set of surface collocations from the English Wikipedia

Description

This data set demonstrates how co-occurrence and marginal frequencies can be provided for collocation analysis with am.score. It contains surface co-occurrence counts for 7 English nouns as nodes and 7 selected collocates. The counts are based on a collocational span of two tokens to the left and right of the node (L2/R2) in the WP500 corpus. Marginal frequencies for the nodes are the overall corpus frequencies of the nouns, so the expected co-occurrence frequency needs to be adjusted by the total span size of 4 tokens (2 to the left plus 2 to the right).
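The span-size adjustment can be illustrated with a small worked calculation in base R. This is only a sketch under a common convention for surface co-occurrence, in which the expected frequency is scaled by the span size; am.score may apply the adjustment differently internally. All counts below are hypothetical and are not taken from SurfaceColloc.

```r
# Hypothetical counts for one node-collocate pair (NOT values from SurfaceColloc)
f  <- 120       # observed co-occurrences within the L2/R2 span
f1 <- 5000      # overall corpus frequency of the node
f2 <- 20000     # overall corpus frequency of the collocate
N  <- 1e8       # corpus size in tokens
span_size <- 4  # 2 tokens to the left + 2 tokens to the right

# Expected frequency without any span adjustment
E_unadjusted <- f1 * f2 / N

# Expected frequency scaled by the span size: each node occurrence
# opens a window of span_size tokens in which the collocate may occur
E <- span_size * f1 * f2 / N

# Pointwise mutual information based on the adjusted expectation
MI <- log2(f / E)

cat("E (unadjusted):", E_unadjusted, "\n")
cat("E (adjusted):  ", E, "\n")
cat("MI:            ", MI, "\n")
```

Without the adjustment, the expected frequency would be too small by a factor of 4, and any association score derived from it would be inflated accordingly.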

Usage


SurfaceColloc

Format

A list with the following components:

cooc:

A data frame with 34 rows and the following columns:

w1: node word (noun)
w2: collocate
f: co-occurrence frequency within L2/R2 span

f1:

Labelled integer vector of length 7 specifying the marginal frequencies of the node nouns.

f2:

Labelled integer vector of length 7 specifying the marginal frequencies of the collocates.

N:

Sample size, i.e. the total number of tokens in the WP500 corpus.
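The sketch below builds a toy list with the same four components and shows how labelled marginal vectors can be matched to the rows of the co-occurrence table by name. The words and counts are hypothetical and much smaller than the real data set; only the component and column names follow the format described above.

```r
# Toy object mirroring the documented structure (hypothetical words and counts)
toy <- list(
  cooc = data.frame(
    w1 = c("time", "time", "house"),   # node words
    w2 = c("long", "old", "old"),      # collocates
    f  = c(30L, 5L, 12L),              # co-occurrence frequencies
    stringsAsFactors = FALSE
  ),
  f1 = c(time = 900L, house = 400L),   # labelled marginals of the nodes
  f2 = c(long = 700L, old = 600L),     # labelled marginals of the collocates
  N  = 100000L                         # sample size
)

# Look up the marginal frequencies for each row via the vector labels
pairs <- toy$cooc
pairs$f1 <- unname(toy$f1[pairs$w1])
pairs$f2 <- unname(toy$f2[pairs$w2])

print(pairs)
```

Storing the marginals as labelled vectors avoids repeating the same frequency on every row of the co-occurrence table.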

Author(s)

Stephanie Evert (https://purl.org/stephanie.evert)

Examples

head(SurfaceColloc$cooc, 10)
SurfaceColloc$f1
SurfaceColloc$f2
SurfaceColloc$N
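As a possible follow-up, the components can be passed to am.score to compute association scores. The call below is a sketch, not part of the original examples: it assumes that the installed version of corpora provides am.score with a span.size argument and a built-in "MI" measure, which should be checked against ?am.score. The column name MI and the sorting step are illustrative choices.

```r
# Only run if the corpora package is available
if (requireNamespace("corpora", quietly = TRUE)) {
  library(corpora)

  # Assumed interface: am.score(w1, w2, f, f1, f2, N, measure, span.size = ...)
  scores <- with(SurfaceColloc$cooc,
                 am.score(w1, w2, f,
                          SurfaceColloc$f1, SurfaceColloc$f2, SurfaceColloc$N,
                          measure = "MI", span.size = 4))

  # Attach the scores to the co-occurrence table and list the strongest pairs
  result <- cbind(SurfaceColloc$cooc, MI = scores)
  print(head(result[order(result$MI, decreasing = TRUE), ], 10))
}
```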

corpora documentation built on June 10, 2025, 3:01 a.m.
