This data set compares the frequencies of 60 selected nouns in the written and spoken parts of the British National Corpus, World Edition (BNC). Nouns were chosen from three frequency bands, namely the 20 most frequent nouns in the corpus, 20 nouns with approximately 1000 occurrences, and 20 nouns with approximately 100 occurrences.
See Aston & Burnard (1998) for more information about the BNC, or go to http://www.natcorp.ox.ac.uk/.
A data frame with 61 rows and the following columns:
lemmatised noun (aka stem form)
frequency in the written part of the BNC
frequency in the spoken part of the BNC
In addition to the 60 nouns, the data set contains a row labelled
OTHER, which represents the total frequency of all other nouns
in the BNC. This value is needed to calculate the sample
sizes of the written and spoken parts for frequency comparison tests.
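
The role of the OTHER row can be illustrated with a minimal Python sketch. The counts below are invented for illustration and are not the real BNC figures; the names `rows`, `sample_sizes` and `chi_squared` are hypothetical helpers, not part of any package. The sketch assumes a Pearson chi-squared test (without continuity correction) as the frequency comparison test, which is only one of several possible choices.

```python
import math

# Hypothetical counts for illustration only; these are NOT the real BNC figures.
# Each entry is (noun, frequency in written part, frequency in spoken part).
rows = [
    ("time", 1500, 300),
    ("year", 1200, 100),
    ("OTHER", 17300, 3600),  # total frequency of all remaining nouns
]


def sample_sizes(rows):
    """Sum each column, including OTHER, to get the total number of noun tokens."""
    n_written = sum(written for _, written, _ in rows)
    n_spoken = sum(spoken for _, _, spoken in rows)
    return n_written, n_spoken


def chi_squared(k1, n1, k2, n2):
    """Pearson chi-squared test (1 df, no continuity correction) comparing
    k1 occurrences out of n1 tokens with k2 occurrences out of n2 tokens."""
    a, b = k1, k2            # occurrences of the noun
    c, d = n1 - k1, n2 - k2  # occurrences of all other nouns
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0
    stat = (n1 + n2) * (a * d - b * c) ** 2 / denom
    # Upper tail probability of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


n_written, n_spoken = sample_sizes(rows)
for noun, written, spoken in rows:
    if noun == "OTHER":
        continue  # OTHER only contributes to the sample sizes
    stat, p_value = chi_squared(written, n_written, spoken, n_spoken)
    print(f"{noun}: X2 = {stat:.2f}, p = {p_value:.3g}")
```

Without the OTHER row, the column sums would cover only the 60 selected nouns and would badly understate the sample sizes, so the relative frequencies being compared would be wrong.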
Stefan Evert <email@example.com>
Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.