BNCmeta: Metadata for the British National Corpus (XML edition)


Description

This data set provides complete metadata for all 4048 texts of the British National Corpus (XML edition). See Aston & Burnard (1998) for more information about the BNC, or go to http://www.natcorp.ox.ac.uk/.

The data were extracted automatically from the original BNC source files. Some transformations were applied so that all attribute names and values are given in human-readable form. The Perl scripts used for the extraction are available from http://cwb.sourceforge.net/download.php#import.

Usage

BNCmeta

Format

A data frame with 4048 rows and the columns listed below. Unless specified otherwise, columns are coded as factors.

id:

BNC document ID; character vector

title:

Title of the document; character vector

n_words:

Number of words in the document; integer vector

n_tokens:

Total number of tokens (including punctuation and deleted material); integer vector

n_w:

Number of w-units (words); integer vector

n_c:

Number of c-units (punctuation); integer vector

n_s:

Number of s-units (sentences); integer vector

publication_date:

Publication date

text_type:

Text type

context:

Spoken context

respondent_age:

Age-group of respondent

respondent_class:

Social class of respondent (NRS social grades)

respondent_sex:

Sex of respondent

interaction_type:

Interaction type

region:

Region

author_age:

Author age-group

author_domicile:

Domicile of author

author_sex:

Sex of author

author_type:

Author type

audience_age:

Audience age

domain:

Written domain

difficulty:

Written difficulty

medium:

Written medium

publication_place:

Publication place

sampling_type:

Sampling type

circulation:

Estimated circulation size

audience_sex:

Audience sex

availability:

Availability

mode:

Text mode (written/spoken)

derived_type:

Text class

genre:

David Lee's genre classification
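
Examples

The following is a minimal sketch of how the columns described above might be inspected. It assumes that the corpora package is installed and that the data set can be loaded with data(); the variable names texts_by_mode and words_by_mode are introduced here for illustration only. The factor levels are queried from the data rather than assumed.

```r
# Assumption: the 'corpora' package is installed, e.g. via install.packages("corpora")
library(corpora)

# Load the metadata table into the workspace
data(BNCmeta, package = "corpora")

# One row per BNC text; the Format section states 4048 rows
dim(BNCmeta)

# Column types: most columns are factors, counts are integer vectors
str(BNCmeta[, c("id", "n_words", "n_s", "mode", "genre")])

# Look up the actual category labels instead of guessing them
levels(BNCmeta$mode)

# Number of texts in each text mode (missing values, if any, are shown separately)
texts_by_mode <- table(BNCmeta$mode, useNA = "ifany")
texts_by_mode

# Total number of words in each text mode
words_by_mode <- tapply(BNCmeta$n_words, BNCmeta$mode, sum)
words_by_mode
```

The same pattern works for any of the other categorical columns, for example replacing mode with genre or domain.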

Author(s)

Stefan Evert <stefan.evert@fau.de>

References

Aston, Guy and Burnard, Lou (1998). The BNC Handbook. Edinburgh University Press, Edinburgh. See also the BNC homepage at http://www.natcorp.ox.ac.uk/.


corpora documentation built on May 2, 2019, 4:56 p.m.
