germaparl_by_year: Table with information on GermaParl by year

Description

This dataset summarizes the GermaParl corpus on a year-by-year basis. It ships with the package so that it can be used in the data report of the package vignette.

Usage

germaparl_by_year

Format

A data.frame with 22 rows and 6 variables with summary statistics on the GermaParl corpus on a year-by-year basis.

year

year reported on in the row (integer value)

protocols

total number of protocols included in the corpus for the respective year (integer value)

txt

number of protocols prepared based on plain text versions of the protocols (integer value)

pdf

number of protocols prepared based on PDF versions of the protocols (integer value)

size

number of tokens in subcorpus for the respective year (integer value)

unknown

share of words that could not be lemmatized and therefore carry the #unknown# tag (numeric value)
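
The columns described above can be combined into a few corpus-wide figures. The following is a minimal sketch, not code from the package: the helper summarise_by_year() is hypothetical, and the two rows in toy contain invented values that only exist so the example runs without GermaParl being installed.

```r
# Hypothetical helper (not part of GermaParl): summarise a table that has the
# documented columns year, protocols, txt, pdf, size and unknown.
summarise_by_year <- function(tab) {
  expected <- c("year", "protocols", "txt", "pdf", "size", "unknown")
  stopifnot(is.data.frame(tab), all(expected %in% colnames(tab)))
  tokens <- as.numeric(tab$size)  # convert to double to avoid integer overflow
  list(
    years = range(tab$year),
    protocols_total = sum(tab$protocols),
    tokens_total = sum(tokens),
    # token-weighted mean of the per-year share of unlemmatized words
    unknown_weighted = sum(tab$unknown * tokens) / sum(tokens)
  )
}

# Toy rows with invented values (assumption: not taken from the real table).
toy <- data.frame(
  year = c(1996L, 1997L),
  protocols = c(10L, 20L),
  txt = c(10L, 15L),
  pdf = c(0L, 5L),
  size = c(1000L, 3000L),
  unknown = c(0.02, 0.04)
)

res <- summarise_by_year(toy)
str(res)
```

The share in unknown is weighted by size because years differ in token count, so a plain mean over rows would overstate small years. With the package installed, the same helper could presumably be applied to the real table after loading it with data("germaparl_by_year", package = "GermaParl").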

Details

The table is based on v1.0.6 of the corpus. To prepare the table, the script available at data-raw/stats_for_vignette.R was used.

Value

A data.frame.


GermaParl documentation built on Oct. 23, 2020, 8:27 p.m.