The Use of Inflected or Uninflected Determiners in the Belgian Dutch Vernacular

Description

The distribution of the Belgian Dutch -e(n) suffix across 14 determiners, 14 registers, and several speaker characteristics.

Format

A data frame with 40778 rows and 12 variables.

  • Variant: The linguistic variant used in a set of alternatives (35 levels).

  • Variable: The linguistic variable specifying a set of alternatives (14 levels).

  • Inflected: Numeric variable specifying whether the linguistic variant is inflected (1) or not (0).

  • Register: The register of the data in the Spoken Dutch Corpus (14 levels: see here for their definition).

  • SpeakerID: The ID of the speaker in the Spoken Dutch Corpus (1144 levels).

  • Region: The region in which the speaker lived until the age of 18 (4 levels).

  • Sex: The sex of the speaker (2 levels).

  • BirthYear: The year in which the speaker was born (63 levels).

  • Decade: The decade in which the speaker was born (7 levels).

  • Generation: The generation cohort in which the speaker was born (5 levels).

  • Education: The level of education of the speaker (3 levels).

  • Occupation: The level of occupation of the speaker (10 levels: see here for their definition).
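
Because Inflected is coded numerically as 0/1, its mean within a group is the share of inflected tokens in that group. The following is a minimal base-R sketch of that idea. It uses a small toy data frame whose rows and labels are made up for illustration; they are not taken from TSS, and only the column names mirror the format described above.

```r
# Toy stand-in for TSS: hypothetical rows, NOT real data from the package.
toy <- data.frame(
  Variable  = c("een", "een", "een", "geen", "geen", "geen"),
  Variant   = c("een", "ne", "ne", "geen", "geen", "gene"),
  Inflected = c(0, 1, 1, 0, 0, 1)
)

# Inflected is 0/1, so the per-group mean is the proportion of inflected tokens.
infl.rate <- tapply(toy$Inflected, toy$Variable, mean)
infl.rate
```

With the real data, the same call on TSS$Inflected and TSS$Variable (or TSS$Register, TSS$Region, and so on) should give the corresponding proportions, assuming the columns are coded as listed above.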

Source

Plevoets, K. (2008) Tussen spreek- en standaardtaal. Leuven, Doctoral dissertation. Available online here and here.

Examples

library(corregp)  # provides corregp() and the TSS data set
data(TSS)
# The execution of corregp may be slow, due to bootstrapping:
tss.crg <- corregp(Variant ~ Register * Region, data = TSS, part = "Variable", b = 3000)
tss.crg
summary(tss.crg, parm = "b", add_ci = TRUE)
screeplot(tss.crg, add_ci = TRUE)
# Colour each variant: blue if it has uninflected tokens (Inflected == 0), red otherwise
tss.col <- ifelse(xtabs(~ Variant + Inflected, data = TSS)[, 1] > 0, "blue", "red")
plot(tss.crg, x_ell = TRUE, xsub = c("Register", "Region"), col_btm = tss.col, col_top = "black")
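
The colour vector in the example is built from a cross-tabulation of Variant by Inflected: the first column of that table counts tokens with Inflected equal to 0, so a variant is coloured blue when it has any uninflected tokens and red otherwise. The sketch below reproduces that step on a toy data frame so it can be run without the package. The rows and variant labels are hypothetical and serve only to show the mechanics.

```r
# Toy stand-in for TSS: hypothetical rows, NOT real data from the package.
toy <- data.frame(
  Variant   = c("een", "ne", "ne", "geen", "geen", "gene"),
  Inflected = c(0, 1, 1, 0, 0, 1)
)

# Rows are variants (sorted alphabetically), columns are Inflected = 0 and 1.
toy.tab <- xtabs(~ Variant + Inflected, data = toy)

# Column 1 holds the counts for Inflected == 0 (uninflected tokens).
toy.col <- ifelse(toy.tab[, 1] > 0, "blue", "red")
toy.col
```

The result is a character vector with one colour per variant, in the row order of the table.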