TSS: The Use of Inflected or Uninflected Determiners in the...
In corregp: Functions and Methods for Correspondence Regression


The distribution of the Belgian Dutch -e(n)-suffix with 14 determiners across 14 registers and by several speaker characteristics.

A data frame with 40778 rows and 13 variables.

Variant The linguistic variant used in a set of alternatives (35 levels).
Variable The linguistic variable specifying a set of alternatives (14 levels).
Inflected Numeric variable specifying whether the linguistic variant is inflected (1) or not (0).
Register The register of the data in the Spoken Dutch Corpus (14 levels: see here for their definition).
Register2 The dichotomization of Register into private and public.
SpeakerID The ID of the speaker in the Spoken Dutch Corpus (1144 levels).
Region The region in which the speaker lived until the age of 18 (4 levels).
Sex The sex of the speaker (2 levels).
BirthYear The year in which the speaker was born (63 levels).
Decade The decade in which the speaker was born (7 levels).
Generation The generation cohort in which the speaker was born (5 levels).
Education The level of education of the speaker (3 levels).
Occupation The level of occupation of the speaker (10 levels: see here for their definition).
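
The column layout above lends itself to simple tabulations in base R. The following is a minimal sketch on a toy data frame, not on TSS itself: the object `toy` is hypothetical, it borrows only the column names Variant, Variable, Inflected and Register2 from the format above, and every value in it is invented for illustration.

```r
# Toy stand-in for TSS: column names follow the format above, values are invented.
toy <- data.frame(
  Variant   = c("een", "ne", "een", "ne", "de", "den"),
  Variable  = c("EEN", "EEN", "EEN", "EEN", "DE", "DE"),
  Inflected = c(0, 1, 0, 1, 0, 1),
  Register2 = c("private", "private", "public", "private", "public", "private"),
  stringsAsFactors = TRUE
)

# Inflected is coded 0/1, so its mean within a group is the share of inflected tokens.
infl.share <- tapply(toy$Inflected, toy$Register2, mean)

# Each Variable groups a set of alternative Variants; cross-tabulate to see the sets.
var.tab <- xtabs(~ Variable + Variant, data = toy)

print(infl.share)
print(var.tab)
```

Because Inflected is numeric rather than a factor, the same `tapply()` call should work with any of the grouping columns (for instance Region or Generation) once the real data are loaded with `data(TSS)`.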

Plevoets, K. (2008) Tussen spreek- en standaardtaal. Leuven, Doctoral dissertation. Available online here.

data(TSS)
# The execution of corregp may be slow, due to bootstrapping (b = 3000 replications):
tss.crg <- corregp(Variant ~ Register2 * Region, data = TSS, part = "Variable", b = 3000)
tss.crg
summary(tss.crg, parm = "b", add_ci = TRUE)
screeplot(tss.crg, add_ci = TRUE)
anova(tss.crg, nf = 2)
# Colour each variant: "blue" if it has tokens with Inflected == 0 (first table column), else "red".
tss.col <- ifelse(xtabs(~ Variant + Inflected, data = TSS)[, 1] > 0, "blue", "red")
plot(tss.crg, x_ell = TRUE, xsub = c("Register2", "Region"), col_btm = tss.col, col_top = "black")
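
The colour vector `tss.col` in the example relies on the column order of the `xtabs()` table: its first column corresponds to Inflected == 0. The logic can be checked in isolation with a small sketch; the data frame `toy` below is hypothetical and its values are invented, only the column names Variant and Inflected are taken from the dataset.

```r
# Toy data: invented values, column names as in TSS.
toy <- data.frame(
  Variant   = c("een", "ne", "een", "de", "den"),
  Inflected = c(0, 1, 0, 0, 1)
)

# Rows are variants; columns are the values of Inflected ("0" first, then "1").
tab <- xtabs(~ Variant + Inflected, data = toy)

# A variant with at least one uninflected token gets "blue", all others "red".
toy.col <- ifelse(tab[, 1] > 0, "blue", "red")

print(toy.col)
```

The result is a named character vector with one colour per variant, which is the shape a per-level colour argument such as `col_btm` needs in the `plot()` call above.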