regularity: Regular and irregular Dutch verbs
In languageR: Data Sets and Functions with Analyzing Linguistic Data: A Practical Introduction to Statistics

Description

Regular and irregular Dutch verbs and selected lexical and distributional properties.

Usage

data(regularity)

Format

A data frame with 700 observations on the following 13 variables.

Verb: a factor with the verbs as levels.
WrittenFrequency: a numeric vector of logarithmically transformed frequencies in written Dutch (as available in the CELEX lexical database).
NcountStem: a numeric vector for the number of orthographic neighbors.
VerbalSynsets: a numeric vector for the number of verbal synsets in WordNet.
MeanBigramFrequency: a numeric vector for mean log bigram frequency.
InflectionalEntropy: a numeric vector for Shannon's entropy calculated for the word's inflectional variants.
Auxiliary: a factor with levels hebben, zijn and zijnheb for the verb's auxiliary in the perfect tenses.
Regularity: a factor with levels irregular and regular.
LengthInLetters: a numeric vector of the word's orthographic length.
FamilySize: a numeric vector for the number of types in the word's morphological family.
Valency: a numeric vector for the verb's valency, estimated by its number of argument structures.
NVratio: a numeric vector for the log-transformed ratio of the nominal and verbal frequencies of use.
WrittenSpokenRatio: a numeric vector for the log-transformed ratio of the frequencies in written and spoken Dutch.
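As an illustration of what InflectionalEntropy measures, the following is a minimal sketch of Shannon's entropy computed over the relative frequencies of a word's inflectional variants. The variant counts are made up for the example, the helper name shannon_entropy is hypothetical, and the base-2 logarithm is an assumption: the documentation above does not state which base was used for the data set.

```r
# Hypothetical token counts for the inflectional variants of one verb
# (illustrative numbers, not taken from CELEX or from the regularity data).
variant_counts <- c(lopen = 120, loop = 60, loopt = 40, liep = 15, liepen = 5)

# Shannon entropy of the relative-frequency distribution.
# Assumption: base-2 logarithm, so the result is in bits.
shannon_entropy <- function(counts) {
  p <- counts / sum(counts)   # relative frequencies
  p <- p[p > 0]               # variants with zero count contribute nothing
  -sum(p * log2(p))
}

shannon_entropy(variant_counts)

# A uniform distribution over n variants gives the maximum, log2(n).
shannon_entropy(rep(10, 4))
```

Under this definition, a verb whose tokens are spread evenly over its inflectional variants has a higher entropy than one dominated by a single form.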

References

Baayen, R. H. and Moscoso del Prado Martin, F. (2005) Semantic density and past-tense formation in three Germanic languages, Language, 81, 666-698.

Tabak, W., Schreuder, R. and Baayen, R. H. (2005) Lexical statistics and lexical processing: semantic density, information complexity, sex, and irregularity in Dutch, in Kepser, S. and Reis, M. (eds.), Linguistic Evidence - Empirical, Theoretical, and Computational Perspectives, Berlin: Mouton de Gruyter, pp. 529-555.

Examples

## Not run: 
data(regularity)

# ---- predicting regularity with a logistic regression model

library(rms)
regularity.dd = datadist(regularity)
options(datadist = 'regularity.dd')

regularity.lrm = lrm(Regularity ~ WrittenFrequency + 
  rcs(FamilySize, 3) + NcountStem + InflectionalEntropy + 
  Auxiliary + Valency + NVratio + WrittenSpokenRatio, 
  data = regularity, x = TRUE, y = TRUE)

anova(regularity.lrm)

# ---- model validation

validate(regularity.lrm, bw = TRUE, B = 200)
pentrace(regularity.lrm, seq(0, 0.8, by = 0.05))
regularity.lrm.pen = update(regularity.lrm, penalty = 0.6)
regularity.lrm.pen

# ---- a plot of the partial effects

plot(Predict(regularity.lrm.pen))

# ---- predicting regularity with a support vector machine

library(e1071)
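# recode the factor Auxiliary as a numeric predictor
regularity$AuxNum = as.numeric(regularity$Auxiliary)
# select predictors by name rather than by position: drop Verb,
# the response Regularity, and the original factor Auxiliary
predictors = regularity[, !(names(regularity) %in% c("Verb", "Regularity", "Auxiliary"))]
regularity.svm = svm(predictors, regularity$Regularity, cross = 10)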
summary(regularity.svm)

## End(Not run)

languageR documentation built on June 10, 2025, 9:08 a.m.
