
Regular and irregular Dutch verbs and selected lexical and distributional properties.

A data frame with 700 observations on the following 13 variables.

`Verb`

a factor with the verbs as levels.

`WrittenFrequency`

a numeric vector of logarithmically transformed frequencies in written Dutch (as available in the CELEX lexical database).

`NcountStem`

a numeric vector for the number of orthographic neighbors.

`VerbalSynsets`

a numeric vector for the number of verbal synsets in WordNet.

`MeanBigramFrequency`

a numeric vector for mean log bigram frequency.

`InflectionalEntropy`

a numeric vector for Shannon's entropy calculated for the word's inflectional variants.

`Auxiliary`

a factor with levels `hebben`, `zijn` and `zijnheb` for the verb's auxiliary in the perfect tenses.

`Regularity`

a factor with levels `irregular` and `regular`.

`LengthInLetters`

a numeric vector of the word's orthographic length.

`FamilySize`

a numeric vector for the number of types in the word's morphological family.

`Valency`

a numeric vector for the verb's valency, estimated by its number of argument structures.

`NVratio`

a numeric vector for the log-transformed ratio of the nominal and verbal frequencies of use.

`WrittenSpokenRatio`

a numeric vector for the log-transformed ratio of the frequencies in written and spoken Dutch.
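As an illustration of how a measure like `InflectionalEntropy` can be obtained, the following minimal sketch computes Shannon's entropy over the relative frequencies of a verb's inflectional variants. The counts are hypothetical and not taken from CELEX, the helper name `shannon_entropy` is invented for this example, and the base-2 logarithm is an assumption: the description above does not state which base or which set of variants was used for the data set.

```r
# Hypothetical token counts for the inflectional variants of one verb
# (illustrative numbers only, not taken from CELEX).
variant_counts <- c(lopen = 120, loop = 60, loopt = 40, liep = 20)

# Shannon entropy in bits: H = -sum(p * log2(p)).
# Variants with a zero count are dropped, since 0 * log(0) is taken as 0.
# The base-2 logarithm is an assumption made for this sketch.
shannon_entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}

H <- shannon_entropy(variant_counts)
H
```

A paradigm whose variants are used equally often yields the maximum entropy for its size, while a paradigm dominated by a single form yields a value close to zero.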

Baayen, R. H. and Moscoso del Prado Martin, F. (2005) Semantic density and past-tense formation in three Germanic languages, Language, 81, 666-698.

Tabak, W., Schreuder, R. and Baayen, R. H. (2005) Lexical statistics and
lexical processing: semantic density, information complexity, sex, and
irregularity in Dutch, in Kepser, S. and Reis, M., *Linguistic Evidence -
Empirical, Theoretical, and Computational Perspectives*, Berlin: Mouton de
Gruyter, pp. 529-555.

```
## Not run:
data(regularity)

# ---- predicting regularity with a logistic regression model
library(rms)
regularity.dd = datadist(regularity)
options(datadist = 'regularity.dd')
regularity.lrm = lrm(Regularity ~ WrittenFrequency +
  rcs(FamilySize, 3) + NcountStem + InflectionalEntropy +
  Auxiliary + Valency + NVratio + WrittenSpokenRatio,
  data = regularity, x = TRUE, y = TRUE)
anova(regularity.lrm)

# ---- model validation
validate(regularity.lrm, bw = TRUE, B = 200)
pentrace(regularity.lrm, seq(0, 0.8, by = 0.05))
regularity.lrm.pen = update(regularity.lrm, penalty = 0.6)
regularity.lrm.pen

# ---- a plot of the partial effects
plot(Predict(regularity.lrm.pen))

# ---- predicting regularity with a support vector machine
library(e1071)
regularity$AuxNum = as.numeric(regularity$Auxiliary)
regularity.svm = svm(regularity[, -c(1,8,10)], regularity$Regularity, cross = 10)
summary(regularity.svm)
## End(Not run)
```
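Before fitting the models above, it can help to inspect how regularity relates to auxiliary choice and written frequency. The sketch below uses base R only. The helper `summarise_regularity` is a hypothetical name introduced for this example, and loading the data from the `languageR` package is an assumption about where the data set is installed, so that step runs only if the package is available.

```r
# Hypothetical helper: cross-tabulate regularity against auxiliary and
# compare the mean log written frequency of the two regularity classes.
summarise_regularity <- function(df) {
  list(
    counts = table(df$Regularity, df$Auxiliary),
    mean_freq = tapply(df$WrittenFrequency, df$Regularity, mean)
  )
}

# Assumption: the data set ships with the languageR package.
if (requireNamespace("languageR", quietly = TRUE)) {
  data(regularity, package = "languageR")
  print(summarise_regularity(regularity))
}
```

The helper works on any data frame with `Regularity`, `Auxiliary` and `WrittenFrequency` columns, so it can also be applied to a subset of the verbs.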
