The frequency of the determiner 'het' in the Dutch novel 'Max Havelaar' by Multatuli (Eduard Douwes Dekker), in 99 consecutive text fragments of 1000 tokens each.
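The chunking described above can be sketched in R as follows. This is only an illustration under stated assumptions, not the procedure actually used to build the data set: the helper `chunk_frequencies` and the toy token vector are hypothetical, tokens are assumed to be already split into a character vector, matching is done on the lower-cased word form (which cannot distinguish determiner 'het' from pronoun 'het'), and an incomplete final chunk is assumed to be dropped.

```r
# Hypothetical helper (not part of any package): count a target word form
# in consecutive, non-overlapping chunks of `size` tokens.
chunk_frequencies = function(tokens, target = "het", size = 1000) {
  n_chunks = length(tokens) %/% size            # number of complete chunks
  tokens = tokens[seq_len(n_chunks * size)]     # assumption: drop the incomplete tail
  chunk = rep(seq_len(n_chunks), each = size)   # chunk index for every token
  freq = tapply(tolower(tokens) == target, chunk, sum)
  data.frame(Chunk = seq_len(n_chunks), Frequency = as.numeric(freq))
}

# Toy input for illustration only; this is NOT the text of the novel.
set.seed(1)
toy = sample(c("het", "de", "een", "en"), 3500, replace = TRUE)
res = chunk_frequencies(toy, size = 1000)
print(res)
```

The result has the same two columns as the data frame documented here, `Chunk` and `Frequency`, with one row per complete chunk.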
A data frame with 99 observations on the following 2 variables.
Chunk
a numeric vector with the indices of the text fragments.
Frequency
a numeric vector with the frequencies of the determiner 'het' in the text fragments.
The text of Max Havelaar was obtained from Project Gutenberg at http://www.gutenberg.org/wiki/Main_Page
## Not run:
data(havelaar)
n = 1000                           # number of tokens per text fragment
p = mean(havelaar$Frequency / n)   # mean relative frequency of 'het'
# quantile-quantile plot: observed frequencies against binomial(n, p) quantiles
plot(qbinom(ppoints(99), n, p), sort(havelaar$Frequency),
     xlab = paste("quantiles of (", n, ",", round(p, 4),
                  ")-binomial", sep=""), ylab = "frequencies")
# Kolmogorov-Smirnov test against a Poisson distribution with the sample mean
lambda = mean(havelaar$Frequency)
ks.test(havelaar$Frequency, "ppois", lambda)
# jittering breaks the ties in the counts, which ks.test otherwise warns about
ks.test(jitter(havelaar$Frequency), "ppois", lambda)
## End(Not run)