lnre_vgc: Expected Vocabulary Growth Curves of LNRE Model (zipfR)

Description

lnre.vgc computes expected vocabulary growth curves E[V(N)] according to a LNRE model, returning an object of class vgc. Data points are returned for the specified values of N, optionally including estimated variances and/or growth curves for the spectrum elements E[V_m(N)].

Usage

  lnre.vgc(model, N, m.max=0, variances=FALSE)

Arguments

model

an object belonging to a subclass of lnre, representing a LNRE model

N

an increasing sequence of non-negative integers, specifying the sample sizes N for which vocabulary growth data should be calculated

m.max

if specified, include vocabulary growth curves E[V_m(N)] for spectrum elements up to m.max. Must be a single integer in the range 1 … 9; with the default m.max=0, no spectrum elements are included.

variances

if TRUE, include variance estimates for the vocabulary size (and the spectrum elements, if applicable)
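
As a rough illustration of how m.max and variances shape the result, the sketch below calls lnre.vgc three times on the same model and lists the columns of each result. It assumes the zipfR package is installed and that the column names follow the vgc conventions (V1, V2, ... for spectrum elements, VV for the variance of V); the names N.steps, plain, spec and vars are made up for this example.

```r
library(zipfR)

## fit a ZM model to the Italian ultra- prefix spectrum shipped with zipfR
data(ItaUltra.spc)
zm <- lnre("zm", ItaUltra.spc)

## sample sizes must form an increasing sequence
N.steps <- (1:10) * 700

plain <- lnre.vgc(zm, N.steps)                    # E[V(N)] only
spec  <- lnre.vgc(zm, N.steps, m.max = 2)         # adds E[V_1(N)] and E[V_2(N)]
vars  <- lnre.vgc(zm, N.steps, variances = TRUE)  # adds variance of V(N)

## inspect which columns each call produced
## (assumed names: N, V, V1, V2, VV)
print(names(plain))
print(names(spec))
print(names(vars))
```

Leaving m.max at its default keeps the result small, which may matter when N contains many sample sizes.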

Value

An object of class vgc, representing the expected vocabulary growth curve E[V(N)] of the LNRE model passed as model, with data points at the sample sizes N.

If m.max is specified, expected growth curves E[V_m(N)] for spectrum elements (hapax legomena, dis legomena, etc.) up to m.max are also computed.

If variances=TRUE, the vgc object includes variance data for all growth curves.
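
The following sketch shows one way to read the data points back out of the returned object, using the zipfR accessor functions N, V, Vm and VV. The interval computed at the end is only an assumption for illustration (a normal approximation with a factor of 1.96); it is not necessarily how plot derives its confidence intervals. The names v.sd and ci are hypothetical.

```r
library(zipfR)

data(ItaUltra.spc)
zm <- lnre("zm", ItaUltra.spc)

## expected V and V_1 at ten sample sizes, with variances
zm.vgc <- lnre.vgc(zm, (1:10) * 700, m.max = 1, variances = TRUE)

n    <- N(zm.vgc)          # sample sizes
ev   <- V(zm.vgc)          # expected vocabulary size E[V(N)]
ev1  <- Vm(zm.vgc, 1)      # expected number of hapax legomena E[V_1(N)]
v.sd <- sqrt(VV(zm.vgc))   # standard deviation of V(N)

## assumption: normal approximation for a rough 95% interval around E[V(N)]
ci <- data.frame(N = n, EV = ev,
                 lower = ev - 1.96 * v.sd,
                 upper = ev + 1.96 * v.sd,
                 EV1 = ev1)
print(ci)
```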

See Also

vgc for more information about vocabulary growth curves and links to relevant functions; lnre for more information about LNRE models and how to initialize them

Examples

## attach the zipfR package, which provides lnre(), lnre.vgc() and the datasets
library(zipfR)

## load Dickens dataset and estimate lnre models
data(Dickens.spc)

zm <- lnre("zm",Dickens.spc)
fzm <- lnre("fzm",Dickens.spc,exact=FALSE)
gigp <- lnre("gigp",Dickens.spc)

## compute expected V and V_1 growth up to 100 million tokens
## in 100 steps of 1 million tokens
zm.vgc <- lnre.vgc(zm,(1:100)*1e6, m.max=1)
fzm.vgc <- lnre.vgc(fzm,(1:100)*1e6, m.max=1)
gigp.vgc <- lnre.vgc(gigp,(1:100)*1e6, m.max=1)

## compare
plot(zm.vgc,fzm.vgc,gigp.vgc,add.m=1,legend=c("ZM","fZM","GIGP"))

## load Italian ultra- prefix data
data(ItaUltra.spc)

## compute zm model
zm <- lnre("zm",ItaUltra.spc)

## compute vgc up to about twice the sample size
## with variance of V
zm.vgc <- lnre.vgc(zm,(1:100)*70, variances=TRUE)

## plot with confidence intervals derived from variance in
## vgc (with larger datasets, ci will typically be almost
## invisible)
plot(zm.vgc)

zipfR documentation built on Nov. 13, 2020, 3:01 a.m.