ZIPF: The zipf and zero adjusted zipf distributions for fitting a...


Description

The function ZIPF() defines the zipf distribution; see Johnson et al. (2005), section 11.2.20, pp. 527-528. The zipf distribution is a one-parameter distribution with long tails (a discrete version of the Pareto distribution). The function ZIPF() creates a gamlss.family object to be used in GAMLSS fitting. The functions dZIPF, pZIPF, qZIPF and rZIPF define the density, distribution function, quantile function and random generation for the zipf, ZIPF(), distribution. The function zetaP() defines the zeta function and is based on the zeta function defined in the VGAM package of Thomas Yee; see Yee (2017).

The zipf distribution is defined on y=1,2,3, ...,Inf; the zero adjusted zipf also permits y=0, so it is defined on y=0,1,2,3, ...,Inf. The function ZAZIPF() defines the zero adjusted zipf distribution and creates a gamlss.family object to be used in GAMLSS fitting. The functions dZAZIPF, pZAZIPF, qZAZIPF and rZAZIPF define the density, distribution function, quantile function and random generation for the zero adjusted zipf, ZAZIPF(), distribution.

Usage

ZIPF(mu.link = "log")
dZIPF(x, mu = 1, log = FALSE)
pZIPF(q, mu = 1, lower.tail = TRUE, log.p = FALSE)
qZIPF(p, mu = 1, lower.tail = TRUE, log.p = FALSE, 
       max.value = 10000)
rZIPF(n, mu = 1, max.value = 10000)
zetaP(x)
ZAZIPF(mu.link = "log", sigma.link = "logit")
dZAZIPF(x, mu = 0.5, sigma = 0.1, log = FALSE)
pZAZIPF(q, mu = 0.5, sigma = 0.1, lower.tail = TRUE, 
        log.p = FALSE)
qZAZIPF(p, mu = 0.5, sigma = 0.1, lower.tail = TRUE, 
       log.p = FALSE, max.value = 10000)
rZAZIPF(n, mu = 0.5, sigma = 0.1, max.value = 10000)

Arguments

mu.link

the link function for the parameter mu with default log

x,q

vectors of (non-negative integer) quantiles

p

vector of probabilities

mu

vector of positive parameter values

log, log.p

logical; if TRUE, probabilities p are given as log(p)

lower.tail

logical; if TRUE (default), probabilities are P[X <= x], otherwise, P[X > x]

n

number of random values to return

max.value

a constant, set by default to 10000. It is used in the q function, where it sets how far the numerical algorithm looks for the quantile. For zipf data the value may have to be increased, at considerable computational cost.

sigma.link

the link function for the parameter sigma with default logit

sigma

a vector of probabilities of zero

Details

The probability function of the zipf distribution, ZIPF, is:

f(y|μ)= y^{-(μ+1)}/ζ(μ+1)

for y=1,2, ...,Inf, μ>0, where ζ() is the (Riemann) zeta function.

The distribution has mean ζ(μ)/ζ(μ+1), which is finite for μ>1, and variance {ζ(μ+1)ζ(μ-1)-[ζ(μ)]^2 }/ [ζ(μ+1)]^2, which is finite for μ>2.
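As a rough numerical check of the formulas above, the following base R sketch evaluates the zipf probability function and its mean without loading gamlss.dist. The helper names zeta_approx(), zipf_pmf_sketch() and zazipf_pmf_sketch() are hypothetical and are not part of the package. The zeta function is approximated by a truncated sum with an Euler-Maclaurin tail correction, not by zetaP(). The zero adjusted form used here, P(Y=0)=sigma and P(Y=y)=(1-sigma) f(y|μ) for y>=1, is an assumption based on the description of sigma as the probability of zero.

```r
# Approximate the Riemann zeta function for s > 1 (hypothetical helper):
# truncated sum plus the leading Euler-Maclaurin tail terms.
zeta_approx <- function(s, n_terms = 1e5) {
  k <- seq_len(n_terms)
  sum(k^(-s)) + n_terms^(1 - s) / (s - 1) - n_terms^(-s) / 2
}

# Zipf probability function f(y | mu) = y^-(mu + 1) / zeta(mu + 1), y = 1, 2, ...
zipf_pmf_sketch <- function(y, mu) {
  ifelse(y >= 1, y^(-(mu + 1)) / zeta_approx(mu + 1), 0)
}

# Assumed zero adjusted form: mass sigma at 0, the rest spread as zipf.
zazipf_pmf_sketch <- function(y, mu, sigma) {
  ifelse(y == 0, sigma, (1 - sigma) * zipf_pmf_sketch(y, mu))
}

mu <- 2
y <- 1:20
probs <- zipf_pmf_sketch(y, mu)

# Mean from the closed form zeta(mu) / zeta(mu + 1); finite because mu > 1.
mean_closed <- zeta_approx(mu) / zeta_approx(mu + 1)
```

With gamlss.dist loaded, probs could be compared against dZIPF(y, mu = 2) as a sanity check.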

Value

The function ZIPF() returns a gamlss.family object which can be used to fit a zipf distribution in the gamlss() function.
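A minimal sketch of such a fit, assuming the gamlss package (which loads gamlss.dist) is installed. The simulated data, the seed and the object names dat, m and mu_hat are illustrative choices only.

```r
library(gamlss)  # also attaches gamlss.dist, which provides ZIPF() and rZIPF()

set.seed(1)
# Simulate zipf data with a known parameter value (illustrative only).
dat <- data.frame(y = rZIPF(200, mu = 1))

# Intercept-only model; mu is modelled on the log scale by default.
m <- gamlss(y ~ 1, family = ZIPF, data = dat,
            control = gamlss.control(trace = FALSE))

# Back-transform the fitted intercept to the mu scale.
mu_hat <- exp(coef(m))
```

Because mu uses a log link by default, the coefficient is exponentiated to obtain the estimate on the original scale.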

Note

Because the zipf distribution has very long tails, the max.value argument of the q and r functions may need to be increased.
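The effect of a search limit can be illustrated with a small base R sketch. This is not the algorithm used by qZIPF(). The functions zeta_approx() and zipf_quantile_sketch() are hypothetical helpers, and returning the cap when the target probability is not reached is a choice made for this illustration only.

```r
# Approximate the Riemann zeta function for s > 1 (hypothetical helper).
zeta_approx <- function(s, n_terms = 1e5) {
  k <- seq_len(n_terms)
  sum(k^(-s)) + n_terms^(1 - s) / (s - 1) - n_terms^(-s) / 2
}

# Smallest y in 1..max_value whose cumulative probability reaches p.
# If no such y exists within the limit, the limit itself is returned.
zipf_quantile_sketch <- function(p, mu, max_value = 10000) {
  y <- seq_len(max_value)
  cdf <- cumsum(y^(-(mu + 1))) / zeta_approx(mu + 1)
  hit <- which(cdf >= p)
  if (length(hit) == 0) max_value else hit[1]
}

# An extreme upper quantile with mu = 1: the limit of 10000 is too small,
# so the search stops at the cap; a wider limit finds a larger value.
q_capped <- zipf_quantile_sketch(0.99999, mu = 1, max_value = 10000)
q_wide   <- zipf_quantile_sketch(0.99999, mu = 1, max_value = 1e6)
```

The wider search allocates a vector one hundred times longer, which reflects the computational cost mentioned for max.value.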

Author(s)

Mikis Stasinopoulos and Bob Rigby

References

N. L. Johnson, A. W. Kemp, and S. Kotz. (2005) Univariate Discrete Distributions. Wiley, New York, 3rd edition.

Thomas W. Yee (2017). VGAM: Vector Generalized Linear and Additive Models. R package version 1.0-3. https://CRAN.R-project.org/package=VGAM

Rigby, R. A. and Stasinopoulos D. M. (2005). Generalized additive models for location, scale and shape (with discussion), Appl. Statist., 54, part 3, pp 507-554.

Stasinopoulos D. M. and Rigby R. A. (2007) Generalized additive models for location scale and shape (GAMLSS) in R. Journal of Statistical Software, Vol. 23, Issue 7, Dec 2007, http://www.jstatsoft.org/v23/i07.

Stasinopoulos D. M., Rigby R.A., Heller G., Voudouris V., and De Bastiani F., (2017) Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC.

See Also

PO, LG, GEOM, YULE

Examples

# ZIPF: density, distribution function, quantile function and a random sample
par(mfrow = c(2, 2))
y <- seq(1, 20, 1)
plot(y, dZIPF(y), type = "h")
q <- seq(1, 20, 1)
plot(q, pZIPF(q), type = "h")
p <- seq(0.0001, 0.999, 0.05)
plot(p, qZIPF(p), type = "s")
dat <- rZIPF(100)
hist(dat)
# ZAZIPF: the same four plots; the support now includes y = 0
y <- seq(0, 20, 1)
plot(y, dZAZIPF(y, mu = 0.9, sigma = 0.1), type = "h")
q <- seq(1, 20, 1)
plot(q, pZAZIPF(q, mu = 0.9, sigma = 0.1), type = "h")
p <- seq(0.0001, 0.999, 0.05)
plot(p, qZAZIPF(p, mu = 0.9, sigma = 0.1), type = "s")
dat <- rZAZIPF(100, mu = 0.9, sigma = 0.1)
hist(dat)

Stan125/gamlss.dist documentation built on May 12, 2019, 7:38 a.m.