kInflatedLogConDiscr: Compute a mixture of a point mass at 0 and a log-concave...

Description Usage Arguments Details Value Author(s) References Examples

View source: R/kInflatedLogConDiscr.r

Description

Using an EM algorithm, compute an estimate of a mixture of a point mass at k and a log-concave probability mass function from discrete i.i.d. data.

Usage

1
2
kInflatedLogConDiscr(x, k = 0, prec1 = 1e-10, prec2 = 1e-15, 
    itermax = 200, output = TRUE, theta0 = 0.5, p0 = NA)

Arguments

x

Vector of observations.

k

Point at which inflation should be assumed. Must be in x_1, x_1 + 1, …, x_{n - 1}, x_n.

prec1

Precision for stopping criterion.

prec2

Precision to remove ends of support in case weights <prec2.

itermax

Maximal number of iterations of EM algorithm.

output

Logical, if TRUE, progress of EM algorithm is shown.

theta0

Optional initialization for θ_0, the point mass at k.

p0

Optional initialization for the PMF.

Details

Given a vector of observations x_n = (x_1, …, x_n) from a discrete PMF with a (potential) point mass at k (typically k = 0), kInflatedLogConDiscr computes a pmf that is a mixture between a point mass at k and a log-concave PMF on x. To accomplish this, an EM algorithm is used.

Value

A list containing the following elementes:

z

The support.

f

The estimated k-inflated log-concave PMF.

E(L)

The value of the expected composite log-likelihood at the maximum.

loglik

The value of the composite log-likelihood at the maximum.

theta

The estimated weight at k.

logconc.pmf

The log-concave part of the mixture.

logconc.z

The support of logconc.pmf.

Author(s)

Kaspar Rufibach (maintainer) kaspar.rufibach@gmail.com
http://www.kasparrufibach.ch
Fadoua Balabdaoui fadoua@ceremade.dauphine.fr
http://www.ceremade.dauphine.fr/~fadoua
Hanna Jankowski hkj@mathstat.yorku.ca
http://www.math.yorku.ca/~hkj
Kathrin Weyermann

References

Balabdaoui, F., Jankowski, H., Rufibach, K., and Pavlides, M. (2013). Maximum likelihood estimation and confidence bands for a discrete log-concave distribution. J. R. Stat. Soc. Ser. B Stat. Methodol., 75(4), 769–790.

Weyermann, K. (2007). An Active Set Algorithm for Log-Concave Discrete Distributions. MSc thesis, University of Bern (Supervisor: Lutz Duembgen).

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
## -----------------------------------------------
## generate zero-inflated negative binomial sample
## -----------------------------------------------
set.seed(2011)
n <- 100
theta <- 0.05
r <- 6
p <- 0.3
x <- rnbinom(n, r, p)

## inflate at 0
x <- ifelse(runif(n) <= theta, 0, x)

## estimate log-concave MLE
fit1 <- logConDiscrMLE(x, w = NA, psi_o = NA, prec = 1e-05, output = TRUE)

## estimate zero-inflated log-concave MLE
fit2 <- kInflatedLogConDiscr(x, k = 0)

## plot the results
par(mfrow = c(1, 1), las = 1)
plot(fit1$x, exp(fit1$psi), type = "b", col = 2, xlim = range(x), xlab = "x", 
    ylim = c(0, max(exp(fit1$psi), fit2$f)), ylab = "PMF", 
    main = "Estimate MLE from a zero-inflated negative-binomial", pch = 19)
lines(fit2$z, fit2$f, type = "b", col = 4, pch = 15)

## add the true PMF we sampled from
z <- fit2$z
f.true <- theta * c(1, rep(0, length(z) - 1)) + (1 - theta) * dnbinom(z, r, p)
lines(z, f.true, col = 6, type = "b", pch = 17)

legend("topright", c("log-concave MLE", "zero-inflated log-concave MLE", 
    "true PMF"), col = c(2, 4, 6), lty = c(1, 1, 1), pch = c(19, 15, 17), 
    bty = "n")
    
## Not run: 
## -----------------------------------------------
## generate seven-inflated negative binomial sample
## -----------------------------------------------
theta <- 0.05
r <- 4
p <- 0.3
n <- 10000
x <- rnbinom(n, r, p)
x <- ifelse(runif(n) <= theta, 7, x)
x <- c(x, rep(7, 10))

## compute different estimates
zero.mle <- kInflatedLogConDiscr(x, k = 7)
mle <- logConDiscrMLE(x, output = FALSE)
f.mle <- exp(mle$psiSupp)
z<-	zero.mle$z
f1 <- theta * c(rep(0, 7 - min(x)), 1, rep(0, max(x) - 7))
f2 <- (1 - theta) * dnbinom(z, r, p)
f.true <- f1 + f2
true <- dnbinom(z, r, p)
f.fit <- zero.mle$f
xx <- sort(unique(x))
emp <- rep(0, length(z))
emp[xx - min(x) + 1] <- as.vector(table(x) / n)

## plot results
plot(z, f.true, type = "l", ylim = c(0, max(emp)), xlab = "x", 
    ylab = "PMF", main = "Illustration k-inflated estimator")
points(z, true, type = "l", lty = 2)
points(z, f.fit, type = "l", col = "red")
points(z, zero.mle$logconc.pmf, type = "l", col = "red", lty = 2)
points(min(x):max(x), f.mle, type = "l", col = "green")
points(z, emp, type = "l", col = "purple")
points(z, emp, col = "purple")
legend("topright", inset = 0.05, c("true", "true less seven", "seven-inflated", 
    "recovered", "logconc", "empirical"), lty=c(1, 2, 1, 2, 1, 1), col = c("black", 
    "black", "red", "red", "green", "purple"))

## zoom in near mode
subs <- 4:12
plot(z[subs], f.true[subs], type = "l", ylim = c(0, max(emp)), xlab = "x", 
    ylab = "PMF", main = "Illustration k-inflated estimator")
points(z[subs], true[subs], type = "l", lty = 2)
points(z[subs], f.fit[subs], type = "l", col = "red")
points(z[subs], zero.mle$logconc.pmf[subs], type = "l", col = "red", lty = 2)
points((min(x):max(x))[subs], f.mle[subs], type = "l", col = "green")
points(z[subs], emp[subs], type = "l", col = "purple")
points(z[subs], emp[subs], col = "purple")

## End(Not run)

logcondiscr documentation built on May 2, 2019, 3:35 p.m.