Trellis Displays of Tukey's Hanging Rootograms
Description
Displays hanging rootograms.
Usage
rootogram(x, ...)
## S3 method for class 'formula'
rootogram(x, data = parent.frame(),
          ylab = expression(sqrt(P(X == x))),
          prepanel = prepanel.rootogram,
          panel = panel.rootogram,
          ...,
          probability = TRUE)

prepanel.rootogram(x, y = table(x),
                   dfun = NULL,
                   transformation = sqrt,
                   hang = TRUE,
                   probability = TRUE,
                   ...)

panel.rootogram(x, y = table(x),
                dfun = NULL,
                col = plot.line$col,
                lty = plot.line$lty,
                lwd = plot.line$lwd,
                alpha = plot.line$alpha,
                transformation = sqrt,
                hang = TRUE,
                probability = TRUE,
                type = "l", pch = 16,
                ...)

Arguments
x, y 
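For the "formula" method, x is a formula of the form ~x or y ~ x. In the first case, x is taken to be a vector of raw observations, from which an observed frequency distribution is computed. In the second case, x is taken to be the unique values and y the corresponding frequencies. A similar interpretation holds for x and y in prepanel.rootogram and panel.rootogram. Note that the data are assumed to arise from a discrete distribution with some probability mass function. See details below. 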
data 
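For the "formula" method, an object (typically a data frame) containing the variables in the formula. By default, variables are looked up in the environment from which the function was called. 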
dfun 
a probability mass function, to be evaluated at unique x values 
prepanel, panel 
panel and prepanel function used to create the display. 
ylab 
the y-axis label; typically a character string or an expression. 
col, lty, lwd, alpha 
graphical parameters 
transformation 
a vectorized function. Relative frequencies (observed) and theoretical probabilities (dfun) are transformed by this function before being plotted. 
hang 
logical, whether lines representing observed relative frequencies should “hang” from the curve representing the theoretical probabilities. 
probability 
A logical flag, controlling whether the y-values are to be standardized to be probabilities by dividing by their sum. 
type 
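A character vector consisting of one or both of "p" and "l". If "p" is included, the evaluated values of dfun are drawn as points; if "l" is included, they are joined by lines. 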
pch 
The plotting character to be used for the "p" type. 
... 
extra arguments, passed on as appropriate. Standard lattice arguments as well as arguments to panel.rootogram can be supplied directly in the high-level rootogram call. 
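As an illustration of how panel arguments can be supplied in the high-level call, the following sketch draws a "standing" rather than hanging rootogram and marks the fitted p.m.f. with both points and lines. It assumes the latticeExtra package is installed; the simulated data and the Poisson mean of 50 are chosen only for the example.

```r
library(lattice)
library(latticeExtra)  # provides rootogram()

set.seed(1)                              # reproducible illustration
x <- rpois(1000, lambda = 50)

## hang and type are not arguments of rootogram() itself;
## they reach prepanel.rootogram() and panel.rootogram() through '...'
p_stand <- rootogram(~x,
                     dfun = function(x) dpois(x, lambda = 50),
                     hang = FALSE,        # bars rise from zero instead of hanging
                     type = c("p", "l"))  # fitted p.m.f. as points joined by lines
p_stand
```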
Details
This function implements Tukey's hanging rootograms. As implemented, rootogram assumes that the data arise from a discrete distribution (either supplied in raw form, when y is unspecified, or in terms of the frequency distribution) with some unknown probability mass function (p.m.f.). The purpose of the plot is to check whether the supplied theoretical p.m.f. dfun is a reasonable fit for the data.
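The quantities compared in a hanging rootogram can be sketched in base R as follows. This is an illustration of the idea under the default settings (transformation = sqrt, hang = TRUE, probability = TRUE), not the package's own code; the variable names obs, fit, top and bottom are introduced here only for the example.

```r
set.seed(1)                          # reproducible illustration
x <- rpois(1000, lambda = 50)        # raw discrete observations

tab  <- table(x)                     # observed frequency distribution
vals <- as.numeric(names(tab))       # unique x values, in increasing order
obs  <- as.numeric(tab) / sum(tab)   # relative frequencies
fit  <- dpois(vals, lambda = 50)     # theoretical p.m.f. at the unique values

top    <- sqrt(fit)                  # each bar hangs from the fitted curve ...
bottom <- sqrt(fit) - sqrt(obs)      # ... and extends down by sqrt(observed)

head(data.frame(x = vals, top = top, bottom = bottom))
```

When the fit is good, the lower ends of the bars stay close to zero, so departures from the horizontal axis are what the eye should look for.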
It is reasonable to consider rootograms for continuous data by discretizing them (similar to a histogram), but this must be done by the user before calling rootogram. An example is given below.
Also consider the rootogram function in the vcd package, especially if the number of unique values is small.
Value
rootogram produces an object of class "trellis". The update method can be used to update components of the object and the print method (usually called by default) will plot it on an appropriate plotting device.
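A brief sketch of working with the returned object, assuming latticeExtra is installed; the axis label and title below are arbitrary choices for the example.

```r
library(lattice)
library(latticeExtra)  # provides rootogram()

set.seed(1)
x <- rpois(1000, lambda = 50)

p <- rootogram(~x, dfun = function(x) dpois(x, lambda = 50))
class(p)                                 # a "trellis" object; nothing is drawn yet

## update() returns a modified copy and leaves p unchanged
p2 <- update(p, xlab = "Count", main = "Poisson(50) fit")

print(p2)                                # explicit print() draws the plot
```

An explicit print() is needed inside functions, loops, and scripts run with source(), where auto-printing does not occur.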
Author(s)
Deepayan Sarkar deepayan.sarkar@gmail.com
References
John W. Tukey (1972) Some graphic and semigraphic displays. In T. A. Bancroft (Ed) Statistical Papers in Honor of George W. Snedecor, pp. 293–316. Available online at http://www.edwardtufte.com/tufte/tukey
See Also
xyplot
Examples
library(lattice)
library(latticeExtra)  # provides rootogram()

x <- rpois(1000, lambda = 50)
p <- rootogram(~x, dfun = function(x) dpois(x, lambda = 50))
p

## Compare the same sample against several candidate Poisson means
lambdav <- c(30, 40, 50, 60, 70)
update(p[rep(1, length(lambdav))],
       aspect = "xy",
       panel = function(x, ...) {
           panel.rootogram(x,
                           dfun = function(x)
                               dpois(x, lambda = lambdav[panel.number()]))
       })

## Closer candidates; the prepanel function combines limits over all lambdas
lambdav <- c(46, 48, 50, 52, 54)
update(p[rep(1, length(lambdav))],
       aspect = "xy",
       prepanel = function(x, ...) {
           tmp <-
               lapply(lambdav,
                      function(lambda) {
                          prepanel.rootogram(x,
                                             dfun = function(x)
                                                 dpois(x, lambda = lambda))
                      })
           list(xlim = range(sapply(tmp, "[[", "xlim")),
                ylim = range(sapply(tmp, "[[", "ylim")),
                dx = do.call("c", lapply(tmp, "[[", "dx")),
                dy = do.call("c", lapply(tmp, "[[", "dy")))
       },
       panel = function(x, ...) {
           panel.rootogram(x,
                           dfun = function(x)
                               dpois(x, lambda = lambdav[panel.number()]))
           grid::grid.text(bquote(Poisson(lambda == .(foo)),
                                  where = list(foo = lambdav[panel.number()])),
                           y = 0.15,
                           gp = grid::gpar(cex = 1.5))
       },
       xlab = "",
       sub = "Random sample from Poisson(50)")

## Example using continuous data
xnorm <- rnorm(1000)

## 'discretize' by binning and replacing data by bin midpoints
h <- hist(xnorm, plot = FALSE)

## Option 1: Assume bin probabilities proportional to dnorm()
norm.factor <- sum(dnorm(h$mids, mean(xnorm), sd(xnorm)))
rootogram(counts ~ mids, data = h,
          dfun = function(x) {
              dnorm(x, mean(xnorm), sd(xnorm)) / norm.factor
          })

## Option 2: Compute probabilities explicitly using pnorm()
pdisc <- diff(pnorm(h$breaks, mean = mean(xnorm), sd = sd(xnorm)))
pdisc <- pdisc / sum(pdisc)
rootogram(counts ~ mids, data = h,
          dfun = function(x) {
              ## map each midpoint to the index of its bin
              f <- factor(x, levels = h$mids)
              pdisc[f]
          })
