panel.bpplot  R Documentation 
For all their good points, box plots have a high ink/information ratio in that they mainly display 3 quartiles. Many practitioners have found that the "outer values" are difficult to explain to non-statisticians, and many feel that the notion of "outliers" is too dependent on (false) expectations that data distributions should be Gaussian.
panel.bpplot is a panel function for use with trellis, especially for bwplot. It draws box plots (without the whiskers) with any number of user-specified "corners" (corresponding to different quantiles), but it also draws box-percentile plots similar to those drawn by Jeffrey Banfield's (umsfjban@bill.oscs.montana.edu) bpplot function.
To quote from Banfield, "box-percentile plots supply more
information about the univariate distributions. At any height the
width of the irregular 'box' is proportional to the percentile of that
height, up to the 50th percentile, and above the 50th percentile the
width is proportional to 100 minus the percentile. Thus, the width at
any given height is proportional to the percent of observations that
are more extreme in that direction. As in boxplots, the median, 25th
and 75th percentiles are marked with line segments across the box."
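The width rule in Banfield's description can be sketched in a few lines of base R. This is only an illustration of the quoted definition, not the code used by bpplot or panel.bpplot; the variable names (xs, pct, width) are made up for the example, and the empirical CDF is assumed as the percentile estimate.

```r
# Sketch: relative widths of a box-percentile outline,
# following the quoted description (not the Hmisc implementation).
set.seed(1)
x <- rnorm(200)

# Percentile (in percent) of each sorted observation, from the empirical CDF
xs  <- sort(x)
pct <- 100 * ecdf(x)(xs)

# Width is proportional to the percentile up to the 50th percentile,
# and to 100 minus the percentile above it
width <- ifelse(pct <= 50, pct, 100 - pct)

# The outline is widest near the median and narrows toward both tails
c(median = median(x), widest = xs[which.max(width)])
```

Plotting xs against +width and -width would trace the irregular "box" the quote describes.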
panel.bpplot can also be used with base graphics to add extended box plots to an existing plot, by specifying nogrid=TRUE, height=....
panel.bpplot is a generalization of bpplot and panel.bwplot in that it works with trellis (making the plots horizontal so that category labels are more visible), it allows the user to specify the quantiles to connect and those for which to draw reference lines, and it displays means (by default using dots).
bpplt draws horizontal box-percentile plots much like those drawn by panel.bpplot but taking as the starting point a matrix containing quantiles summarizing the data. bpplt is primarily intended to be used internally by plot.summary.formula.reverse or plot.summaryM, but when used with no arguments it has a general purpose: to draw an annotated example box-percentile plot with the default quantiles used and with the mean drawn with a solid dot. This schematic plot is rendered nicely in PostScript with an image height of 3.5 inches.
bppltp is like bpplt but for plotly graphics, and it does not draw an annotated extended box plot example.
bpplotM uses the lattice bwplot function to depict multiple numeric continuous variables with varying scales in a single lattice graph, after reshaping the dataset into a tall and thin format.
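As a rough illustration of what a "tall and thin" format means here, the following base R sketch stacks two numeric columns into one value column with a variable indicator. It is an assumption about the general shape of the reshaped data, not bpplotM's internal code; the data frame and the names tall, variable, and value are hypothetical.

```r
# Sketch: reshape several numeric columns into one "tall and thin" data frame.
# Illustrative only; bpplotM performs its own reshaping internally.
set.seed(1)
n <- 6
d <- data.frame(treatment = rep(c('a', 'b'), each = n / 2),
                age = rnorm(n, 40, 10),
                bp  = rnorm(n, 120, 12))

vars <- c('age', 'bp')

# One row per (observation, variable) pair: the grouping column is repeated,
# and the analysis columns are stacked end to end
tall <- data.frame(treatment = rep(d$treatment, times = length(vars)),
                   variable  = rep(vars, each = nrow(d)),
                   value     = unlist(d[vars], use.names = FALSE))
tall
```

With the data in this shape, each level of variable can be drawn in its own panel with its own scale.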
panel.bpplot(x, y, box.ratio=1, means=TRUE, qref=c(.5,.25,.75),
             probs=c(.05,.125,.25,.375), nout=0,
             nloc=c('right lower', 'right', 'left', 'none'), cex.n=.7,
             datadensity=FALSE, scat1d.opts=NULL,
             violin=FALSE, violin.opts=NULL,
             font=box.dot$font, pch=box.dot$pch,
             cex.means=box.dot$cex, col=box.dot$col,
             nogrid=NULL, height=NULL, ...)

# E.g. bwplot(formula, panel=panel.bpplot, panel.bpplot.parameters)

bpplt(stats, xlim, xlab='', box.ratio=1, means=TRUE,
      qref=c(.5,.25,.75), qomit=c(.025,.975),
      pch=16, cex.labels=par('cex'),
      cex.points=if(prototype)1 else 0.5,
      grid=FALSE)

bppltp(p=plotly::plot_ly(),
       stats, xlim, xlab='', box.ratio=1, means=TRUE,
       qref=c(.5,.25,.75), qomit=c(.025,.975),
       teststat=NULL, showlegend=TRUE)

bpplotM(formula=NULL, groups=NULL, data=NULL, subset=NULL, na.action=NULL,
        qlim=0.01, xlim=NULL,
        nloc=c('right lower','right','left','none'),
        vnames=c('labels', 'names'), cex.n=.7, cex.strip=1,
        outerlabels=TRUE, ...)
x 
continuous variable whose distribution is to be examined 
y 
grouping variable 
box.ratio 
see 
means 
set to FALSE to suppress drawing the means
qref 
vector of quantiles for which to draw reference lines. These do not
need to be included in probs.
probs 
vector of quantiles to display in the box plot. These should all be
less than 0.5; the mirror-image quantiles are added automatically. By
default, probs=c(.05,.125,.25,.375).
nout 
tells the function to use 
nloc 
location to plot number of non 
cex.n 
character size for 
datadensity 
set to TRUE to draw a data density (rug plot) with each box plot
scat1d.opts 
a list containing named arguments (without abbreviations) to pass to scat1d
violin 
set to 
violin.opts 
a list of options to pass to 
cex.means 
character size for dots representing means 
font,pch,col 
see 
nogrid 
set to TRUE to use panel.bpplot with base graphics
height 
if 
... 
arguments passed to 
stats,xlim,xlab,qomit,cex.labels,cex.points,grid 
undocumented arguments to bpplt
p 
an already-started plotly object
teststat 
an html expression containing a test statistic 
showlegend 
set to 
formula 
a formula with continuous numeric analysis variables on
the left hand side and stratification variables on the right.
The first variable on the right is the one that will vary the
fastest, forming the 
groups 
see above 
data 
an optional data frame 
subset 
an optional subsetting expression or logical vector 
na.action 
specifies a function to possibly subset the data
according to 
qlim 
the outer quantiles to use for scaling each panel in bpplotM
vnames 
default is to use variable 
cex.strip 
character size for panel strip labels 
outerlabels 
if 
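The probs argument described above takes only quantiles below 0.5, with mirror images added automatically. A minimal sketch of that mirroring, using only base R and assuming the default probs from the usage section, might look like the following. The name all.probs is hypothetical, and this is not the function's internal code.

```r
# Sketch: expand lower-tail probs into the full mirrored set of quantiles.
# Illustrative only; panel.bpplot does this internally.
set.seed(13)
x <- rnorm(1000)

probs <- c(.05, .125, .25, .375)        # default from the usage above

# Each lower quantile p is paired with its mirror image 1 - p
all.probs <- c(probs, rev(1 - probs))

# Sample quantiles at the box "corners"
q <- quantile(x, all.probs)
print(round(q, 2))
```

The resulting set of probabilities matches the quantiles listed in the comment on the cex.means example.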
Frank Harrell
Department of Biostatistics
Vanderbilt University School of Medicine
fh@fharrell.com
Esty WW, Banfield J: The box-percentile plot. J Statistical Software 8 No. 17, 2003.
bpplot, panel.bwplot, scat1d, quantile, Ecdf, summaryP, useOuterStrips
set.seed(13)
x <- rnorm(1000)
g <- sample(1:6, 1000, replace=TRUE)
x[g==1][1:20] <- rnorm(20)+3   # contaminate 20 x's for group 1

# default trellis box plot
require(lattice)
bwplot(g ~ x)

# box-percentile plot with data density (rug plot)
bwplot(g ~ x, panel=panel.bpplot, probs=seq(.01,.49,by=.01), datadensity=TRUE)
# add ,scat1d.opts=list(tfrac=1) to make all tick marks the same size
# when a group has > 125 observations

# small dot for means, show only .05,.125,.25,.375,.625,.75,.875,.95 quantiles
bwplot(g ~ x, panel=panel.bpplot, cex.means=.3)

# suppress means and reference lines for lower and upper quartiles
bwplot(g ~ x, panel=panel.bpplot, probs=c(.025,.1,.25), means=FALSE, qref=FALSE)

# continuous plot up until quartiles ("Tootsie Roll plot")
bwplot(g ~ x, panel=panel.bpplot, probs=seq(.01,.25,by=.01))

# start at quartiles then make it continuous ("coffin plot")
bwplot(g ~ x, panel=panel.bpplot, probs=seq(.25,.49,by=.01))

# same as previous but add a spike to give 0.95 interval
bwplot(g ~ x, panel=panel.bpplot, probs=c(.025,seq(.25,.49,by=.01)))

# decile plot with reference lines at outer quintiles and median
bwplot(g ~ x, panel=panel.bpplot, probs=c(.1,.2,.3,.4), qref=c(.5,.2,.8))

# default plot with tick marks showing all observations outside the outer
# box (.05 and .95 quantiles), with very small ticks
bwplot(g ~ x, panel=panel.bpplot, nout=.05, scat1d.opts=list(frac=.01))

# show 5 smallest and 5 largest observations
bwplot(g ~ x, panel=panel.bpplot, nout=5)

# Use a scat1d option (preserve=TRUE) to ensure that the right peak extends
# to the same position as the extreme scat1d
bwplot(~x , panel=panel.bpplot, probs=seq(.00,.5,by=.001),
       datadensity=TRUE, scat1d.opt=list(preserve=TRUE))

# Add an extended box plot to an existing base graphics plot
plot(x, 1:length(x))
panel.bpplot(x, 1070, nogrid=TRUE, pch=19, height=15, cex.means=.5)

# Draw a prototype showing how to interpret the plots
bpplt()

# Example for bpplotM
set.seed(1)
n <- 800
d <- data.frame(treatment=sample(c('a','b'), n, TRUE),
                sex=sample(c('female','male'), n, TRUE),
                age=rnorm(n, 40, 10),
                bp =rnorm(n, 120, 12),
                wt =rnorm(n, 190, 30))
label(d$bp) <- 'Systolic Blood Pressure'
units(d$bp) <- 'mmHg'
bpplotM(age + bp + wt ~ treatment, data=d)
bpplotM(age + bp + wt ~ treatment * sex, data=d, cex.strip=.8)
bpplotM(age + bp + wt ~ treatment*sex, data=d,
        violin=TRUE,
        violin.opts=list(col=adjustcolor('blue', alpha.f=.15),
                         border=FALSE))
bpplotM(c('age', 'bp', 'wt'), groups='treatment', data=d)
# Can use Hmisc Cs function, e.g. Cs(age, bp, wt)
bpplotM(age + bp + wt ~ treatment, data=d, nloc='left')

# Without treatment:
bpplotM(age + bp + wt ~ 1, data=d)

## Not run: 
# Automatically find all variables that appear to be continuous
getHdata(support)
bpplotM(data=support, group='dzgroup',
        cex.strip=.4, cex.means=.3, cex.n=.45)

# Separate displays for categorical vs. continuous baseline variables
getHdata(pbc)
pbc <- upData(pbc, moveUnits=TRUE)

s <- summaryM(stage + sex + spiders ~ drug, data=pbc)
plot(s)
Key(0, .5)
s <- summaryP(stage + sex + spiders ~ drug, data=pbc)
plot(s, val ~ freq | var, groups='drug', pch=1:3, col=1:3,
     key=list(x=.6, y=.8))

bpplotM(bili + albumin + protime + age ~ drug, data=pbc)

## End(Not run)