View source: R/scat1d.s

histboxp    R Documentation

Use plotly to Draw Stratified Spike Histogram and Box Plot Statistics


Description

Uses plotly to draw horizontal spike histograms stratified by group, plus the mean (solid dot) and vertical bars for these quantiles: 0.05 (red, short), 0.25 (blue, medium), 0.50 (black, long), 0.75 (blue, medium), and 0.95 (red, short). The robust dispersion measure Gini's mean difference and the SD may optionally be added. Each is shown as a horizontal line that starts at the minimum value of x and has a length equal to the mean difference or SD. Even when Gini's mean difference and the SD are computed, they are not drawn unless the user clicks on their legend entries.
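The statistics marked on each histogram can be approximated in base R. The sketch below is illustrative only: box_stats_sketch is a hypothetical helper, not part of Hmisc, and it assumes R's default quantile type and the pairwise definition of Gini's mean difference (the mean absolute difference over all pairs of distinct observations). The quantile estimator used by histboxp may differ.

```r
# Hypothetical helper (not part of Hmisc): statistics like those
# histboxp marks on each spike histogram.
box_stats_sketch <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  # Gini's mean difference: mean of |x_i - x_j| over all pairs with i != j
  gmd <- if (n > 1) sum(abs(outer(x, x, "-"))) / (n * (n - 1)) else NA_real_
  list(
    mean      = mean(x),
    quantiles = quantile(x, probs = c(0.05, 0.25, 0.50, 0.75, 0.95)),
    gmd       = gmd,
    sd        = sd(x),
    # A dispersion line starts at min(x) and has length equal to the measure
    gmd_line  = c(start = min(x), end = min(x) + gmd)
  )
}

set.seed(1)
s <- box_stats_sketch(rnorm(200, mean = 6, sd = 1))
str(s)
```

The outer() call builds an n by n matrix, so this form is only reasonable for moderate n.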

Spike histograms have the advantage of effectively showing the raw data for both small and huge datasets and, unlike box plots, make multi-modality easy to see.

histboxpM plots multiple histograms stacked vertically, one for each variable in a data frame, with a common group variable (if any). The plots are combined using plotly::subplot.

dhistboxp is like histboxp, but no plotly graphics are actually drawn. Instead, a data frame suitable for use with plotlyM is returned. dhistboxp implements an additional level of stratification, strata. Here group behaves differently: it produces back-to-back histograms (in the case of two groups) for each level of strata.


Usage

histboxp(p = plotly::plot_ly(height=height), x, group = NULL,
         xlab=NULL, gmd=TRUE, sd=FALSE, bins = 100, wmax=190, mult=7,
         connect=TRUE, showlegend=TRUE)

dhistboxp(x, group = NULL, strata=NULL, xlab=NULL, 
          gmd=FALSE, sd=FALSE, bins = 100, nmin=5, ff1=1, ff2=1)

histboxpM(p=plotly::plot_ly(height=height, width=width), x, group=NULL,
          gmd=TRUE, sd=FALSE, width=NULL, nrows=NULL, ncols=NULL, ...)



Arguments

p

plotly graphics object if already begun


x

a numeric vector, or for histboxpM a numeric vector or a data frame of numeric vectors, hopefully with label and units attributes


group

a discrete grouping variable. If omitted, defaults to a vector of ones.


strata

a discrete numeric stratification variable. Values are also used to space out different spike histograms. Defaults to a vector of ones.


xlab

x-axis label; defaults to the labelled version of x, including units of measurement if any


gmd

set to FALSE to not compute Gini's mean difference


sd

set to TRUE to compute the SD


width

width in pixels


nrows

number of rows for layout of multiple plots


ncols

number of columns for layout of multiple plots. At most one of nrows and ncols should be specified.


bins

number of equal-width bins to use for spike histogram. If the number of distinct values of x is less than bins, the actual values of x are used.


nmin

minimum number of non-missing observations for a group-stratum combination before the spike histogram and quantiles are drawn

ff1, ff2

fudge factors for position and bar length for spike histograms

wmax, mult

tweaks for margin to allocate


connect

set to FALSE to suppress lines connecting quantiles


showlegend

used if producing multiple plots to be combined with subplot; set to FALSE for all but one plot


...

other arguments for histboxpM that are passed to histboxp
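The rule described for bins can be illustrated with a short base R sketch. This is an assumption-laden illustration, not the code Hmisc uses internally: spike_counts_sketch is a hypothetical helper, and the choice of bin midpoints as spike positions is an assumption made for this example.

```r
# Hypothetical helper, not Hmisc's internal code: spike positions and
# counts for a spike histogram of x.
spike_counts_sketch <- function(x, bins = 100) {
  x <- x[!is.na(x)]
  if (length(unique(x)) >= bins) {
    # Enough distinct values: map each value to the midpoint of one of
    # `bins` equal-width intervals spanning the range of x
    rng   <- range(x)
    width <- diff(rng) / bins
    idx   <- pmin(floor((x - rng[1]) / width), bins - 1)  # max(x) goes in last bin
    x     <- rng[1] + (idx + 0.5) * width
  }
  # Fewer distinct values than bins: the actual values of x are kept as is
  ux <- sort(unique(x))
  data.frame(x = ux, count = tabulate(match(x, ux), nbins = length(ux)))
}

spike_counts_sketch(c(1, 1, 2, 3, 3, 3), bins = 100)        # actual values used
head(spike_counts_sketch(seq(0, 1, length.out = 1000), 10)) # binned to midpoints
```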


Value

a plotly object. For dhistboxp, a data frame as expected by plotlyM.


Author(s)

Frank Harrell

See Also

histSpike, plot.describe, scat1d


Examples

## Not run: 
dist <- c(rep(1, 500), rep(2, 250), rep(3, 600))
Distribution <- factor(dist, 1 : 3, c('Unimodal', 'Bimodal', 'Trimodal'))
x <- c(rnorm(500, 6, 1),
       rnorm(200, 3, .7), rnorm(50, 7, .4),
       rnorm(200, 2, .7), rnorm(300, 5.5, .4), rnorm(100, 8, .4))
histboxp(x=x, group=Distribution, sd=TRUE)
X <- data.frame(x, x2=runif(length(x)))
histboxpM(x=X, group=Distribution, ncols=2)  # separate plots

## End(Not run)
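A further sketch shows how showlegend might be used when combining separately built plots with plotly::subplot, as described for that argument. The variable names and simulated data are made up for illustration, and the code assumes that both Hmisc and plotly are installed. histboxpM is the more direct route when the variables sit in one data frame.

```r
if (requireNamespace("Hmisc", quietly = TRUE) &&
    requireNamespace("plotly", quietly = TRUE)) {
  set.seed(2)
  g  <- factor(sample(c("a", "b"), 300, replace = TRUE))
  x1 <- rnorm(300)
  x2 <- runif(300)
  # Keep the legend on the first plot only so that entries are not duplicated
  p1 <- Hmisc::histboxp(x = x1, group = g, showlegend = TRUE)
  p2 <- Hmisc::histboxp(x = x2, group = g, showlegend = FALSE)
  # Stack the two plots vertically, keeping each x-axis title
  combined <- plotly::subplot(p1, p2, nrows = 2, titleX = TRUE)
}
```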

harrelfe/Hmisc documentation built on May 19, 2024, 4:13 a.m.