stat_slabinterval | R Documentation |
"Meta" stat for computing distribution functions (densities or CDFs) + intervals for use with geom_slabinterval(). Useful for creating eye plots, half-eye plots, CCDF bar plots, gradient plots, histograms, and more. Sample data can be supplied to the x and y aesthetics, or analytical distributions (in a variety of formats) can be supplied to the xdist and ydist aesthetics. See Details.
stat_slabinterval(
  mapping = NULL,
  data = NULL,
  geom = "slabinterval",
  position = "identity",
  ...,
  p_limits = c(NA, NA),
  density = "bounded",
  adjust = waiver(),
  trim = TRUE,
  expand = FALSE,
  breaks = waiver(),
  align = "none",
  outline_bars = FALSE,
  point_interval = "median_qi",
  slab_type = NULL,
  limits = NULL,
  n = 501,
  .width = c(0.66, 0.95),
  orientation = NA,
  na.rm = FALSE,
  show.legend = c(size = FALSE),
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
Use to override the default connection between
|
position |
Position adjustment, either as a string, or the result of a call to a position adjustment function.
Setting this equal to |
... |
Other arguments passed to
|
p_limits |
Probability limits (as a vector of size 2) used to determine the lower and upper
limits of theoretical distributions (distributions from samples ignore this parameter and determine
their limits based on the limits of the sample). E.g., if this is |
density |
Density estimator for sample data. One of:
|
adjust |
Passed to |
trim |
For sample data, should the density estimate be trimmed to the range of the
data? Passed on to the density estimator; see the |
expand |
For sample data, should the slab be expanded to the limits of the scale? Default |
breaks |
Determines the breakpoints defining bins. Defaults to
For example, |
align |
Determines how to align the breakpoints defining bins. Default
(
For example, |
outline_bars |
For sample data (if |
point_interval |
A function from the |
slab_type |
(deprecated) The type of slab function to calculate: probability density (or mass) function
( |
limits |
Manually-specified limits for the slab, as a vector of length two. These limits are combined with those
computed based on |
n |
Number of points at which to evaluate the function that defines the slab. |
.width |
The |
orientation |
Whether this geom is drawn horizontally or vertically. One of:
For compatibility with the base ggplot naming scheme for |
na.rm |
If |
show.legend |
Should this layer be included in the legends? Default is |
inherit.aes |
If |
A highly configurable stat for generating a variety of plots that combine a "slab" that describes a distribution plus a point summary and any number of intervals. Several "shortcut" stats are provided which combine multiple options to create useful geoms, particularly eye plots (a violin plot of density plus interval), half-eye plots (a density plot plus interval), CCDF bar plots (a complementary CDF plus interval), and gradient plots (a density encoded in color alpha plus interval).
The shortcut stats include:
stat_eye(): Eye plots (violin + interval)
stat_halfeye(): Half-eye plots (density + interval)
stat_ccdfinterval(): CCDF bar plots (CCDF + interval)
stat_cdfinterval(): CDF bar plots (CDF + interval)
stat_gradientinterval(): Density gradient + interval plots
stat_slab(): Density plots
stat_histinterval(): Histogram + interval plots
stat_pointinterval(): Point + interval plots
stat_interval(): Interval plots
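As a minimal sketch of how the shortcut stats relate to one another, the same sample data can be passed to several of them. The data frame draws and the object names below are hypothetical and chosen only for illustration; the sketch assumes the ggplot2 and ggdist packages are installed.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data: 500 draws for each of two groups.
draws <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, mean = 5, sd = 1), rnorm(500, mean = 7, sd = 1.5))
)

# Samples on x, groups on y, so the geometries are drawn horizontally.
base <- ggplot(draws, aes(x = value, y = group))

p_halfeye  <- base + stat_halfeye()       # density + interval
p_eye      <- base + stat_eye()           # violin + interval
p_ccdf     <- base + stat_ccdfinterval()  # CCDF bar + interval
p_interval <- base + stat_interval()      # intervals only
```

Printing any of these objects (e.g. p_halfeye) draws the corresponding plot.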
To visualize sample data, such as a data distribution, samples from a bootstrap distribution, or a Bayesian posterior, you can supply samples to the x or y aesthetic.
To visualize analytical distributions, you can use the xdist or ydist aesthetic. For historical reasons, you can also use dist to specify the distribution, though this is not recommended as it does not work as well with orientation detection.
These aesthetics can be used as follows:
xdist, ydist, and dist can be any distribution object from the distributional package (dist_normal(), dist_beta(), etc.) or can be a posterior::rvar() object. Since these functions are vectorized, other columns can be passed directly to them in an aes() specification; e.g. aes(dist = dist_normal(mu, sigma)) will work if mu and sigma are columns in the input data frame.
dist can be a character vector giving the distribution name. Then the arg1, ... arg9 aesthetics (or args as a list column) specify distribution arguments. Distribution names should correspond to R functions that have "p", "q", and "d" functions; e.g. "norm" is a valid distribution name because R defines the pnorm(), qnorm(), and dnorm() functions for Normal distributions.
See the parse_dist() function for a useful way to generate dist and args values from human-readable distribution specs (like "normal(0,1)"). Such specs are also produced by other packages (like the brms::get_prior function in brms); thus, parse_dist() combined with the stats described here can help you visualize the output of those functions.
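The parse_dist() workflow described above might look roughly like the following sketch. The priors data frame is hypothetical, and the sketch assumes parse_dist() is called with its default output column names (.dist and .args).

```r
library(ggplot2)
library(ggdist)

# Hypothetical human-readable distribution specs.
priors <- data.frame(
  prior = c("normal(0,1)", "normal(2,0.5)"),
  stringsAsFactors = FALSE
)

# parse_dist() adds columns holding the distribution name and its arguments
# (assumed here to be the defaults, .dist and .args).
parsed <- parse_dist(priors, prior)

# Map the parsed name and argument list onto the dist and args aesthetics.
p_priors <- ggplot(parsed, aes(y = prior, dist = .dist, args = .args)) +
  stat_halfeye()
```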
A ggplot2::Stat representing a slab or combined slab+interval geometry which can be added to a ggplot() object.
The following variables are computed by this stat and made available for use in aesthetic specifications (aes()) using the after_stat() function or the after_stat argument of stage():
x or y: For slabs, the input values to the slab function. For intervals, the point summary from the interval function. Whether it is x or y depends on orientation.
xmin or ymin: For intervals, the lower end of the interval from the interval function.
xmax or ymax: For intervals, the upper end of the interval from the interval function.
.width: For intervals, the interval width as a numeric value in [0, 1]. For slabs, the width of the smallest interval containing that value of the slab.
level: For intervals, the interval width as an ordered factor. For slabs, the level of the smallest interval containing that value of the slab.
pdf: For slabs, the probability density function (PDF). If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the PDF at the point summary; intervals also have pdf_min and pdf_max for the PDF at the lower and upper ends of the interval.
cdf: For slabs, the cumulative distribution function. If options("ggdist.experimental.slab_data_in_intervals") is TRUE: For intervals, the CDF at the point summary; intervals also have cdf_min and cdf_max for the CDF at the lower and upper ends of the interval.
n: For slabs, the number of data points summarized into that slab. If the slab was created from an analytical distribution via the xdist, ydist, or dist aesthetic, n will be Inf.
f: (deprecated) For slabs, the output values from the slab function (such as the PDF, CDF, or CCDF), determined by slab_type. Instead of using slab_type to change f and then mapping f onto an aesthetic, it is now recommended to simply map the corresponding computed variable (e.g. pdf, cdf, or 1 - cdf) directly onto the desired aesthetic.
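As a hedged illustration of mapping computed variables directly, the sketch below maps 1 - cdf onto the slab thickness (giving a CCDF-shaped slab) and level onto the slab fill. The dists data frame and the plot object names are hypothetical.

```r
library(ggplot2)
library(ggdist)
library(distributional)

# Hypothetical table of Normal distribution parameters.
dists <- data.frame(name = c("a", "b"), mu = c(0, 1), sigma = c(1, 2))

base <- ggplot(dists, aes(y = name, xdist = dist_normal(mu, sigma)))

# Slab thickness follows the complementary CDF instead of the density.
p_ccdf <- base +
  stat_slabinterval(aes(thickness = after_stat(1 - cdf)))

# Slab fill follows the smallest interval containing each slab value.
p_level <- base +
  stat_halfeye(aes(fill = after_stat(level)))
```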
The slab+interval stats and geoms have a wide variety of aesthetics that control the appearance of their three sub-geometries: the slab, the point, and the interval.
These stats support the following aesthetics:
x: x position of the geometry (when orientation = "vertical"); or sample data to be summarized (when orientation = "horizontal" with sample data).
y: y position of the geometry (when orientation = "horizontal"); or sample data to be summarized (when orientation = "vertical" with sample data).
weight: When using samples (i.e. the x and y aesthetics, not xdist or ydist), optional weights to be applied to each draw.
xdist: When using analytical distributions, distribution to map on the x axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
ydist: When using analytical distributions, distribution to map on the y axis: a distributional object (e.g. dist_normal()) or a posterior::rvar() object.
dist: When using analytical distributions, a name of a distribution (e.g. "norm"), a distributional object (e.g. dist_normal()), or a posterior::rvar() object. See Details.
args: Distribution arguments (args or arg1, ... arg9). See Details.
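A minimal sketch of the weight aesthetic with sample data follows. The draws data frame and its w column are hypothetical, and the sketch assumes an installed ggdist version that supports weight, as the list above describes.

```r
library(ggplot2)
library(ggdist)

set.seed(42)
# Hypothetical draws, each with an arbitrary positive weight.
draws <- data.frame(value = rnorm(1000), w = runif(1000))

# Samples go on x; each draw's weight is supplied via the weight aesthetic.
p_weighted <- ggplot(draws, aes(x = value, weight = w)) +
  stat_halfeye()
```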
In addition, in their default configuration (paired with geom_slabinterval()) the following aesthetics are supported by the underlying geom:
Slab-specific aesthetics
thickness: The thickness of the slab at each x value (if orientation = "horizontal") or y value (if orientation = "vertical") of the slab.
side: Which side to place the slab on. "topright", "top", and "right" are synonyms which cause the slab to be drawn on the top or the right depending on if orientation is "horizontal" or "vertical". "bottomleft", "bottom", and "left" are synonyms which cause the slab to be drawn on the bottom or the left depending on if orientation is "horizontal" or "vertical". "topleft" causes the slab to be drawn on the top or the left, and "bottomright" causes the slab to be drawn on the bottom or the right. "both" draws the slab mirrored on both sides (as in a violin plot).
scale: What proportion of the region allocated to this geom to use to draw the slab. If scale = 1, slabs that use the maximum range will just touch each other. Default is 0.9 to leave some space between adjacent slabs. For a comprehensive discussion and examples of slab scaling and normalization, see the thickness scale article.
justification: Justification of the interval relative to the slab, where 0 indicates bottom/left justification and 1 indicates top/right justification (depending on orientation). If justification is NULL (the default), then it is set automatically based on the value of side: when side is "top"/"right" justification is set to 0, when side is "bottom"/"left" justification is set to 1, and when side is "both" justification is set to 0.5.
datatype: When using composite geoms directly without a stat (e.g. geom_slabinterval()), datatype is used to indicate which part of the geom a row in the data targets: rows with datatype = "slab" target the slab portion of the geometry and rows with datatype = "interval" target the interval portion of the geometry. This is set automatically when using ggdist stats.
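The side and scale aesthetics described above can be set as constants on the layer. The following is only a sketch on hypothetical sample data (draws); the object names are illustrative.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data for two groups.
draws <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, mean = 5, sd = 1), rnorm(500, mean = 7, sd = 1.5))
)

base <- ggplot(draws, aes(x = value, y = group))

# Slab mirrored on both sides of the interval, as in a violin plot.
p_both <- base + stat_slabinterval(side = "both")

# Slab drawn below the interval, using half of the allocated region.
p_bottom <- base + stat_slabinterval(side = "bottom", scale = 0.5)
```

Because justification is left at its default here, it is chosen automatically from side, as described in the justification entry.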
Interval-specific aesthetics
xmin: Left end of the interval sub-geometry (if orientation = "horizontal").
xmax: Right end of the interval sub-geometry (if orientation = "horizontal").
ymin: Lower end of the interval sub-geometry (if orientation = "vertical").
ymax: Upper end of the interval sub-geometry (if orientation = "vertical").
Point-specific aesthetics
shape: Shape type used to draw the point sub-geometry.
Color aesthetics
colour: (or color) The color of the interval and point sub-geometries. Use the slab_color, interval_color, or point_color aesthetics (below) to set sub-geometry colors separately.
fill: The fill color of the slab and point sub-geometries. Use the slab_fill or point_fill aesthetics (below) to set sub-geometry colors separately.
alpha: The opacity of the slab, interval, and point sub-geometries. Use the slab_alpha, interval_alpha, or point_alpha aesthetics (below) to set sub-geometry opacities separately.
colour_ramp: (or color_ramp) A secondary scale that modifies the color scale to "ramp" to another color. See scale_colour_ramp() for examples.
fill_ramp: A secondary scale that modifies the fill scale to "ramp" to another color. See scale_fill_ramp() for examples.
Line aesthetics
linewidth: Width of the line used to draw the interval (except with geom_slab(): then it is the width of the slab). With composite geometries including an interval and slab, use slab_linewidth to set the line width of the slab (see below). For intervals, raw linewidth values are transformed according to the interval_size_domain and interval_size_range parameters of the geom (see above).
size: Determines the size of the point. If linewidth is not provided, size also determines the width of the line used to draw the interval (this allows line width and point size to be modified together by setting only size and not linewidth). Raw size values are transformed according to the interval_size_domain, interval_size_range, and fatten_point parameters of the geom (see above). Use the point_size aesthetic (below) to set sub-geometry size directly without applying the effects of interval_size_domain, interval_size_range, and fatten_point.
stroke: Width of the outline around the point sub-geometry.
linetype: Type of line (e.g., "solid", "dashed", etc.) used to draw the interval and the outline of the slab (if it is visible). Use the slab_linetype or interval_linetype aesthetics (below) to set sub-geometry line types separately.
Slab-specific color and line override aesthetics
slab_fill: Override for fill: the fill color of the slab.
slab_colour: (or slab_color) Override for colour/color: the outline color of the slab.
slab_alpha: Override for alpha: the opacity of the slab.
slab_linewidth: Override for linewidth: the width of the outline of the slab.
slab_linetype: Override for linetype: the line type of the outline of the slab.
Interval-specific color and line override aesthetics
interval_colour: (or interval_color) Override for colour/color: the color of the interval.
interval_alpha: Override for alpha: the opacity of the interval.
interval_linetype: Override for linetype: the line type of the interval.
Point-specific color and line override aesthetics
point_fill: Override for fill: the fill color of the point.
point_colour: (or point_color) Override for colour/color: the outline color of the point.
point_alpha: Override for alpha: the opacity of the point.
point_size: Override for size: the size of the point.
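The override aesthetics listed above can be used to style each sub-geometry separately. The sketch below sets several of them as constants; the data frame draws and the specific colors and sizes are arbitrary, hypothetical choices.

```r
library(ggplot2)
library(ggdist)

set.seed(1234)
# Hypothetical sample data for two groups.
draws <- data.frame(
  group = rep(c("a", "b"), each = 500),
  value = c(rnorm(500, mean = 5, sd = 1), rnorm(500, mean = 7, sd = 1.5))
)

p_styled <- ggplot(draws, aes(x = value, y = group)) +
  stat_halfeye(
    slab_fill = "skyblue",       # fill of the slab only
    slab_alpha = 0.6,            # opacity of the slab only
    interval_colour = "gray30",  # color of the interval only
    point_colour = "firebrick",  # outline color of the point only
    point_size = 3               # point size, bypassing the size transformations
  )
```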
Deprecated aesthetics
slab_size: Use slab_linewidth.
interval_size: Use interval_linewidth.
Other aesthetics (these work as in standard geoms)
width
height
group
See examples of some of these aesthetics in action in vignette("slabinterval"). Learn more about the sub-geom override aesthetics (like interval_color) in the scales documentation. Learn more about basic ggplot aesthetics in vignette("ggplot2-specs").
See geom_slabinterval() for more information on the geom these stats use by default and some of the options it has. See vignette("slabinterval") for a variety of examples of use.
library(dplyr)
library(ggplot2)
library(ggdist)
library(distributional)
theme_set(theme_ggdist())
# EXAMPLES ON SAMPLE DATA
set.seed(1234)
df = data.frame(
  group = c("a", "b", "c", "c", "c"),
  value = rnorm(2500, mean = c(5, 7, 9, 9, 9), sd = c(1, 1.5, 1, 1, 1))
)
# here are vertical eyes:
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye()
# note the sample size is not automatically incorporated into the
# area of the densities in case one wishes to plot densities against
# a reference (e.g. a prior distribution).
# But you may wish to account for sample size if using these geoms
# for something other than visualizing posteriors; in which case
# you can use after_stat(pdf*n):
df %>%
  ggplot(aes(x = group, y = value)) +
  stat_eye(aes(thickness = after_stat(pdf*n)))
# EXAMPLES ON ANALYTICAL DISTRIBUTIONS
dist_df = tribble(
  ~group, ~subgroup, ~mean, ~sd,
  "a", "h", 5, 1,
  "b", "h", 7, 1.5,
  "c", "h", 8, 1,
  "c", "i", 9, 1,
  "c", "j", 7, 1
)
# Using functions from the distributional package (like dist_normal()) with the
# xdist or ydist aesthetic can lead to more compact/expressive specifications
dist_df %>%
  ggplot(aes(x = group, ydist = dist_normal(mean, sd), fill = subgroup)) +
  stat_eye(position = "dodge")
# using the old character vector + args approach
dist_df %>%
  ggplot(aes(x = group, dist = "norm", arg1 = mean, arg2 = sd, fill = subgroup)) +
  stat_eye(position = "dodge")
# the stat_slabinterval family applies a Jacobian adjustment to densities
# when plotting on transformed scales in order to plot them correctly.
# It determines the Jacobian using symbolic differentiation if possible,
# using stats::D(). If symbolic differentiation fails, it falls back
# to numericDeriv(), which is less reliable; therefore, it is
# advisable to use scale transformation functions that are defined in
# terms of basic math functions so that their derivatives can be
# determined analytically (most of the transformation functions in the
# scales package currently have this property).
# For example, here is a log-Normal distribution plotted on the log
# scale, where it will appear Normal:
data.frame(dist = "lnorm", logmean = log(10), logsd = 2*log(10)) %>%
  ggplot(aes(y = 1, dist = dist, arg1 = logmean, arg2 = logsd)) +
  stat_halfeye() +
  scale_x_log10(breaks = 10^seq(-5,7, by = 2))
# see vignette("slabinterval") for many more examples.