type_density | R Documentation |
Type function for density plots.
type_density(
bw = "nrd0",
joint.bw = c("mean", "full", "none"),
adjust = 1,
kernel = c("gaussian", "epanechnikov", "rectangular", "triangular", "biweight",
"cosine", "optcosine"),
n = 512,
alpha = NULL
)
bw |
the smoothing bandwidth to be used. The kernels are scaled such that this is the standard deviation of the smoothing kernel. (Note this differs from the reference books cited below.)
The specified (or computed) value of |
joint.bw |
character string indicating whether (and how) the smoothing
bandwidth should be computed from the joint data distribution when there
are multiple subgroups. The options are |
adjust |
the bandwidth used is actually |
kernel |
a character string giving the smoothing kernel to be used. This
must partially match one of
|
n |
the number of equally spaced points at which the density is
to be estimated. When |
alpha |
numeric value between 0 and 1 specifying the opacity of ribbon shading
If no |
The algorithm used in density.default
disperses the mass of the
empirical distribution function over a regular grid of at least 512
points and then uses the fast Fourier transform to convolve this
approximation with a discretized version of the kernel and then uses
linear approximation to evaluate the density at the specified points.
The statistical properties of a kernel are determined by
\sigma^2_K = \int t^2 K(t) dt
which is always = 1
for our kernels (and hence the bandwidth
bw
is the standard deviation of the kernel) and
R(K) = \int K^2(t) dt
.
MSE-equivalent bandwidths (for different kernels) are proportional to
\sigma_K R(K)
which is scale invariant and for our
kernels equal to R(K)
. This value is returned when
give.Rkern = TRUE
. See the examples for using exact equivalent
bandwidths.
Infinite values in x
are assumed to correspond to a point mass at
+/-Inf
and the density estimate is of the sub-density on
(-Inf, +Inf)
.
While the choice of smoothing bandwidth will
always stand to affect a density visualization, it gains an added
importance when multiple densities are drawn simultaneously (e.g., for
subgroups with respect to by
or facet
). Allowing each subgroup to
compute its own separate bandwidth independently offers greater flexibility
in capturing the unique characteristics of each subgroup, particularly when
distributions differ substantially in location and/or scale. However, this
approach may overemphasize small random variations and make it harder to
visually compare densities across subgroups. Hence, it is often useful to
employ the same ("joint") bandwidth across all subgroups. The following
strategies are available via the joint.bw
argument:
The default joint.bw = "mean"
first computes the individual bandwidths
for each group but then computes their mean, weighted by the number of
observations in each group. This will work well when all groups have
similar amounts of scatter (similar variances), even when they have
potentially rather different locations. The weighted averaging stabilizes
potential fluctuations in the individual bandwidths, especially when some
subgroups are rather small.
Alternatively, joint.bw = "full"
can be used to compute the joint
bandwidth from the full joint distribution (merging all groups). This will
yield an even more robust bandwidth, especially when the groups overlap
substantially (i.e., have similar locations and scales). However, it may
lead to too large bandwidths and thus too much smoothing, especially when
the locations of the groups differ substantially.
Finally, joint.bw = "none"
disables the joint bandwidth so that each
group just employs its individual bandwidth. This is often the best choice
if the amounts of scatter differ substantially between the groups, thus
necessitating different amounts of smoothing.
This tinyplot method for density plots differs from the base
plot.density
function in its treatment of titles. The
x-axis title displays only the variable name, omitting details about the
number of observations and smoothing bandwidth. Additionally, the main
title is left blank by default for a cleaner appearance.
# "density" type convenience string
tinyplot(~Sepal.Length, data = iris, type = "density")
# grouped density example
tinyplot(~Sepal.Length | Species, data = iris, type = "density")
# use `bg = "by"` (or, equivalent `fill = "by"`) to get filled densities
tinyplot(~Sepal.Length | Species, data = iris, type = "density", fill = "by")
# use `type_density()` to pass extra arguments for customization
tinyplot(
~Sepal.Length | Species, data = iris,
type = type_density(bw = "SJ"),
main = "Bandwidth computed using Sheather & Jones (1991)"
)
# The default for grouped density plots is to use the mean of the
# individual subgroup bandwidths (weighted by group size) as the
# joint bandwidth. Alternatively, the bandwidth from the "full"
# data or separate individual bandwidths ("none") can be used.
tinyplot(~Sepal.Length | Species, data = iris,
ylim = c(0, 1.25), type = "density") # mean (default)
tinyplot_add(joint.bw = "full", lty = 2) # full data
tinyplot_add(joint.bw = "none", lty = 3) # none (individual)
legend("topright", c("Mean", "Full", "None"), lty = 1:3, bty = "n", title = "Joint BW")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.