stat_ecdf | R Documentation |
The empirical cumulative distribution function (ECDF) provides an alternative
visualisation of distribution. Compared to other visualisations that rely on
density (like geom_histogram()), the ECDF doesn't require any
tuning parameters and handles both continuous and categorical variables.
The downside is that it requires more training to accurately interpret,
and the underlying visual tasks are somewhat more challenging.
stat_ecdf(
  mapping = NULL,
  data = NULL,
  geom = "step",
  position = "identity",
  ...,
  n = NULL,
  pad = TRUE,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
The geometric object to use to display the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
n |
If NULL, do not interpolate. If not NULL, this is the number of points to interpolate with. |
pad |
If |
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
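The n and pad arguments can be tried out as follows. This is a small sketch rather than a reference example: the data frame df and its column value are made-up names, and n = 25 is an arbitrary choice.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(value = rnorm(200))

# Evaluate the ECDF at 25 points instead of at every observed value,
# and do not extend the curve to -Inf and Inf.
p_coarse <- ggplot(df, aes(value)) +
  stat_ecdf(n = 25, pad = FALSE)

# Default behaviour for comparison: no interpolation, padded ends.
p_default <- ggplot(df, aes(value)) +
  stat_ecdf()
```

Printing p_coarse and p_default side by side shows the coarser step pattern and the missing horizontal tails.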
The statistic relies on the aesthetics assignment to guess which variable to use as the input and which to use as the output. Either x or y must be provided and one of them must be unused. The ECDF will be calculated on the given aesthetic and will be output on the unused one.
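As a brief illustration of the paragraph above, the sketch below requests the same ECDF in both orientations. It assumes a ggplot2 version in which stat_ecdf() accepts a mapping of y alone; df and value are made-up names used only for this example.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(value = rnorm(50))

# Input mapped to x: the ECDF is output on the y axis.
p_x <- ggplot(df, aes(x = value)) +
  stat_ecdf()

# Input mapped to y: the ECDF is output on the x axis.
p_y <- ggplot(df, aes(y = value)) +
  stat_ecdf()
```

Print p_x or p_y to draw the corresponding plot.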
If the weight aesthetic is provided, a weighted ECDF will be computed. In
this case, the ECDF is incremented by weight / sum(weight) instead of
1 / length(x) for each observation.
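The increment rule can be checked by hand in base R. The following is a minimal sketch, not ggplot2's internal implementation: weighted_ecdf() is a hypothetical helper introduced only for illustration, and it assumes non-negative weights with no missing values.

```r
# Hypothetical helper: returns a step function
# F(q) = sum(w[x <= q]) / sum(w).
weighted_ecdf <- function(x, w = rep(1, length(x))) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  ord <- order(x)
  x <- x[ord]
  w <- w[ord]
  # Each observation raises the curve by weight / sum(weight).
  cum <- cumsum(w) / sum(w)
  function(q) {
    # Number of sorted x values that are <= each query point.
    idx <- findInterval(q, x)
    c(0, cum)[idx + 1]
  }
}

# Same data as the weighted example on this page.
weighted <- data.frame(x = 1:10, weights = c(1:5, 5:1))
plain <- rep(weighted$x, weighted$weights)

f_weighted <- weighted_ecdf(weighted$x, weighted$weights)
f_plain <- stats::ecdf(plain)

# Integer weights behave like repeated observations.
all.equal(f_weighted(0:11), f_plain(0:11))
```

With equal weights the helper reduces to the usual 1 / length(x) increment, which is why the unweighted and weighted curves coincide when the weights are simply repetition counts.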
stat_ecdf() understands the following aesthetics (required aesthetics are marked):
x or y (required)
group
weight
Learn more about setting these aesthetics in vignette("ggplot2-specs").
These are calculated by the 'stat' part of layers and can be accessed with delayed evaluation.
After calculation, weights of individual observations (if supplied) are no longer available.
library(ggplot2)

set.seed(1)
df <- data.frame(
  x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
  g = gl(2, 100)
)
ggplot(df, aes(x)) +
  stat_ecdf(geom = "step")

# Don't go to positive/negative infinity
ggplot(df, aes(x)) +
  stat_ecdf(geom = "step", pad = FALSE)

# Multiple ECDFs
ggplot(df, aes(x, colour = g)) +
  stat_ecdf()

# Using a weighted ECDF
weighted <- data.frame(x = 1:10, weights = c(1:5, 5:1))
plain <- data.frame(x = rep(weighted$x, weighted$weights))

ggplot(plain, aes(x)) +
  stat_ecdf(linewidth = 1) +
  stat_ecdf(
    aes(weight = weights),
    data = weighted, colour = "green"
  )