#' Histograms and frequency polygons
#'
#' Visualise the distribution of a single continuous variable by dividing
#' the x axis into bins and counting the number of observations in each bin.
#' Histograms (`geom_histogram()`) display the counts with bars; frequency
#' polygons (`geom_freqpoly()`) display the counts with lines. Frequency
#' polygons are more suitable when you want to compare the distribution
#' across the levels of a categorical variable.
#'
#' `stat_bin()` is suitable only for continuous x data. If your x data is
#' discrete, you probably want to use [stat_count()].
#'
#' By default, the underlying computation (`stat_bin()`) uses 30 bins;
#' this is not a good default, but the idea is to get you experimenting with
#' different numbers of bins. You can also experiment with modifying the
#' `binwidth` together with the `center` or `boundary` arguments. `binwidth`
#' overrides `bins`, so you should change only one of them at a time. You may
#' need to look at a few options to uncover the full story behind your data.
#'
#' By default, the _height_ of the bars represents the counts within each bin.
#' However, there are situations where this behavior might produce misleading
#' plots (e.g., when non-equal-width bins are used), in which case it might be
#' preferable to have the _area_ of the bars represent the counts (by setting
#' `aes(y = after_stat(count / width))`). See the example below.
#'
#' In addition to `geom_histogram()`, you can create a histogram plot by using
#' `scale_x_binned()` with [geom_bar()]. This method by default plots tick marks
#' in between each bar.
#'
#' @eval rd_orientation()
#'
#' @section Aesthetics:
#' `geom_histogram()` uses the same aesthetics as [geom_bar()];
#' `geom_freqpoly()` uses the same aesthetics as [geom_line()].
#'
#' @export
#' @inheritParams layer
#' @inheritParams geom_bar
#' @param geom,stat Use to override the default connection between
#' `geom_histogram()`/`geom_freqpoly()` and `stat_bin()`. For more information
#' on overriding these connections, see how the [stat][layer_stats] and
#' [geom][layer_geoms] arguments work.
#' @examples
#' ggplot(diamonds, aes(carat)) +
#'   geom_histogram()
#' ggplot(diamonds, aes(carat)) +
#'   geom_histogram(binwidth = 0.01)
#' ggplot(diamonds, aes(carat)) +
#'   geom_histogram(bins = 200)
#' # Map values to y to flip the orientation
#' ggplot(diamonds, aes(y = carat)) +
#'   geom_histogram()
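#'
#' # A minimal sketch of how `boundary` and `center` shift the bins while
#' # `binwidth` stays fixed. The values 0.1 and 0 are illustrative choices,
#' # not recommendations. With `boundary = 0`, bin edges fall on multiples
#' # of 0.1; with `center = 0`, bin centers fall on multiples of 0.1 instead.
#' # Specify only one of `boundary` and `center` at a time.
#' ggplot(diamonds, aes(carat)) +
#'   geom_histogram(binwidth = 0.1, boundary = 0)
#' ggplot(diamonds, aes(carat)) +
#'   geom_histogram(binwidth = 0.1, center = 0)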
#'
#' # For histograms with tick marks between each bin, use `geom_bar()` with
#' # `scale_x_binned()`.
#' ggplot(diamonds, aes(carat)) +
#'   geom_bar() +
#'   scale_x_binned()
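#'
#' # A short illustration of the discrete case: `cut` is a factor, so
#' # binning does not apply. `geom_bar()` (which uses `stat_count()` by
#' # default) counts the observations at each level instead.
#' ggplot(diamonds, aes(cut)) +
#'   geom_bar()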
#'
#' # Rather than stacking histograms, it's easier to compare frequency
#' # polygons
#' ggplot(diamonds, aes(price, fill = cut)) +
#'   geom_histogram(binwidth = 500)
#' ggplot(diamonds, aes(price, colour = cut)) +
#'   geom_freqpoly(binwidth = 500)
#'
#' # To make it easier to compare distributions with very different counts,
#' # put density on the y axis instead of the default count
#' ggplot(diamonds, aes(price, after_stat(density), colour = cut)) +
#'   geom_freqpoly(binwidth = 500)
#'
#'
#' # When using non-equal-width bins, we should set the area of the bars to
#' # represent the counts (not the height).
#' # Here we're using 10 equi-probable bins:
#' price_bins <- quantile(diamonds$price, probs = seq(0, 1, length.out = 11))
#'
#' ggplot(diamonds, aes(price)) +
#'   geom_histogram(breaks = price_bins, color = "black") # misleading (height = count)
#'
#' ggplot(diamonds, aes(price, after_stat(count / width))) +
#'   geom_histogram(breaks = price_bins, color = "black") # area = count
#'
#' if (require("ggplot2movies")) {
#' # Often we don't want the height of the bar to represent the
#' # count of observations, but the sum of some other variable.
#' # For example, the following plot shows the number of movies
#' # in each rating.
#' m <- ggplot(movies, aes(rating))
#' m + geom_histogram(binwidth = 0.1)
#'
#' # If, however, we want to see the number of votes cast in each
#' # category, we need to weight by the votes variable
#' m +
#'   geom_histogram(aes(weight = votes), binwidth = 0.1) +
#'   ylab("votes")
#'
#' # For transformed scales, binwidth applies to the transformed data.
#' # The bins have constant width on the transformed scale.
#' m +
#'   geom_histogram() +
#'   scale_x_log10()
#' m +
#'   geom_histogram(binwidth = 0.05) +
#'   scale_x_log10()
#'
#' # For transformed coordinate systems, the binwidth applies to the
#' # raw data. The bins have constant width on the original scale.
#'
#' # Using log scales does not work here, because the first
#' # bar is anchored at zero, and so when transformed becomes negative
#' # infinity. This is not a problem when transforming the scales, because
#' # no observations have 0 ratings.
#' m +
#'   geom_histogram(boundary = 0) +
#'   coord_transform(x = "log10")
#' # Use boundary = 0 to make sure we don't take the square root of negative values
#' m +
#'   geom_histogram(boundary = 0) +
#'   coord_transform(x = "sqrt")
#'
#' # You can also transform the y axis. Remember that the base of the bars
#' # has value 0, so log transformations are not appropriate
#' m <- ggplot(movies, aes(x = rating))
#' m +
#'   geom_histogram(binwidth = 0.5) +
#'   scale_y_sqrt()
#' }
#'
#' # You can specify a function for calculating binwidth, which is
#' # particularly useful when faceting along variables with
#' # different ranges because the function will be called once per facet
#' ggplot(economics_long, aes(value)) +
#'   facet_wrap(~variable, scales = 'free_x') +
#'   geom_histogram(binwidth = \(x) 2 * IQR(x) / (length(x)^(1/3)))
geom_histogram <- make_constructor(
  GeomBar, stat = "bin", position = "stack",
  # Passed to bin stat:
  binwidth = NULL, bins = NULL, orientation = NA
)