position_waterfall: Stack Chart Elements on Cumulative Value

Description

A waterfall chart is a bar chart where each segment starts where the prior segment left off. This is similar to a stacked bar chart, except that the stacking does not reset across x values. It visualizes a cumulative sum. The candlestick plot is another similar chart, but it has "whiskers" and typically requires you to specify the ymin and ymax values manually.
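To make the cumulative-sum idea concrete, here is a minimal base R sketch of where each segment would start and end. It only illustrates the arithmetic and is not the package's implementation; the names `seg.start` and `seg.end` are made up for this illustration.

```r
## Illustrative only: each segment starts where the previous one ended
y <- c(3, 2, -4, 1)
seg.end <- cumsum(y)                  # 3 5 1 2
seg.start <- c(0, head(seg.end, -1))  # 0 3 5 1
data.frame(y, seg.start, seg.end)
```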

Usage

position_waterfall(width = NULL, preserve = c("total", "single"),
  reverse = FALSE, dodge = TRUE, vjust = getOption("ggbg.vjust"),
  vjust.mode = getOption("ggbg.vjust.mode"),
  signif = getOption("ggbg.signif"), y.start = 0)

Arguments

width

Dodging width, when different to the width of the individual elements. This is useful when you want to align narrow geoms with wider geoms. See the examples.

preserve

Should dodging preserve the total width of all elements at a position, or the width of a single element?

reverse

If TRUE, will reverse the default stacking order. This is useful if you're rotating both the plot and legend.

dodge

TRUE (default) or FALSE, controls how to resolve groups that overlap on the x axis. The default is to dodge them to form mini-waterfalls within each x value, but you can choose to stack them instead by setting dodge=FALSE. Negative and positive values are segregated prior to stacking so they do not overlap. Interpreting waterfall charts with stacked sub-groups is difficult when they contain negative values, so we recommend you use the default setting instead. Observations within a group that have the same x value are always stacked, so if you have both positive and negative values for any given x value you may want to consider segregating the positives and negatives into different groups.

vjust

Like the vjust parameter for ggplot2::position_stack, except that by default the direction of justification follows the direction of the bar (see vjust.mode), and the default value is 0.5 instead of 1. This only has an effect on geoms with positions, like text, points, or lines. The default setting places elements midway through the height of the corresponding waterfall step, which is convenient for labeling geom_col waterfalls. Use 1 to position at the "end" of each waterfall step. This differs from the vjust of geoms like geom_text, where vjust=1 shifts the text down, but it is consistent with what ggplot2::position_stack does.

vjust.mode

character(1L), one of "end" (default) or "top", where "top" results in the same behavior as in ggplot2::position_stack. "end" means the justification is relative to the "end" of the waterfall bar. So if a waterfall bar is heading down (i.e. negative y value), the "end" is at the bottom. If it is heading up (i.e. positive y value), the "end" is at the top. For positive y values "end" and "top" do the same thing.

signif

integer(1L) between 1 and 22, defaults to 11, corresponds to the digits parameter for signif and is used to reduce the precision of numeric x aesthetic values so that stacking is not foiled by floating point imprecision.
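The kind of imprecision this guards against can be seen in base R. The sketch below only shows why rounding to 11 significant digits helps two nearly equal x values compare as equal; how the package applies the rounding internally is not shown here.

```r
## Two x values that should be equal but differ in their last bits
x1 <- 0.1 + 0.2
x2 <- 0.3
x1 == x2                          # FALSE
signif(x1, 11) == signif(x2, 11)  # TRUE
```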

y.start

numeric(1L), defaults to 0, the starting point for the cumulative sum of y values. This can be useful if you want to combine waterfalls with other layers and need the waterfall to start at a specific value.
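As a rough illustration of the y.start argument, the sketch below offsets a cumulative sum by a starting value in base R. It assumes the offset is simply added to the running total, which is our reading of the description above rather than a statement about the package internals.

```r
## Illustrative only: the cumulative sum offset by a starting value
y <- c(3, 2, -4)
y.start <- 10
step.end <- y.start + cumsum(y)   # 13 15 11
step.end
```

With ggbg loaded, the corresponding position adjustment would be requested with `position_waterfall(y.start = 10)`.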

Details

position_waterfall creates waterfall charts when it is applied to geom_col or geom_bar. You can apply it to any geom, so long as the geom specifies a y aesthetic, and either an x aesthetic, or both xmin and xmax aesthetics. It may not make sense to apply position_waterfall to arbitrary geoms, particularly those that represent single graphical elements with multiple x/y coordinates such as geom_polygon. ymin/ymax aesthetics will be shifted by the cumulative y value.

Since stat layers are computed prior to position adjustments, you can also use position_waterfall with stats (e.g. stat_bin, see examples).

We also implement a StatWaterfall ggproto object that can be accessed within geom_* calls by specifying stat='waterfall'. Unlike typical stat ggproto objects, this one does not have a layer instantiation function (i.e. stat_waterfall does not exist). The sole purpose of the stat is to compute the ycum aesthetic that can then be used by the geom layer (see the labeling examples).
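A base R sketch of what a running total in x order looks like, using the same data as the examples. It assumes that ycum is the cumulative sum of y taken in order of x, which matches how the labeling example uses it; the code is illustrative and not taken from the package.

```r
## Illustrative only: running total computed in x order, not row order
dat <- data.frame(x = 3:1, y = 1:3)
ord <- order(dat$x)               # rows sorted by x: 3, 2, 1
ycum <- integer(nrow(dat))
ycum[ord] <- cumsum(dat$y[ord])   # put totals back in original row order
ycum                              # 6 5 3
```

The row with the smallest x (the last row) gets the first running total, and the row with the largest x gets the grand total.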

Stacking

The stacking is always computed on the y aesthetic. The order of the stacking is determined by the x aesthetic. The actual position of the objects is also affected by vjust, and you may need to change the value of vjust if you are using position_waterfall with geoms other than columns. For example, for dimensionless elements such as geom_point and geom_text, the default vjust of 0.5 leads to alignment at the midpoint between previous and subsequent values in the cumulative sequence.

If only xmin and xmax aesthetics are present the x value will be inferred as the midpoint of those two.
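The effect of vjust and vjust.mode on dimensionless elements can be sketched in base R. `step.pos` is a hypothetical helper written for this illustration and is not part of ggbg; it encodes our reading of the argument descriptions, where "end" measures from the start of a step toward its end and "top" measures from the bottom of a step toward its top.

```r
## Illustrative only: y position of a point or label within one step.
## `start` is the running total before the step, `y` is the step value.
step.pos <- function(start, y, vjust = 0.5, mode = c("end", "top")) {
  mode <- match.arg(mode)
  end <- start + y
  if (mode == "end") {
    start + vjust * (end - start)   # measured toward the end of the step
  } else {
    lo <- pmin(start, end)
    hi <- pmax(start, end)
    lo + vjust * (hi - lo)          # measured toward the top of the step
  }
}
step.pos(5, -4)                           # 3: midpoint with default vjust
step.pos(5, -4, vjust = 1, mode = "end")  # 1: bottom of a downward step
step.pos(5, -4, vjust = 1, mode = "top")  # 5: top of the same step
step.pos(1, 2, vjust = 1, mode = "end")   # 3: upward step, same as "top"
```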

Dodging

Unlike most position_* adjustments, position_waterfall adjusts positions across different x values. However, we still need to resolve x value overlaps. The default approach is to apply the same type of adjustment across groups within any given x value. This stacks and dodges elements.

Dodging involves changing the width of the geom and also shifting the geom horizontally. Geom width adjustments will always be made based on the xmin/xmax/width aesthetics. The shifting itself can be controlled separately with position_waterfall(width=...). That parameter should really be called dodge.width to avoid confusion with the geom width, but we left it as width for consistency with position_dodge.

You can turn off dodging within x values by setting position_waterfall(dodge=FALSE), which will result in stacking within each x value.
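The two parts of dodging, narrowing and shifting, can be sketched in base R. `dodge.slots` is a hypothetical helper, not ggbg code; it follows the general approach of ggplot2::position_dodge when the total width is preserved, and we assume position_waterfall behaves similarly.

```r
## Illustrative only: horizontal extent of n dodged groups at one x value.
## geom.width narrows the elements; dodge.width controls the shift.
dodge.slots <- function(x, n, geom.width = 0.9, dodge.width = geom.width) {
  i <- seq_len(n)
  center <- x + dodge.width * ((i - 0.5) / n - 0.5)  # horizontal shift
  half <- geom.width / n / 2                         # half of narrowed width
  data.frame(group = i, xmin = center - half, xmax = center + half)
}
wide <- dodge.slots(x = 2, n = 3)
## Narrow geoms aligned with the wide ones by sharing the dodge width
narrow <- dodge.slots(x = 2, n = 3, geom.width = 0.3, dodge.width = 0.9)
wide
narrow
```

Both calls place the group centers at the same x positions, which is the kind of alignment the width argument is meant to allow.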

Examples

## These examples are best run via `example(position_waterfall)`
library(ggplot2)
dat <- data.frame(x=3:1, y=1:3)
p1 <- ggplot(dat, aes(x=x, y=y)) + geom_col(position='waterfall')

## Add text or labels; defaults to middle waterfall position
## which can be modified with `vjust`
p1 + geom_label(aes(label=x), position='waterfall')

## We can also add the running cumulative total to the top of
## the bars with `stat='waterfall'` and position adjustments
p1 + geom_label(aes(label=x), position='waterfall') +
 geom_label(
   stat="waterfall",        # adds `ycum` computed variable
   aes(label=stat(ycum)),   # which we can use for label
   position=position_waterfall(vjust=1), # text to end of column
   vjust=0,                              # tweak so it's on top
)
## A poor person's candlestick chart:
dat.r.walk <- data.frame(x=1:20, y=rnorm(20))
ggplot(dat.r.walk, aes(x=x, y=y, fill=y > 0)) +
  geom_col(position='waterfall')

## We can use arbitrary geoms
ggplot(dat, aes(x=x, y=y)) +
  geom_point() +
  geom_point(position='waterfall', color='blue') + # default vjust=0.5
  geom_point(position=position_waterfall(vjust=1), color='red')

## Or stats; here we turn a histogram into an ecdf plot
dat.norm <- data.frame(x=rnorm(1000))
ggplot(dat.norm, aes(x=x)) + geom_histogram(position='waterfall')
ggplot(dat.norm, aes(x=x)) + stat_bin(position='waterfall')

## Data with groups
dat3 <- data.frame(
  x=c(3, 2, 2, 2, 1, 1), y=c(-3, 1, 4, -6, -1, 10),
  grp=rep(c("A", "B", "C"), length.out=6)
)
p2 <- ggplot(dat3, aes(x=x, y=y, fill=grp))
p2 + geom_col(position="waterfall")

## Equal width columns
p2 + geom_col(position=position_waterfall(preserve='single'))

## Stacking groups is possible, but hard to interpret when
## negative values are present
p2 + geom_col(position=position_waterfall(dodge=FALSE))

brodieG/ggbg documentation built on May 16, 2019, 7:44 a.m.