knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" ) library(dplyr) library(ggplot2) library(scales) library(purrr) library(ggstream) options(warn = -1)
The goal of ggstream
is to create a simple but powerful implementation of streamplot/streamgraph in ggplot2
. A streamplot is a stacked area plot mostly used for time series.
Install ggstream
from CRAN:
install.packages("ggstream")
Or you can install the development version of ggstream from github with:
remotes::install_github("davidsjoberg/ggstream")
The characteristic streamplot which creates a symmetrical area chart around the x axis.
blockbusters %>% ggplot(aes(year, box_office, fill = genre, label = genre, color = genre)) + geom_stream(extra_span = 0.013, type = "mirror", n_grid = 3000, bw = .78) + geom_stream_label(size = 4, type = "mirror", n_grid = 1000) + cowplot::theme_minimal_vgrid(font_size = 18) + theme(legend.position = "none") + # scale_y_continuous(limits = c(-400, 400)) + scale_colour_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::darken(.8)) + scale_fill_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::lighten(.2)) + labs(title = "Box office per genre 1977-2019", x = NULL, y = "Current dollars, billions")
Which is equivalent to a stacked area chart.
blockbusters %>% ggplot(aes(year, box_office, fill = genre, label = genre, color = genre)) + geom_stream(extra_span = 0.013, type = "ridge", n_grid = 3000, bw = .78) + geom_stream_label(size = 4, type = "ridge", n_grid = 1000) + cowplot::theme_minimal_vgrid(font_size = 18) + theme(legend.position = "none") + # scale_y_continuous(limits = c(-400, 400)) + scale_colour_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::darken(.8)) + scale_fill_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::lighten(.2)) + labs(title = "Box office per genre 1977-2019", x = NULL, y = "Current dollars, billions")
The type proportional
shows the share of each group in percent. Stacked like the ridge
type.
blockbusters %>% ggplot(aes(year, box_office, fill = genre, label = genre, color = genre)) + geom_stream(extra_span = 0.013, type = "proportional", n_grid = 3000, bw = .78) + geom_stream_label(size = 4, type = "proportional", n_grid = 1000) + cowplot::theme_minimal_vgrid(font_size = 18) + theme(legend.position = "none") + scale_y_continuous(labels = scales::percent_format(1)) + scale_colour_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::darken(.8)) + scale_fill_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::lighten(.2)) + labs(title = "Box office per genre 1977-2019", x = NULL, y = "Current dollars, percent")
By Cédric Scherer. Code here.
By Georgios Karamanis. Code here.
This is a basic example:
library(ggstream) ggplot(blockbusters, aes(year, box_office, fill = genre)) + geom_stream()
ggstream
also features a custom labeling geom that places decent default labels.
ggplot(blockbusters, aes(year, box_office, fill = genre)) + geom_stream() + geom_stream_label(aes(label = genre))
library(cowplot) library(paletteer) library(dplyr) library(colorspace) blockbusters %>% ggplot(aes(year, box_office, fill = genre, label = genre, color = genre)) + geom_stream(extra_span = 0.013, type = "mirror", n_grid = 3000, bw = .78) + geom_stream_label(size = 4, type = "mirror", n_grid = 1000) + cowplot::theme_minimal_vgrid(font_size = 18) + theme(legend.position = "none") + scale_colour_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::darken(.8)) + scale_fill_manual(values = paletteer::paletteer_d("dutchmasters::milkmaid") %>% colorspace::lighten(.2)) + labs(title = "Box office per genre 1977-2019", x = NULL, y = "Current dollars, billions")
The main parameter to adjust in geom_stream
is probably the bandwidth, or bw
. A lower bandwidth creates a more bumpy plot and a higher bandwidth smooth out some variation. Below is an illustration of how different bandwidths affect the stream plot.
library(patchwork) base <- ggplot(blockbusters, aes(year, box_office, fill = genre)) + theme(legend.position = "none") (base + geom_stream(bw = 0.5) + ggtitle("bw = 0.5")) / (base + geom_stream() + ggtitle("Default (bw = 0.75)")) / (base + geom_stream(bw = 1) + ggtitle("bw = 1"))
Another important parameter to adjust is extra_span
. This parameter adjust if a larger range than the range of the data which can help if the edges of the stream plot grows too large due in the estimation function. The additional range is set to y = 0 which forces the area towards zero. The cut-off can include the extra range or fit the data. Too illustrate this rather unintuitive parameter some variations are shown below. The transparent areas show the full estimation, and the solid area is the final plot.
base <- ggplot(blockbusters, aes(year, box_office, fill = genre)) + theme(legend.position = "none") + xlim(1970, 2028) (base + geom_stream() + ggtitle("Default")) / (base + geom_stream(extra_span = 0.001) + geom_stream(extra_span = 0.001, true_range = "none", alpha = .3) + ggtitle("extra_span = 0.001")) / (base + geom_stream(extra_span = .1) + geom_stream(extra_span = .1, true_range = "none", alpha = .3) + ggtitle("extra_span = .1")) / (base + geom_stream(extra_span = .2) + geom_stream(extra_span = .2, true_range = "none", alpha = .3) + ggtitle("extra_span = .2")) / (base + geom_stream(extra_span = .2, true_range = "none") + ggtitle("extra_span = .2 and true_range = \"none\""))
Another feature of stream plots is the sorting of groups in the stacking. The default of ggstream
is to stack as factor order of the fill
aesthetics. However, ggstream
supports two other stackning sorting options. The onset
and inside_out
.
library(patchwork) set.seed(123) df <- map_dfr(1:30, ~{ x <- 1:sample(1:70, 1) tibble(x = x + sample(1:150, 1)) %>% mutate(y = sample(1:10, length(x), replace = T), k = .x %>% as.character()) }) p <- df %>% ggplot(aes(x, y, fill = k)) + theme_void() + theme(legend.position = "none") p1 <- p + geom_stream(color = "black") + ggtitle("None (Default)") p2 <- p + geom_stream(color = "black", sorting = "inside_out") + ggtitle("Inside out") p3 <- p + geom_stream(color = "black", sorting = "onset") + ggtitle("Onset") p1 / p2 / p3
The ggstream
package provides some flexible ways to make stream plots but with decent defaults. However, due to the complexity of the underlying smoothing/estimation it should be used carefully and mostly for fun too illustrate major trends.
If you find a bug or have ideas for additional feature you are more than welcome to open an issue.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.