Overview of function

stacked_bar_chart can be used to create the following kinds of charts:

As with all oidnChaRts libraries, you are advised to load the htmlwidget library you're using directly.

data_stacked_bar_chart

library(oidnChaRts)

This vignette covers the use of stacked bar charts for visualising data with a variety of htmlwidget libraries, for demonstration purposes we use the following dataset generated from https://doi.org/10.6084/m9.figshare.3761562. The dataset concerns the number of jobs advertised on a number of "freelance websites" on a specific date, here's how the data looks:

head(data_stacked_bar_chart)

The original dataset includes measurements since 2016/09/05, this snapshot is re-generated everytime the oidnChaRts library is rebuilt, the current value is r data_stacked_bar_chart$timestamp[1]. The columns of interest are as follows:

Stacked barchart specifications

In stacked/grouped bar chart there are "categories" and "subcategories" of observations, here are possible combinations for our dataset:

In order to create these charts, we must first process the data using the tidyverse:

library(tidyverse)
library(oidnChaRts)
data_stacked_bar_chart %>%
  group_by(country_group, country) %>%
  summarise(total = sum(count)) %>%
  ungroup()

This code will be used in all further examples in this document, note that for your analyses it might make more sense to store the output of this code as a new symbol.

Grouped Bar Charts

Grouped barcharts group data into categories and then display a separate bar for each subcategory.

descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total)
remove(descending_order_of_occupations, descending_order_of_regions)

The following will create a generic grouped bar chart with highcharter:

library(highcharter)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total)

The following will create a generic grouped bar chart with plotly:

library(plotly)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total)

At the time of writing r Sys.Date(), the plotly library does a poor job at ensuring the category labels (y-axis labels) are not chopped off. The layout function from plotly allows us to modify the margins and drop the unnecessary label for the y-axis.

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total) %>%
  layout(margin = list(l = 150),
         yaxis = list(title = ""))

Ordering Categories

Category order can be supplied directly to the categories.order column.

## Note: future versions of the dataset may use different groupings, this code assumes "United States" and "United Kingdom" remain distinct entities in "country_group"
order_of_regions <- c(c("United States", "United Kingdom"), setdiff(unique(data_stacked_bar_chart$country_group), c("United States", "United Kingdom")))
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    categories.column = ~country_group,
                    categories.order = order_of_regions,
                    subcategories.column = ~occupation,
                    value.column = ~total)

A good pattern to follow with grouped bar charts is to order the categories from "largest to smallest", i.e. the country_group with the most number of jobs at the top of the chart and the least at the bottom. This functionality is deliberately NOT provided by the stacked_bar_chart function, instead you must modify the data:

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    value.column = ~total) %>%
  layout(margin = list(l = 150),
         yaxis = list(title = ""))

Explicitly Ordering Subcategories

Subcategory order can be supplied directly to the subcategories.order column.

## Note: future versions of the dataset may use different occupations, this code assumes "Writing and translation" and "Creative and multimedia" remain distinct entities in "occupation"
order_of_occupations <- c(c("Writing and translation", "Creative and multimedia"), setdiff(unique(data_stacked_bar_chart$occupation), c("Writing and translation", "Creative and multimedia")))
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = order_of_occupations,
                    value.column = ~total)

In addition to ordering the categories by descending size, it is useful to do the same for the subcategories:

descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total)

Frustratingly, at the time of writing r Sys.Date() the plotly library reverses the order of legends compared to bars:

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total) %>%
    layout(margin = list(l = 150),
         yaxis = list(title = ""))

Stacked Bar Charts

Stacked barcharts group data into categories and then create one bar with segments corresponding to the subcategory values. Note that there are two types of stacked:

descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total)
remove(descending_order_of_occupations, descending_order_of_regions)

The following will create a generic stacked bar chart with highcharter:

library(highcharter)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total)

It is convenient to enable grouped tooltips for highcharter, so that values for all subcategories are shown on hover. The following code will be used in all further examples with highcharter for stacked bar charts:

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total) %>%
  hc_tooltip(shared = TRUE)

The following will create a generic stacked bar chart with plotly:

library(plotly)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total)

At the time of writing r Sys.Date(), the plotly library does a poor job at ensuring the category labels (y-axis labels) are not chopped off. The layout function from plotly allows us to modify the margins and drop the unnecessary label for the y-axis.

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    subcategories.column = ~occupation,
                    value.column = ~total) %>%
  layout(margin = list(l = 150),
         yaxis = list(title = ""))

Ordering Categories

A good pattern to follow with stacked bar charts is to order the categories from "largest to smallest", i.e. the country_group with the most number of jobs at the top of the chart and the least at the bottom. This functionality is deliberately NOT provided by the stacked_bar_chart function, instead you must modify the data:

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    value.column = ~total) %>%
  layout(margin = list(l = 150),
         yaxis = list(title = ""))

Explicitly Ordering Subcategories

In addition to ordering the categories by descending size, it is useful to do the same for the subcategories:

descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    stacking.type = "normal",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total) %>%
  hc_tooltip(shared = TRUE)

Stacked Percentage {#stacked-percentage}

The stacked_bar_chart function automatically calculates within category percentages for subcategory values, for instance:

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "highcharter",
                    stacking.type = "percent",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total) %>%
  hc_tooltip(shared = TRUE)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(library = "plotly",
                    stacking.type = "percent",
                    categories.column = ~country_group,
                    categories.order = descending_order_of_regions,
                    subcategories.column = ~occupation,
                    subcategories.order = descending_order_of_occupations,
                    value.column = ~total) %>%
  layout(margin = list(l = 150),
         yaxis = list(title = ""))

Advanced Tooltips

The tooltips for stacked barcharts often do not apply to all subcategories within a category by default, one must specify that the tooltip must be shared. Explicit customisation of the tooltip then requires some programming in JavaScript.

Here's a small example for highcharter:

my_stack <- data_frame(
  cat = rep(c("a","b","c","d", "e"), 5),
  subcat = rep(c("u", "x", "y", "z", "w"), each = 5),
  value = round(rnorm(25, mean = 40, sd = 10))
)


my_stack %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~cat,
    subcategories.column = ~subcat,
    value.column = ~value
  ) %>%
  hc_plotOptions(series = list(stacking = "stack")) %>%
  hc_tooltip(
    formatter = JS("function(){

                  var subcat = '';
                  $.each(this.points,function(i, point){
                    subcat += '<b>' + this.point.series.name + ': <b>' + 
                                Highcharts.numberFormat(this.point.plotY, 1) +
                                '<br/>';
                  });

                  return 'Category: ' + this.x.name + '<br/>' +
                  subcat;
                   }"),
    shared = TRUE
  )


martinjhnhadley/oidnChaRts documentation built on May 21, 2019, 12:38 p.m.