stacked_bar_chart can be used to create grouped bar charts, stacked bar charts and percentage stacked bar charts, each of which is covered below.
As with all oidnChaRts libraries, you are advised to load the htmlwidget library you're using directly.
library(oidnChaRts)
This vignette covers the use of stacked bar charts for visualising data with a variety of htmlwidget libraries. For demonstration purposes we use the following dataset, generated from https://doi.org/10.6084/m9.figshare.3761562. The dataset concerns the number of jobs advertised on a number of "freelance websites" on a specific date. Here is how the data looks:
head(data_stacked_bar_chart)
The original dataset includes measurements since 2016/09/05. This snapshot is regenerated every time the oidnChaRts library is rebuilt; the current value is `r data_stacked_bar_chart$timestamp[1]`. The columns of interest are as follows:
In a stacked or grouped bar chart there are "categories" and "subcategories" of observations. Here are the possible combinations for our dataset:
In order to create these charts, we must first process the data using the tidyverse:
library(tidyverse)
library(oidnChaRts)

data_stacked_bar_chart %>%
  group_by(country_group, country) %>%
  summarise(total = sum(count)) %>%
  ungroup()
The same pattern, grouping by country_group and occupation, is used in all further examples in this document. For your own analyses it may make more sense to store the output of this code as a new symbol.
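Storing the summarised data under a new symbol could look like the minimal sketch below. The toy tibble and the name jobs_by_region are hypothetical stand-ins, used only so that the snippet runs without the oidnChaRts dataset; the column names mirror those used in this vignette.

```r
library(dplyr)

# Hypothetical toy data mirroring the columns used in this vignette
toy_jobs <- tibble(
  country_group = c("A", "A", "B", "B"),
  occupation    = c("x", "y", "x", "y"),
  count         = c(10, 5, 3, 8)
)

# Summarise once and keep the result (jobs_by_region is an illustrative name)
jobs_by_region <- toy_jobs %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count), .groups = "drop")

jobs_by_region
```

The stored object can then be piped into a charting call repeatedly without recomputing the summary each time.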
Grouped bar charts group data into categories and then display a separate bar for each subcategory.
descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  )

remove(descending_order_of_occupations, descending_order_of_regions)
The following will create a generic grouped bar chart with highcharter:
library(highcharter)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  )
The following will create a generic grouped bar chart with plotly:
library(plotly)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  )
At the time of writing (`r Sys.Date()`), the plotly library does a poor job of ensuring that the category labels (y-axis labels) are not cut off. The layout function from plotly allows us to widen the left margin and drop the unnecessary y-axis title.
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
Category order can be supplied directly to the categories.order argument.
## Note: future versions of the dataset may use different groupings, this code
## assumes "United States" and "United Kingdom" remain distinct entities in
## "country_group"
order_of_regions <- c(
  c("United States", "United Kingdom"),
  setdiff(
    unique(data_stacked_bar_chart$country_group),
    c("United States", "United Kingdom")
  )
)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~country_group,
    categories.order = order_of_regions,
    subcategories.column = ~occupation,
    value.column = ~total
  )
A good pattern to follow with grouped bar charts is to order the categories from "largest to smallest", i.e. the country_group with the largest number of jobs at the top of the chart and the smallest at the bottom. This functionality is deliberately NOT provided by the stacked_bar_chart function; instead you must compute the order from the data:
descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
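The ordering vector passed to categories.order is simply a character vector of category names, sorted by their totals. The sketch below shows a more compact way of deriving such a vector with dplyr. It is not the vignette's own code: the toy tibble and the name order_sketch are hypothetical, chosen so that the result is easy to check by hand.

```r
library(dplyr)

# Hypothetical toy data: totals are A = 15, B = 38, C = 3
toy_jobs <- tibble(
  country_group = c("A", "A", "B", "B", "C", "C"),
  occupation    = c("x", "y", "x", "y", "x", "y"),
  count         = c(10, 5, 30, 8, 1, 2)
)

# Total per category, sorted from largest to smallest, reduced to a vector
order_sketch <- toy_jobs %>%
  group_by(country_group) %>%
  summarise(total = sum(count), .groups = "drop") %>%
  arrange(desc(total)) %>%
  pull(country_group)

order_sketch
# "B" "A" "C"
```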
Subcategory order can be supplied directly to the subcategories.order argument.
## Note: future versions of the dataset may use different occupations, this
## code assumes "Writing and translation" and "Creative and multimedia" remain
## distinct entities in "occupation"
order_of_occupations <- c(
  c("Writing and translation", "Creative and multimedia"),
  setdiff(
    unique(data_stacked_bar_chart$occupation),
    c("Writing and translation", "Creative and multimedia")
  )
)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = order_of_occupations,
    value.column = ~total
  )
In addition to ordering the categories by descending size, it is useful to do the same for the subcategories:
descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  )
Frustratingly, at the time of writing (`r Sys.Date()`), the plotly library reverses the order of the legend entries relative to the bars:
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
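One possible workaround is plotly's legend traceorder layout attribute, sketched below. The toy data frame and the direct plot_ly() call are illustrative assumptions only. Whether the same layout() call has the desired effect on the object returned by stacked_bar_chart has not been verified here.

```r
library(plotly)

# Hypothetical toy data: two categories, three subcategories
toy <- data.frame(
  cat    = rep(c("a", "b"), times = 3),
  subcat = rep(c("u", "x", "y"), each = 2),
  value  = c(4, 6, 3, 7, 5, 2)
)

# Horizontal grouped bars; traceorder = "reversed" flips the legend order
p <- plot_ly(
  toy,
  x = ~value, y = ~cat, color = ~subcat,
  type = "bar", orientation = "h"
) %>%
  layout(
    barmode = "group",
    legend = list(traceorder = "reversed")
  )

p
```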
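Stacked bar charts group data into categories and then create one bar per category, with segments corresponding to the subcategory values. Note that there are two types of stacking, selected with the stacking.type argument: "normal", where segments show the values themselves, and "percent", where segments show within-category percentages.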
descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    stacking.type = "normal",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  )

remove(descending_order_of_occupations, descending_order_of_regions)
The following will create a generic stacked bar chart with highcharter:
library(highcharter)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    stacking.type = "normal",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  )
It is convenient to enable shared tooltips for highcharter, so that values for all subcategories are shown on hover. The following code will be used in all further highcharter examples for stacked bar charts:
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    stacking.type = "normal",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  ) %>%
  hc_tooltip(shared = TRUE)
The following will create a generic stacked bar chart with plotly:
library(plotly)

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    stacking.type = "normal",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  )
At the time of writing (`r Sys.Date()`), the plotly library does a poor job of ensuring that the category labels (y-axis labels) are not cut off. The layout function from plotly allows us to widen the left margin and drop the unnecessary y-axis title.
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    stacking.type = "normal",
    categories.column = ~country_group,
    subcategories.column = ~occupation,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
A good pattern to follow with stacked bar charts is to order the categories from "largest to smallest", i.e. the country_group with the largest number of jobs at the top of the chart and the smallest at the bottom. This functionality is deliberately NOT provided by the stacked_bar_chart function; instead you must compute the order from the data:
descending_order_of_regions <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  select(country_group) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    stacking.type = "normal",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
In addition to ordering the categories by descending size, it is useful to do the same for the subcategories:
descending_order_of_occupations <- data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  mutate(total = sum(count)) %>%
  mutate(total.in.group = sum(total)) %>%
  arrange(desc(total.in.group)) %>%
  ungroup() %>%
  select(occupation) %>%
  unique() %>%
  .[[1]]

data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    stacking.type = "normal",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  ) %>%
  hc_tooltip(shared = TRUE)
The stacked_bar_chart function automatically calculates within-category percentages for subcategory values, for instance:
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "highcharter",
    stacking.type = "percent",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  ) %>%
  hc_tooltip(shared = TRUE)
data_stacked_bar_chart %>%
  group_by(country_group, occupation) %>%
  summarise(total = sum(count)) %>%
  ungroup() %>%
  stacked_bar_chart(
    library = "plotly",
    stacking.type = "percent",
    categories.column = ~country_group,
    categories.order = descending_order_of_regions,
    subcategories.column = ~occupation,
    subcategories.order = descending_order_of_occupations,
    value.column = ~total
  ) %>%
  layout(margin = list(l = 150), yaxis = list(title = ""))
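For reference, the within-category percentage of a subcategory is its value divided by the sum of all values in the same category. The sketch below computes this by hand with dplyr on a hypothetical toy tibble. It illustrates the arithmetic only and makes no claim about how stacked_bar_chart implements it internally.

```r
library(dplyr)

# Hypothetical toy data: category "a" sums to 40, category "b" sums to 20
toy <- tibble(
  cat    = c("a", "a", "b", "b"),
  subcat = c("u", "x", "u", "x"),
  value  = c(30, 10, 5, 15)
)

# Share of each subcategory within its own category, as a percentage
toy_percent <- toy %>%
  group_by(cat) %>%
  mutate(percent = 100 * value / sum(value)) %>%
  ungroup()

toy_percent$percent
# 75 25 25 75
```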
By default, the tooltips for stacked bar charts often do not cover all subcategories within a category; you must specify that the tooltip is shared. Explicit customisation of the tooltip then requires some programming in JavaScript.
Here's a small example for highcharter:
# tibble() replaces the deprecated data_frame()
my_stack <- tibble(
  cat = rep(c("a", "b", "c", "d", "e"), 5),
  subcat = rep(c("u", "x", "y", "z", "w"), each = 5),
  value = round(rnorm(25, mean = 40, sd = 10))
)

my_stack %>%
  stacked_bar_chart(
    library = "highcharter",
    categories.column = ~cat,
    subcategories.column = ~subcat,
    value.column = ~value
  ) %>%
  # "normal" is the documented Highcharts value for plain stacking
  hc_plotOptions(series = list(stacking = "normal")) %>%
  hc_tooltip(
    # Build one line per subcategory; this.y is the data value, whereas
    # plotY is a pixel position. The formatter relies on jQuery ($.each).
    formatter = JS("function () {
      var subcat = '';
      $.each(this.points, function (i, point) {
        subcat += '<b>' + this.point.series.name + ':</b> ' +
          Highcharts.numberFormat(this.y, 1) + '<br/>';
      });
      return 'Category: ' + this.x.name + '<br/>' + subcat;
    }"),
    shared = TRUE
  )