In tkmckenzie/d3po: D3 Popular Outputs

knitr::opts_chunk$set(collapse = T,
                      comment = "#>",
                      fig.align = "center",
                      fig.width = 5)
options(tibble.print_min = 4L, tibble.print_max = 4L)
library(d3po)
set.seed(123)

The r2d3 package provides a general framework with which R objects can be used to produce D3 documents. While the package provides very general capabilities, there are a handful of commonly used D3 visualizations that users must build by hand. The d3po (D3 Popular Outputs) package provides some of these common visualizations in an easier-to-use format, leveraging r2d3 to produce the D3 documents. This vignette demonstrates each of the visualizations included in d3po.

Diagrams

Chord diagrams

Chord diagrams are often used to show relationships within a set, with strength of relationships represented by size of the chord. The d3po::chord function allows relationships to be specified either in an edgelist or as an adjacency matrix. Symmetric adjacency matrices with diagonal of zeros (or equivalent edgelists) are advised as they are more interpretable, but those characteristics are not required.

labels = letters[1:3]

adjacency.matrix = matrix(c(0, 1, 2,
                            1, 0, 3,
                            2, 3, 0),
                          nrow = 3, byrow = TRUE)

chord(adjacency.matrix = adjacency.matrix, labels = labels)

Choropleths

The d3po package includes the ability to make choropleth maps at the U.S. county or state level. The vector c("District of Columbia", datasets::state.name) can be used to construct state-level data, and d3po::us.counties can be used to construct county-level data.

require(dplyr)

df.state = data.frame(state = c("District of Columbia", state.name),
                      value.state = rnorm(51, sd = 3))
df.county = us.counties %>%
  mutate(value.county = rnorm(n(), sd = 1)) %>%
  left_join(df.state, by = "state") %>%
  mutate(value = value.state + value.county)

choropleth.state(df.state, value.column = "value.state")
choropleth.county(df.county)

Word Clouds

Word clouds can be created with d3po::cloud by supplying a list of words and relative sizes. Words can also be individually colored or colored by group.

require(dplyr)
require(rcorpora)

words = corpora("words/oprah_quotes")$oprahQuotes
words = tolower(unlist(strsplit(words, " ")))
words = gsub("[.,-;]+", "", words)
words = gsub("’", "'", words)
words = words[nchar(words) > 0]

stopwords = corpora("words/stopwords/en")$stopWords

word.df = data.frame(text = words)
word.df = word.df %>%
  filter(!(text %in% stopwords)) %>%
  group_by(text) %>%
  summarize(value = n(), .groups = "drop") %>%
  arrange(desc(value)) %>%
  slice(1:25) %>%
  mutate(value = 3 * value) # To make words a little larger in final image

cloud(word.df, text.color = "word", color.scheme = "RdYlGn")

Marimekko Charts

Marimekko charts are used to show both proportions amongst groups and total numbers relative to the overall population. These charts are especially useful for comparing similar variables across groups to show how those variables change proportional to both group size and the total population.

num.groups = 5
num.vars = 3
df = data.frame(group = rep(LETTERS[1:num.groups], each = num.vars),
                var = rep(letters[1:num.vars], times = num.groups),
                value = runif(num.groups * num.vars, 0, 10))
marimekko(df, x.column = "group", y.column = "var", color.scheme = "Plasma")

Sankey Diagrams

Sankey diagrams are useful for showing relationships between sets, with strength of relationships represented by size of the link. The function d3po::sankey creates a Sankey diagram from an edgelist.

num.targets = 5
num.sources = 3
df = data.frame(source = rep(LETTERS[1:num.sources], each = num.targets),
                target = rep(letters[1:num.targets], times = num.sources),
                value = runif(num.groups * num.vars, 0, 10))
sankey(df)

Sunburst Diagrams

Sunburst diagrams can be used to show how individual pieces contribute to the whole through a hierarchy. The d3po::sunburst function creates a sunburst diagram from an ancestry data.frame describing links of the hierarchy and a leaf value data.frame giving values for nodes without children.

num.children = 15
num.parents = 3
ancestry.df = data.frame(parent = sample(LETTERS[1:num.parents], num.children, replace = TRUE),
                         child = letters[1:num.children])
ancestry.df = rbind(data.frame(parent = "flare", child = LETTERS[1:num.parents]), ancestry.df) # Need to specify head nodes
leaf.df = data.frame(id = letters[1:num.children], value = runif(num.children, 0, 10))

sunburst(ancestry.df, leaf.df)