if (!requireNamespace("dplyr", quietly = TRUE)) {
    stop("Package \"dplyr\" needed to build the vignette. Please install it.",
      call. = FALSE)
}

Introduction

The chorddiag package allows you to create interactive chord diagrams using the JavaScript visualization library D3 (http://d3js.org) from within R via the htmlwidgets interfacing framework.

In short, chord diagrams show directed relationships among a group of entities. The chord diagram layout is explained by Mike Bostock, the creator of D3, in more detail here: https://github.com/mbostock/d3/wiki/Chord-Layout.

To quote the explanation found there:

"Consider a hypothetical population of people with different hair colors: black, blonde, brown and red. Each person in this population has a preferred hair color for a dating partner; of the 29,630 (hypothetical) people with black hair, 40% (11,975) prefer partners with the same hair color. This preference is asymmetric: for example, only 10% of people with blonde hair prefer black hair, while 20% of people with black hair prefer blonde hair. A chord diagram visualizes these relationships by drawing quadratic Bézier curves between arcs. The source and target arcs represents two mirrored subsets of the total population, such as the number of people with black hair that prefer blonde hair, and the number of people with blonde hair that prefer black hair." (Mike Bostock)

The package's JavaScript code is based on http://bl.ocks.org/mbostock/4062006, with modifications for fading behaviour and addition of tooltips.

Installation

The package is available from GitHub and can be installed with

devtools::install_github("mattflor/chorddiag")

(this requires the devtools package).

After installation, the package is loaded via

library(chorddiag)

Examples

Hair Color Preference

To create a chord diagram for the hair color preference example stated in the introduction, we need the preferences in matrix format:

m <- matrix(c(11975,  5871, 8916, 2868,
              1951, 10048, 2060, 6171,
              8010, 16145, 8090, 8045,
              1013,   990,  940, 6907),
            byrow = TRUE,
            nrow = 4, ncol = 4)
haircolors <- c("black", "blonde", "brown", "red")
dimnames(m) <- list(have = haircolors,
                    prefer = haircolors)
print(m)
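
As a quick sanity check, the figures quoted in the introduction can be recomputed from this matrix. The sketch below uses base R only; it assumes that rows hold the hair color people have and columns the hair color they prefer, as in the dimnames above. The names totals and shares are introduced here purely for illustration.

```r
# Rebuild the preference matrix so that this snippet runs on its own.
haircolors <- c("black", "blonde", "brown", "red")
m <- matrix(c(11975,  5871, 8916, 2868,
               1951, 10048, 2060, 6171,
               8010, 16145, 8090, 8045,
               1013,   990,  940, 6907),
            byrow = TRUE, nrow = 4, ncol = 4,
            dimnames = list(have = haircolors, prefer = haircolors))

# Number of people per hair color: each row sums over all preferences.
totals <- rowSums(m)
totals["black"]                       # 29630

# Share of each preference within a hair color (every row sums to 1).
shares <- prop.table(m, margin = 1)
round(shares["black", "black"], 2)    # 0.4
round(shares["black", "blonde"], 2)   # 0.2
round(shares["blonde", "black"], 2)   # 0.1
```

The asymmetry between the last two values is what the directional chord diagram is meant to make visible.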

Then, we can pass this matrix to the chorddiag function to create the chord diagram:

chorddiag(m)

Default chord diagram for the hair color preference dataset. Note that all images in this vignette are static. When generated by the chorddiag function, the diagrams are interactive: they support chord fading, tooltips, and resizing.

The chord diagram can be customized easily. Here, we call the function with custom colors and provide some padding to avoid group names overlapping with tick labels:

groupColors <- c("#000000", "#FFDD89", "#957244", "#F26223")
chorddiag(m, groupColors = groupColors, groupnamePadding = 20)

Customized chord diagram for the hair color preference dataset, using custom colors and more padding between the diagram and group labels to avoid overlap with tick labels.

Interactivity here means chord fading and tooltip pop-ups on certain mouseover events. For example, if the mouse pointer hovers over the chord connecting the "blonde" and "red" groups, a tooltip shows the numbers for that chord, and all other chords fade away. Likewise, when hovering over a group arc, all chords *not* belonging to that group fade away, and a tooltip displays summarized group information. Fading levels can be set, and tooltip layout can be customized to some degree as well; for details, see the chorddiag function's documentation.
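
A minimal sketch of such a customization follows. The argument names fadeLevel and tooltipUnit are assumptions here and should be confirmed in ?chorddiag; to stay on the safe side, the code only passes options that the installed version of the function actually accepts, and it does nothing if the package is not installed. The names opts, known, and widget are illustrative only.

```r
# Rebuild the preference matrix so that this snippet runs on its own.
haircolors <- c("black", "blonde", "brown", "red")
m <- matrix(c(11975,  5871, 8916, 2868,
               1951, 10048, 2060, 6171,
               8010, 16145, 8090, 8045,
               1013,   990,  940, 6907),
            byrow = TRUE, nrow = 4, ncol = 4,
            dimnames = list(have = haircolors, prefer = haircolors))

# Assumed option names (check ?chorddiag): opacity of faded chords and a
# unit string appended to the numbers shown in tooltips.
opts <- list(fadeLevel = 0.3, tooltipUnit = " people")

if (requireNamespace("chorddiag", quietly = TRUE)) {
  # Keep only the options that are real arguments of chorddiag().
  known <- opts[names(opts) %in% names(formals(chorddiag::chorddiag))]
  widget <- do.call(chorddiag::chorddiag, c(list(m), known))
  widget  # prints (renders) the widget when run at top level
}
```

Filtering against formals() means an option with a wrong name is silently dropped rather than causing an error, which is convenient for a sketch but would hide typos in production code.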

Tooltip and chord fading showcase for the hair color preference chord diagram. In this case, we can see that a considerable fraction of blonde people prefer red-haired dating partners whereas only a small fraction of red-haired people prefer blonde partners.

Titanic Survival (Bipartite Chord Diagram)

The default chord diagram type is directional, allowing for visualization of asymmetric relationships. But chord diagrams can also be a useful visualization of frequency distributions for two categories of groups, in other words contingency tables (also called cross tabulations or crosstabs). In this package, this type of chord diagram is called bipartite, because there are only chords between categories but not within categories.

Here is an example using the Titanic dataset. First, we create a contingency table of how many passengers from the different classes and how many crew members survived or died when the Titanic sank.

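library(dplyr)
# Convert the built-in Titanic table to a tibble (one row per cell, counts in n)
titanic_tbl <- tibble::as_tibble(Titanic)
# Turn character columns into factors so that levels() can supply the labels
titanic_tbl <- titanic_tbl %>%
    mutate(across(where(is.character), as.factor))
# Sum the counts for each combination of class and survival status
by_class_survival <- titanic_tbl %>%
    group_by(Class, Survived) %>%
    summarise(Count = sum(n)) %>%
    ungroup()
# Rows are ordered by Class, then Survived, so the matrix is filled row by row
titanic.mat <- matrix(by_class_survival$Count, nrow = 4, ncol = 2, byrow = TRUE)
dimnames(titanic.mat) <- list(Class = levels(titanic_tbl$Class),
                              Survival = levels(titanic_tbl$Survived))
print(titanic.mat)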

Note that we labeled the dimensions of the matrix by assigning a named list to dimnames. The dimension labels (here: "Class" and "Survival") will automatically be used in the chord diagram.
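
The same contingency table can also be derived without dplyr, which is handy for cross-checking the result. This is a sketch of an alternative route using base R only, not the approach taken in this vignette; the name tab is illustrative.

```r
# Titanic ships with base R as a 4-dimensional table with the dimensions
# Class, Sex, Age, and Survived. Summing over Sex and Age leaves a
# Class x Survived matrix of counts.
tab <- apply(Titanic, c(1, 4), sum)

# Relabel the dimensions; the second one is called "Survived" in the raw data.
names(dimnames(tab)) <- c("Class", "Survival")

print(tab)
sum(tab)  # 2201 people in total
```

Because tab carries the same named dimnames, it could be passed to chorddiag in place of titanic.mat.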

We can create a "bipartite" chord diagram for this matrix by setting type = "bipartite".

groupColors <- c("#2171b5", "#6baed6", "#bdd7e7", "#bababa", "#d7191c", "#1a9641")
chorddiag(titanic.mat, type = "bipartite", 
          groupColors = groupColors,
          tickInterval = 50)

A bipartite chord diagram visualizing survival grouped by class / crew for the Titanic data.



mattflor/chorddiag documentation built on Aug. 10, 2020, 12:46 a.m.