knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(igraph)
library(arcdiagram)
library(dplyr)
# read 'gml' file
mis_file = "../lesmiserables.gml"
mis_graph = read.graph(mis_file, format="gml")

# get vertex labels
vlabels = get.vertex.attribute(mis_graph, "label")

# get vertex groups
vgroups = get.vertex.attribute(mis_graph, "group")

# get vertex fill color
vfill = get.vertex.attribute(mis_graph, "fill")

# get vertex border color
vborders = get.vertex.attribute(mis_graph, "border")

# get vertex degree
degrees = degree(mis_graph)

# get edges value
values = get.edge.attribute(mis_graph, "value")

# get edgelist
edgelist = get.edgelist(mis_graph)

# groups and degree
x = data.frame(vgroups, degrees, vlabels, ind=1:vcount(mis_graph))
y = arrange(x, desc(vgroups), desc(degrees))
new_ord = y$ind
# plot
op = par(mar = c(6, 0, 1, 0))
arcplot(edgelist, ordering=new_ord, labels=vlabels, cex.labels=0.8,
        show.nodes=TRUE, col.nodes=vborders, bg.nodes=vfill, 
        cex.nodes = log(degrees)+0.5, pch.nodes=21,
        lwd.nodes = 2, line=-0.5, 
        col.arcs = hsv(0, 0, 0.2, 0.25), lwd.arcs = 1.5 * values)
par(op)

Introduction

This document describes the steps needed to produce an arc diagram like the Les Miserables one with the R package "arcdiagram" (a minimalist package designed for plotting pretty arc diagrams).

Les Miserables

The file for this example is lesmiserables.gml, which is available in the GitHub repository of the "arcdiagram" package:

https://github.com/gastonstat/arcdiagram/lesmiserables.gml

This is a plain-text file in GML (Graph Modelling Language) format, a text format for describing graphs. You can find more information about GML at:

http://en.wikipedia.org/wiki/Graph_Modelling_Language
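
To get a feel for the format, here is a minimal sketch that writes a tiny GML file and reads it back. The three nodes and two edges are made up for illustration and are not taken from lesmiserables.gml; the sketch also assumes a version of igraph that provides read_graph(), the newer name for read.graph().

```r
library(igraph)

# A tiny, made-up graph in GML syntax (NOT the contents of lesmiserables.gml)
gml_text <- c(
  "graph [",
  "  node [",
  "    id 0",
  "    label \"A\"",
  "  ]",
  "  node [",
  "    id 1",
  "    label \"B\"",
  "  ]",
  "  node [",
  "    id 2",
  "    label \"C\"",
  "  ]",
  "  edge [",
  "    source 0",
  "    target 1",
  "    value 3",
  "  ]",
  "  edge [",
  "    source 1",
  "    target 2",
  "    value 1",
  "  ]",
  "]"
)

# Write the text to a temporary file and read it back as a graph
toy_file <- tempfile(fileext = ".gml")
writeLines(gml_text, toy_file)
toy_graph <- read_graph(toy_file, format = "gml")

# Number of nodes and edges, plus the attributes stored in the file
vcount(toy_graph)
ecount(toy_graph)
vertex_attr(toy_graph, "label")
edge_attr(toy_graph, "value")
```

Each node block becomes a vertex and each edge block becomes an edge; the extra keys (label, value) are stored as vertex and edge attributes.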

Step 1: Read data in R

I'm assuming that you have already checked the introductory documentation of the "arcdiagram" package. After downloading the gml file, you will have to import it into R using the igraph function read.graph() with the argument format = "gml". I assume that the .gml file is in your working directory:

# load 'igraph' (provides read.graph()) and 'arcdiagram'
library(igraph)
library(arcdiagram)

# read 'gml' file
mis_graph = read.graph("lesmiserables.gml", format = "gml")

Step 2: Extract edge list

Since we will use the function arcplot(), we need an edgelist. The good news is that we can use the function get.edgelist() to extract it from mis_graph:

# get edgelist
edgelist = get.edgelist(mis_graph)
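
If you have never looked at an edgelist, the following sketch shows what one looks like for a made-up three-node graph (not the Les Miserables data). It assumes a version of igraph that provides as_edgelist(), the newer name for get.edgelist().

```r
library(igraph)

# A made-up path 1 - 2 - 3; the vertices have no "name" attribute
toy_graph <- make_graph(c(1, 2, 2, 3), directed = FALSE)

# One row per edge, one column per endpoint
toy_edges <- as_edgelist(toy_graph)
toy_edges
dim(toy_edges)
```

Because the vertices are unnamed, the matrix holds numeric vertex ids. This is the kind of two-column matrix that we pass to arcplot().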

Once we have the edgelist, we can try to get a first (very raw) arc diagram with arcplot():

# first plot
op = par(mar = c(2, 0.5, 1, 0.5))
arcplot(edgelist)
par(op)

You can see from the previous figure that our first arc diagram is still far from what we are looking for. We can get closer by tweaking some of the parameters, such as the symbols of the nodes, the color of the arcs, and their line widths:

# second plot
op = par(mar = c(2, 0.5, 1, 0.5))
arcplot(edgelist, cex.labels=0.8,
        show.nodes=TRUE, lwd.nodes = 2, line=-0.5, 
        col.arcs = hsv(0, 0, 0.2, 0.25), lwd.arcs = 1.5)
par(op)
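
The arc color in the call above is built with hsv(), a base R function. As a small sketch of what that color is, we can decompose it into its channels; the variable names arc_col and channels are just illustrative.

```r
# The arc color used in the plots: hue 0, saturation 0, value 0.2, alpha 0.25
arc_col <- hsv(h = 0, s = 0, v = 0.2, alpha = 0.25)
arc_col

# Decompose into red, green, blue, and alpha channels (0-255 scale)
channels <- col2rgb(arc_col, alpha = TRUE)
channels
```

With saturation 0 the red, green, and blue channels are equal, so the color is a dark gray. The alpha of 0.25 makes each arc semi-transparent, so regions where many arcs overlap look darker.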

Step 3: Information about nodes and edges

Most of the necessary ingredients to create our pretty arc diagram are contained in the graph object mis_graph: the fill color of the nodes, the border color of the nodes, the group memberships, the node labels, and the arc widths. If you print mis_graph you will see the following output:

# what's in mis_graph
mis_graph

The first line tells you that mis_graph is an undirected graph with 77 nodes and 254 edges (U--- 77 254 --). The second and third lines list the attributes (attr) stored in mis_graph. The ones we need are the vertex attributes label, group, fill, and border, and the edge attribute value.

To extract the data attributes associated with the nodes and edges of mis_graph we have to use the functions get.vertex.attribute() and get.edge.attribute():

# get vertex labels
vlabels = get.vertex.attribute(mis_graph, "label")

# get vertex groups
vgroups = get.vertex.attribute(mis_graph, "group")

# get vertex fill color
vfill = get.vertex.attribute(mis_graph, "fill")

# get vertex border color
vborders = get.vertex.attribute(mis_graph, "border")

# get edges value
values = get.edge.attribute(mis_graph, "value")

In addition to the node (i.e. vertex) attributes, we also need to get the degree of the nodes by using the function degree():

# get vertex degree
degrees = degree(mis_graph)
options(width = 60)

OK, let's try a third plot:

# third plot
op = par(mar = c(5, 0, 1, 0))
arcplot(edgelist, labels=vlabels, cex.labels=0.8,
        show.nodes=TRUE, col.nodes=vborders, bg.nodes=vfill, 
        cex.nodes = log(degrees)+0.5, pch.nodes=21,
        lwd.nodes = 2, line=-0.5, 
        col.arcs = hsv(0, 0, 0.2, 0.25), lwd.arcs = 1.5 * values)
par(op)
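
The node sizes in this plot come from cex.nodes = log(degrees) + 0.5. To see why a log scale is handy here, consider this small sketch with made-up degrees (not the real Les Miserables values):

```r
# Made-up degrees, NOT the real Les Miserables values
toy_degrees <- c(1, 2, 5, 10, 36)

# Node expansion factor, computed as in the arcplot() call
toy_cex <- log(toy_degrees) + 0.5
round(toy_cex, 2)

# Spread of the degrees versus spread of the resulting symbol sizes
max(toy_degrees) / min(toy_degrees)
max(toy_cex) / min(toy_cex)
```

The logarithm compresses the range, so highly connected nodes are bigger without swamping the rest, and the + 0.5 keeps nodes of degree 1 visible (log(1) is 0). Note that a node of degree 0 would get log(0) = -Inf, so this formula assumes that every node has at least one edge.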

Step 4: Node ordering

We are very close to our objective, but we still need the right ordering for the nodes. One option to get the node ordering is to use the package dplyr (by Hadley Wickham):

# install 'dplyr' only if you haven't installed it yet
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")

# load 'dplyr'
library(dplyr)

The idea is to create a data frame with the following variables: vgroups, degrees, vlabels, and a numeric index for the nodes, ind.

# data frame with node attributes
x = data.frame(vgroups, degrees, vlabels, ind=1:vcount(mis_graph))

# take a peek at the data frame
head(x)

We will arrange the data frame in descending order, first by vgroups and then by degrees; what we want is the sorted ind:

# arrange by groups and degree
y = arrange(x, desc(vgroups), desc(degrees))

# what does 'y' look like?
head(y)

# get 'ind' ordering
new_ord = y$ind
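
If you prefer not to depend on dplyr, the same kind of ordering can be sketched with the base R function order(). The data frame below is made up for illustration (five nodes, not the real Les Miserables values), and the name toy_ord is hypothetical.

```r
# Made-up node attributes, NOT the real Les Miserables values
toy <- data.frame(
  vgroups = c(1, 2, 2, 1, 3),
  degrees = c(5, 3, 7, 2, 1),
  ind     = 1:5
)

# Descending by group, then descending by degree within each group;
# negating a numeric column turns an ascending sort into a descending one
toy_ord <- toy$ind[order(-toy$vgroups, -toy$degrees)]
toy_ord
```

Here node 5 comes first because it belongs to the highest group, followed by the two nodes of group 2 (highest degree first) and then the two nodes of group 1. The negation trick only works for numeric columns, which is the case for groups and degrees.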

Step 5: Final plot

Now we are ready to produce the desired arc diagram:

# plot
op = par(mar = c(6, 0, 1, 0))
arcplot(edgelist, ordering=new_ord, labels=vlabels, cex.labels=0.8,
        show.nodes=TRUE, col.nodes=vborders, bg.nodes=vfill, 
        cex.nodes = log(degrees)+0.5, pch.nodes=21,
        lwd.nodes = 2, line=0, 
        col.arcs = hsv(0, 0, 0.2, 0.25), lwd.arcs = 1.5 * values)
par(op)


gastonstat/arcdiagram documentation built on April 8, 2022, 5:59 a.m.