library(learnr) library(manynet) library(migraph) library(patchwork) clear_glossary() knitr::opts_chunk$set(echo = FALSE)
Network visualisation is important and non-trivial. As Tufte (1983: 9) said:
"At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers"
All of this is crucial with networks. As a first step, network visualisation -- or graphing -- offers us a way to vet our data for anything strange that might be going on, both revealing and informing our assumptions and intuitions.
It is also crucial to the further communication of the lessons that we have learned through investigation with others. While there may be many dead-ends and time-sinks to visualisation, it is worth taking the time to make sure that your main points are easy to appreciate.
Brandes et al (1999) argue that visualising networks demands thinking about:
There are several main packages for plotting in R, as well as several for plotting networks in R. Plotting in R is typically based around two main approaches:
{ggplot2}
package.^['gg' stands for the Grammar of Graphics.]Approaches to plotting graphs or networks in R can be similarly divided:
{igraph}
and {sna}
,
both build upon the base R graphics engine,{ggnetwork}
and
{ggraph}
build upon a grid approach.^[
Others include: 'Networkly' for creating 2-D and 3-D interactive networks
that can be rendered with plotly and can be easily integrated into
shiny apps or markdown documents;
'visNetwork' interacts with javascript (vis.js) to make interactive networks
(http://datastorm-open.github.io/visNetwork/); and
'networkD3' interacts with javascript (D3) to make interactive networks
(https://www.r-bloggers.com/2016/10/network-visualization-part-6-d3-and-r-networkd3/).
]While the coercion routines available in {manynet}
make it easy to use
any of these packages' graphing capabilities,
{manynet}
itself builds upon the ggplot2/ggraph engine
to help quickly and informatively graph networks.
On her excellent and helpful website, Katya Ognyanova outlines some key dimensions of control that network researchers have to play with:
In the following sections,
we will learn how {manynet}
provides sensible defaults on many of these elements,
and offers ways to extend or customise them.
To get a basic visualisation of the network before adding various specifications,
the graphr()
function in {manynet}
is a quick and easy way
to obtain a clear first look of the network
for preliminary investigations and understanding of the network. Let's quickly visualise one of the ison_
datasets included in the package.
graphr(fict_lotr)
We can also specify the colours, groups, shapes, and sizes of nodes and edges in
the graphr()
function using the following parameters:
node_colour
node_shape
node_size
node_group
edge_color
edge_size
graphr(fict_lotr, node_color = "Race")
graphr()
is not the only graphing function included in {manynet}
.
To graph sets of networks together, graphs()
makes sure that two
or more networks are plotted together.
This might be a set of ego networks, subgraphs, or waves of a longitudinal network.
graphs(to_subgraphs(fict_lotr, "Race"), waves = c(1,2,3,4))
grapht()
is another option, rendering network changes as a gif.
fict_lotr %>% mutate_ties(year = sample(1:12, 66, replace = TRUE)) %>% to_waves(attribute = "year", cumulative = TRUE) %>% grapht()
Append ggtitle()
to add a title.
{manynet}
works well with both {ggplot2}
and {ggraph}
functions
that can be appended to create more tailored visualisations
of the network.
graphr(ison_adolescents) + ggtitle("Visualisation")
{manynet}
also uses the {patchwork}
package for arranging
graphs together, e.g. side-by-side or above one another.
The syntax is quite straight forward and is used throughout these vignettes.
graphr(ison_adolescents) + graphr(ison_algebra) graphr(ison_adolescents) / graphr(ison_algebra)
While {manynet}
attempts to provide legends where necessary,
in some cases the legends offer insufficient detail,
such as in the following figure, or are absent.
fict_lotr %>% mutate(maxbet = node_is_max(node_betweenness(fict_lotr))) %>% graphr(node_color = "maxbet")
{manynet}
supports the {ggplot2}
way of adding legends
after the main plot has been constructed,
using guides()
to add in the legends,
and labs()
for giving those legends particular titles.
Note that we can use "\n"
within the legend title to make the title
span multiple lines.
fict_lotr %>% mutate(maxbet = node_is_max(node_betweenness(fict_lotr))) %>% graphr(node_color = "maxbet") + guides(color = "legend") + labs(color = "Maximum\nBetweenness")
An alternative to colors and legends is to use the 'node_group' argument to highlight groups in a network. This works best for quite clustered distributions of attributes.
graphr(fict_lotr, node_group = "Race")
The aim of graph layouts is to position nodes in a (usually) two-dimensional space to maximise some analytic and aesthetically pleasing function. Quality measures might include:
A range of graph layouts are available across the {igraph}
, {graphlayouts}
, and {manynet}
packages
that can be used together with graphr()
.
Force-directed layouts updates some initial placement of vertices through the operation of some system of metaphorically-physical forces. These might include attractive and repulsive forces.
(graphr(ison_southern_women, layout = "kk") + ggtitle("Kamada-Kawai") | graphr(ison_southern_women, layout = "fr") + ggtitle("Fruchterman-Reingold") | graphr(ison_southern_women, layout = "stress") + ggtitle("Stress Minimisation"))
The Kamada-Kawai (KK) method inserts a spring between all pairs of vertices that is the length of the graph distance between them. This means that edges with a large weight will be longer. KK offers a good layout for lattice-like networks, because it will try to space the network out evenly.
The Fruchterman-Reingold (FR) method uses an attractive force between directly connected vertices, and a repulsive force between all vertex pairs. The attractive force is proportional to the edge's weight, thus edges with a large weight will be shorter. FR offers a good baseline for most types of networks.
The Stress Minimisation (stress) method is related to the KK algorithm,
but offers better runtime, quality, and stability and so is generally preferred.
Indeed, {manynet}
uses it as the default for most networks.
It has the advantage of returning the same layout each time it is run on the same network.
question("Can we interpret the distance between nodes in force-directed layouts conclusively?", answer("No", correct = TRUE, message = "That's right, they are illustrative and not to be used for hard conclusions."), answer("Yes"), allow_retry = TRUE )
Other force-directed layouts available include:
"dh"
"gem"
"graphopt"
"drl"
Layered layouts arrange nodes into horizontal (or vertical) layers, positioning them so that they reduce crossings. These layouts are best suited for directed acyclic graphs or similar.
graphr(ison_southern_women, layout = "bipartite") + ggtitle("Bipartite") graphr(ison_southern_women, layout = "hierarchy") + ggtitle("Hierarchy") graphr(ison_southern_women, layout = "railway") + ggtitle("Railway")
Note that "hierarchy"
and "railway"
use a different algorithm to
{igraph}
's "bipartite"
, and generally performs better,
especially where there are multiple layers.
Whereas "hierarchy"
tries to position nodes to minimise overlaps,
"railway"
sequences the nodes in each layer to a grid
so that nodes are matched as far as possible.
If you want to flip the horizontal and vertical,
you could flip the coordinates, or use something like the following layout.
graphr(ison_southern_women, layout = "alluvial") + ggtitle("Alluvial")
Other layered layouts include:
"tree"
Circular layouts arrange nodes around (potentially concentric) circles, such that crossings are minimised and adjacent nodes are located close together. In some cases, location or layer can be specified by attribute or mode.
graphr(ison_southern_women, layout = "concentric") + ggtitle("Concentric")
Other such layouts include:
"circle"
"sphere"
"star"
"linear"
Spectral layouts arrange nodes according to the eigenvalues of the Laplacian matrix of a graph. These layouts tend to exaggerate clustering of like-nodes and the separation of less similar nodes in two-dimensional space.
graphr(ison_southern_women, layout = "eigen") + ggtitle("Eigenvector")
Somewhat similar are multidimensional scaling techniques, that visualise the similarity between nodes in terms of their proximity in a two-dimensional (or more) space.
graphr(ison_southern_women, layout = "mds") + ggtitle("Multidimensional Scaling")
Other such layouts include:
"pmds"
question("Can we interpret the distance between nodes in spectral layouts?", answer("No"), answer("Yes", correct = TRUE, message = "That's right, though it is not always easy to do so..."), allow_retry = TRUE )
Grid layouts arrange nodes based on some Cartesian coordinates. These can be useful for making sure all nodes' labels are visible, but horizontal and vertical lines can overlap, making it difficult to distinguish whether some nodes are tied or not.
graphr(ison_southern_women, layout = "grid") + ggtitle("Grid")
Other grid layouts include:
By default, graphr()
will use a color palette that offers fairly good contrast
and, since v1.0.0 of {manynet}
, better accessibility.
However, a different hue might offer a better aesthetic or
identifiability for some nodes.
Because the graphr()
function is based on
the grammar of graphics,
it's easy to extend or alter aesthetic aspects.
Here let's try and change the colors
assigned to the different races in the fict_lotr
dataset.
graphr(fict_lotr, node_color = "Race") graphr(fict_lotr, node_color = "Race") + ggplot2::scale_colour_hue()
Other times color may not be desired.
Some publications require grayscale images.
To use a grayscale color palette,
replace _hue
from above with _grey
(note the 'e' spelling):
graphr(fict_lotr, node_color = "Race") + ggplot2::scale_colour_grey()
As you can see, grayscale is more effective for continuous variables or very few discrete variables than the number used here.
Or we may want to choose particular colors for each category.
This is pretty straightforward to do with ggplot2::scale_colour_manual()
.
Some common color names are available,
but otherwise hex color codes can be used for more specific colors.
Unspecified categories are coloured (dark) grey.
graphr(fict_lotr, node_color = "Race") + ggplot2::scale_colour_manual( values = c("Dwarf" = "red", "Hobbit" = "orange", "Maiar" = "#DEC20B", "Human" = "lightblue", "Elf" = "lightgreen", "Ent" = "darkgreen")) + labs(color = "Color")
Perhaps you are preparing a presentation,
representing your institution, department, or research centre at home or abroad.
In this case, you may wish to theme the whole network with institutional colors
and fonts.
Here we demonstrate one of the color scales available in {manynet}
,
the colors of the Sustainable Development Goals:
graphr(fict_lotr, node_color = "Race") + scale_color_sdgs()
More institutional scales and themes are available, and more can be implemented upon pull request.
For more flexibility with visualizations,
{manynet}
users are encouraged to use the excellent {ggraph}
package.
{ggraph}
is built upon the venerable {ggplot2}
package
and works with tbl_graph
and igraph
objects.
As with {ggplot2}
, {ggraph}
users are expected to build
a particular plot from the ground up,
adding explicit layers to visualise the nodes and edges.
library(ggraph) ggraph(fict_greys, layout = "fr") + geom_edge_link(edge_colour = "dark grey", arrow = arrow(angle = 45, length = unit(2, "mm"), type = "closed"), end_cap = circle(3, "mm")) + geom_node_point(size = 2.5, shape = 19, colour = "blue") + geom_node_text(aes(label=name), family = "serif", size = 2.5) + scale_edge_width(range = c(0.3,1.5)) + theme_graph() + theme(legend.position = "none")
As we can see in the code above, we can specify various aspects of the plot to tailor it to our network.
First, we can alter the layout of the network using the layout =
argument
to create a clearer visualisation of the ties between nodes.
This is especially important for larger networks, where nodes and ties are more
easily obscured or misrepresented.
In {ggraph}
, the default layout is the "stress" layout.
The "stress" layout is a safe choice because it is deterministic and
fits well with almost any graph, but it is also a good idea to explore and try
out other layouts on your data.
More layouts can be found in the {graphlayouts}
and {igraph}
R packages.
To use a layout from the {igraph}
package, enter only the last part of the layout
algorithm name (eg. layout = "mds"
for "layout_with_mds").
Second, using geom_node_point()
which draws the nodes as geometric shapes
(circles, squares, or triangles), we can specify the presentation of nodes
in the network in terms of their shape (shape=
, choose from 1 to 21),
size (size=
), or colour (colour=
). We can also use aes()
to match to
node attributes. To add labels, use geom_node_text()
or
geom_node_label()
(draws labels within a box). The font (family=
),
font size (size=
), and colour (colour=
) of the labels can be specified.
Third, we can also specify the presentation of edges in the network.
To draw edges, we use geom_edge_link0()
or geom_edge_link()
.
Using the latter function makes it possible to draw a straight line with a
gradient.
The following features can be tailored either globally or matched to specific
edge attributes using aes()
:
colour: edge_colour=
width: edge_width=
linetype: edge_linetype=
opacity: edge_alpha=
For directed graphs, arrows can be drawn using the arrow=
argument
and the arrow()
function from {ggplot2}
.
The angle, length, arrowhead type,
and padding between the arrowhead and the node can also be specified.
To change the position of the legend, add the theme()
function from {ggplot2}
.
The legend can be positioned at the top, bottom, left, or right,
or removed using "none".
For more see David Schoch's excellent resources on this.
We can print the plots we have made to PDF by point-and-click by selecting 'Save as PDF...' from under the 'Export' dropdown menu in the plots panel tab of RStudio.
If you want to do this programmatically, say because you want to record how you have saved it so that you can e.g. make some changes to the parameters at some point, this is also not too difficult.
After running the (gg-based) plot you want to save,
use the command ggsave("my_filename.pdf")
to save your plot
as a PDF to your working directory.
If you want to save it somewhere else, you will need to specify the file path
(or change the working directory, but that might be more cumbersome).
If you want to save it as a different filetype,
replace .pdf
with e.g. .png
or .jpeg
.
See ?ggsave
for more.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.