Centrality
In manynet: Many Ways to Make, Modify, Map, Mark, and Measure Myriad Networks

library(learnr)
library(manynet)
library(patchwork)
knitr::opts_chunk$set(echo = FALSE)
ison_brandes2 <- ison_brandes %>% rename(type = twomode_type)

Today's target

The aim of this tutorial is to show how we can measure and map degree, betweenness, closeness, eigenvector, and other types of centrality, explore their distributions and calculate the corresponding centralisation.

Setting up

For this exercise, we'll use the ison_brandes dataset in {manynet}. This dataset is in a 'tidygraph' format, but manynet makes it easy to coerce this into other forms to be compatible with other packages. We can create a two-mode version of the dataset by renaming the nodal attribute "twomode_type" to just "type". Let's begin by graphing these datasets using graphr().

# Let's graph the one-mode version
graphr(____)

# Now, let's create a two-mode version 'ison_brandes2' and graph it.
ison_brandes2 <- ison_brandes %>% rename(type = twomode_type)
graphr(____)

# plot the one-mode version
graphr(ison_brandes)
ison_brandes2 <- ison_brandes %>% rename(type = twomode_type)
# plot the two-mode version
graphr(ison_brandes2, layout = "bipartite")

The network is anonymous, but I think it would be nice to add some names, even if it's just pretend. Luckily, {manynet} has a function for this: to_named(). This makes plotting the network just a wee bit more accessible and interpretable:

ison_brandes <- to_named(ison_brandes)

# Now, let's graph using the object names: "ison_brandes"
graphr(____)

ison_brandes <- to_named(ison_brandes)
# plot network with names
graphr(ison_brandes)

Note that you will likely get a different set of names (though still alphabetical), as they are assigned randomly from a pool of (American) first names.

Degree centrality

Let's start with calculating degree. Remember that degree centrality is just the number of incident edges/ties to each node. It is therefore easy to calculate yourself. Just sum the rows or columns of the matrix!

# We can calculate degree centrality like this:
(mat <- as_matrix(ison_brandes))
(degrees <- rowSums(mat))
rowSums(mat) == colSums(mat)

# Or by using a built in command in manynet like this:
node_degree(ison_brandes, normalized = FALSE)

# manually calculate degree centrality
mat <- as_matrix(ison_brandes)
degrees <- rowSums(mat)
rowSums(mat) == colSums(mat)
# You can also just use a built in command in manynet though:
node_degree(ison_brandes, normalized = FALSE)

question("Are the row sums the same as the column sums?",
  answer("Yes",
         correct = TRUE,
         message = "That's right, that's because this is an undirected network."),
  answer("No"),
  allow_retry = FALSE
)

question("In what ways are higher degree nodes more 'central'?",
         answer("They have more power than other nodes",
                message = "Not necessarily"),
         answer("They are more active than other nodes",
                correct = TRUE, message = learnr::random_praise()),
         answer("They would be located in the centre of the graph",
                message = "Not necessarily"),
         answer("They would be able to control information flow in the network",
                message = "Not necessarily"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

Degree distributions

Often we are interested in the distribution of (degree) centrality in a network. {manynet} offers a way to get a pretty good first look at this distribution, though there are more elaborate ways to do this in base and grid graphics.

# distribution of degree centrality scores of nodes
plot(node_degree(ison_brandes))

What's plotted here by default is both the degree distribution as a histogram, as well as a density plot overlaid on it in red.

question("What kind of shape does this degree distribution have?",
         answer("Exponential"),
         answer("Hard to tell",
                correct = TRUE, message = learnr::random_praise()),
         answer("Uniform"),
         answer("Normal"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

Other centralities

Other measures of centrality can be a little trickier to calculate by hand. Fortunately, we can use functions from {manynet} to help calculate the betweenness, closeness, and eigenvector centralities for each node in the network. Let's collect the vectors of these centralities for the ison_brandes dataset:

# Use the node_betweenness() function to calculate the
# betweenness centralities of nodes in a network
node_betweenness(ison_brandes)

# Use the node_closeness() function to calculate the 
# closeness centrality of nodes in a network
node_closeness(ison_brandes)

# Use the node_eigenvector() function to calculate 
# the eigenvector centrality of nodes in a network
node_eigenvector(ison_brandes)

node_betweenness(ison_brandes)
node_closeness(ison_brandes)
node_eigenvector(ison_brandes)

What is returned here are vectors of betweenness, closeness, and eigenvector scores for the nodes in the network. But what do they mean?

question("Why does a node with the smallest sum of geodesic distances to all other nodes be 'central'?",
         answer("They have more power than other nodes",
                message = "Not necessarily"),
         answer("They can distribute a message to all other nodes in the network most quickly",
                correct = TRUE, message = learnr::random_praise()),
         answer("They would be located in the centre of the graph",
                message = "Not necessarily"),
         answer("They would be able to control information flow in the network",
                message = "Not necessarily"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

question("Why would a node lying 'between' many other nodes be 'central'?",
         answer("They have more power than other nodes",
                message = "Not necessarily"),
         answer("They would be able to control information flow in the network",
                correct = TRUE, message = learnr::random_praise()),
         answer("They would be located in the centre of the graph",
                message = "Not necessarily"),
         answer("They can distribute a message to all other nodes in the network most quickly",
                message = "Not necessarily"),
         allow_retry = TRUE,
         random_answer_order = TRUE)

Try to answer the following questions for yourself:

what does Bonacich mean when he says that power and influence are not the same thing?
can you think of a real-world example when an actor might be central but not powerful, or powerful but not central?

Note that all centrality measures in {manynet} return normalized scores by default -- for the raw scores, include normalized = FALSE in the function as an extra argument.

Now, can you create degree distributions for each of these?

plot(node_betweenness(ison_brandes))
plot(node_closeness(ison_brandes))
plot(node_eigenvector(ison_brandes))

Plotting centrality

It is straightforward in {manynet} to highlight nodes and ties with maximum or minimum (e.g. degree) scores. If the vector is numeric (i.e. a "measure"), then this can be easily converted into a logical vector that identifies the node/tie with the maximum/minimum score using e.g. node_is_max() or tie_is_min(). By passing this attribute to the graphr() argument "node_color" we can highlight which node or nodes hold the maximum score in red.

# plot the network, highlighting the node with the highest centrality score with a different colour
ison_brandes %>%
  mutate_nodes(color = node_is_max(node_degree())) %>%
  graphr(node_color = "color")

ison_brandes %>%
  mutate_nodes(color = node_is_max(node_betweenness())) %>%
  graphr(node_color = "color")

ison_brandes %>%
  mutate_nodes(color = node_is_max(node_closeness())) %>%
  graphr(node_color = "color")

ison_brandes %>%
  mutate_nodes(color = node_is_max(node_eigenvector())) %>%
  graphr(node_color = "color")

How neat! Try it with the two-mode version. What can you see?

# Instead of "ison_brandes", use "ison_brandes2"

ison_brandes2 %>%
  add_node_attribute("color", node_is_max(node_degree(ison_brandes2))) %>%
  graphr(node_color = "color", layout = "bipartite")

ison_brandes2 %>%
  add_node_attribute("color", node_is_max(node_betweenness(ison_brandes2))) %>%
  graphr(node_color = "color", layout = "bipartite")

ison_brandes2 %>%
  add_node_attribute("color", node_is_max(node_closeness(ison_brandes2))) %>%
  graphr(node_color = "color", layout = "bipartite")

ison_brandes2 %>%
  add_node_attribute("color", node_is_max(node_eigenvector(ison_brandes2))) %>%
  graphr(node_color = "color", layout = "bipartite")

question("Select all that are true for the two-mode Brandes network.",
         answer("Only one node is selected in each plot."),
         answer("The maximum degree square has a higher degree than the maximum degree circle(s).",
                correct = TRUE),
         answer("No node is ever the most central according to two or more different centrality measures."),
         allow_retry = TRUE,
         random_answer_order = TRUE)

Centralization

{manynet} also implements network centralization functions. Here we are no longer interested in the level of the node, but in the level of the whole network, so the syntax replaces node_ with net_:

net_degree(ison_brandes)
net_betweenness(ison_brandes)
net_closeness(ison_brandes)
print(net_eigenvector(ison_brandes), digits = 5)

By default, scores are printed up to 3 decimal places, but this can be modified and, in any case, the unrounded values are retained internally. This means that even if rounded values are printed (to respect console space), as much precision as is available is used in further calculations.

Note that for centralization in two-mode networks, two values are given (as a named vector), since normalization typically depends on the (asymmetric) number of nodes in each mode.

What if we want to have a single image/figure with multiple plots? This can be a little tricky with gg-based plots, but fortunately the {patchwork} package is here to help.

ison_brandes <- ison_brandes %>%
  add_node_attribute("degree", node_is_max(node_degree(ison_brandes))) %>%
  add_node_attribute("betweenness", node_is_max(node_betweenness(ison_brandes))) %>%
  add_node_attribute("closeness", node_is_max(node_closeness(ison_brandes))) %>%
  add_node_attribute("eigenvector", node_is_max(node_eigenvector(ison_brandes)))
gd <- graphr(ison_brandes, node_color = "degree") + 
  ggtitle("Degree", subtitle = round(net_degree(ison_brandes), 2))
gc <- graphr(ison_brandes, node_color = "closeness") + 
  ggtitle("Closeness", subtitle = round(net_closeness(ison_brandes), 2))
gb <- graphr(ison_brandes, node_color = "betweenness") + 
  ggtitle("Betweenness", subtitle = round(net_betweenness(ison_brandes), 2))
ge <- graphr(ison_brandes, node_color = "eigenvector") + 
  ggtitle("Eigenvector", subtitle = round(net_eigenvector(ison_brandes), 2))
(gd | gb) / (gc | ge)
# ggsave("brandes-centralities.pdf")

question("How centralized is the ison_brandes network? Select all that apply.",
         answer("It is more degree centralised than betweenness centralised.",
                message = "Degree centralisation is at 0.18 for this network whereas betweenness centralisation is at 0.32. In other words, the network is better characterised as having 1 or 2 nodes lying on the shortest paths between others than one where 1 or 2 nodes have many more ties than the others."),
         answer("It is more closeness centralised than betweenness centralised.",
                message = "Closeness centralisation is at 0.23 for this network whereas betweenness centralisation is at 0.32. In other words, the network is better characterised as having 1 or 2 nodes lying on the shortest paths between others than one where 1 or 2 nodes can reach or access most other nodes."),
         answer("It is more eigenvector centralised than betweenness centralised.",
                correct = TRUE,
                message = "That's right, eigenvector centralisation is at 0.48 for this network whereas betweenness centralisation is at 0.32. In other words, the network is better characterised as having a core (or cores) of well-connected nodes rather than a wide network with only 1 or 2 nodes lying on the shortest paths between others."),
         random_answer_order = TRUE, 
         allow_retry = TRUE)

question("What is the difference between centrality and centralisation according to the literature?",
  answer("Centrality is for nodes and centralisation is for networks",
         correct = TRUE),
  answer("Centrality is a state and centralisation is a process"),
  answer("Centrality is a ity and centralisation is a sation"),
  answer("Centrality is to centralisation what polarity is to polarisation"),
  allow_retry = FALSE,
  random_answer_order = TRUE
)

Free play

Choose another dataset included in {manynet}. Name a plausible research question you could ask of the dataset relating to each of the four main centrality measures (degree, betweenness, closeness, eigenvector). Plot the network with nodes sized by each centrality measure, using titles or subtitles to record the question and/or centralization measure.