library(learnr)
library(patchwork)
library(manynet)
knitr::opts_chunk$set(echo = FALSE)
clear_glossary()
friends <- to_uniplex(ison_algebra, "friends")
social <- to_uniplex(ison_algebra, "social")
tasks <- to_uniplex(ison_algebra, "tasks")
In the last tutorial, we looked at network centrality and centralisation. Nodal centrality is often thought to be an expression of structural inequality. But nodes don't always seek to differentiate themselves in a network. Sometimes they are interested in being part of a group. In this tutorial, we're going to consider various ways in which 'groupness' might be examined.
The data we're going to use here, "ison_algebra", is included in the {manynet}
package.
Do you remember how to call the data?
Can you find out some more information about it?
# Let's call and load the 'ison_algebra' dataset
data("ison_algebra", package = "manynet")
# Or you can retrieve it like this:
ison_algebra <- manynet::ison_algebra
# If you want to learn more about the 'ison_algebra' dataset,
# use the following function:
?manynet::ison_algebra
data("ison_algebra", package = "manynet") ?manynet::ison_algebra # If you want to see the network object, you can run the name of the object ison_algebra # or print the code with brackets at the front and end of the code (ison_algebra <- manynet::ison_algebra)
We can see after printing the object that the dataset is multiplex, meaning that it contains several different types of ties: friendship (friends), social (social) and task interactions (tasks).
The network is also anonymous, but I think it would be nice to add some names,
even if it's just pretend.
Luckily, {manynet} has a function for this, to_named().
This makes plotting the network just a wee bit more accessible and interpretable.
Let's try adding names and graphing the network now:
ison_algebra <- to_named(ison_algebra)
graphr(ison_algebra)
ison_algebra <- to_named(ison_algebra)
graphr(ison_algebra)
Note that you will likely get a different set of names, as they are assigned randomly from a pool of (American) first names.
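If you want the names (and hence your plots) to be reproducible, you can fix R's random seed before calling to_named(). A minimal sketch, assuming to_named() draws its names using R's random number generator:

# fixing the seed first should make the random name assignment reproducible
# (assuming to_named() uses R's RNG)
set.seed(123)
ison_algebra <- to_named(manynet::ison_algebra)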
As a multiplex network, ison_algebra actually contains three different types of ties (friends, social, and tasks).
We can extract them and graph them separately using to_uniplex():
# to_uniplex() extracts ties of a single type,
# focusing on the 'friends' tie attribute here
friends <- to_uniplex(ison_algebra, "friends")
gfriend <- graphr(friends) + ggtitle("Friendship")
# now let's focus on the 'social' tie attribute
social <- to_uniplex(ison_algebra, "social")
gsocial <- graphr(social) + ggtitle("Social")
# and the 'tasks' tie attribute
tasks <- to_uniplex(ison_algebra, "tasks")
gtask <- graphr(tasks) + ggtitle("Task")
# now, let's compare each attribute's graph, side-by-side
gfriend + gsocial + gtask
# if you get an error here, you may need to install and load
# the package 'patchwork'.
# It's highly recommended for assembling multiple plots together.
# Otherwise you can just plot them separately on different lines.
friends <- to_uniplex(ison_algebra, "friends")
gfriend <- graphr(friends) + ggtitle("Friendship")
social <- to_uniplex(ison_algebra, "social")
gsocial <- graphr(social) + ggtitle("Social")
tasks <- to_uniplex(ison_algebra, "tasks")
gtask <- graphr(tasks) + ggtitle("Task")
# We now have three separate networks depicting each type of tie
# from the ison_algebra network:
gfriend + gsocial + gtask
Note also that these are weighted networks.
graphr() automatically recognises these different weights and plots them. Where useful (in less dense directed networks), graphr() also bends reciprocated arcs.
What (else) can we say about these three networks?
Let's concentrate on the task network for now and calculate a few basic measures of cohesion: density, reciprocity, transitivity, and components.
Density represents a generalised measure of cohesion, characterising how cohesive the network is in terms of how many potential ties (i.e. dyads) are actualised. Recall that there are different equations depending on the type of network. Below are three equations:
$$A: \frac{|T|}{|N|(|N|-1)}$$ $$B: \frac{2|T|}{|N|(|N|-1)}$$ $$C: \frac{|T|}{|N||M|}$$
where $|T|$ is the number of ties in the network, and $|N|$ and $|M|$ are the number of nodes in the first and second mode respectively.
question("Which equation is used for measuring density for a directed network:", answer("A", correct = TRUE, message = '<img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExb2M5cTNmZm1kdmszdmR6Yzc3MnczcWxvb3MzdHRic2N0ZncwYml2MSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/YcMs3OGd89Pxu/giphy.gif" alt="gif of a thumbs up"/>' ), answer("B", message = "This is the equation for an undirected network."), answer("C", message = "This is the equation for a two-mode network."), allow_retry = TRUE )
One can calculate the density of the network from the number of nodes and the number of ties, using the functions net_nodes() and net_ties() respectively:
# calculating network density manually according to equation A above
net_ties(tasks)/(net_nodes(tasks)*(net_nodes(tasks)-1))
but we can also just use the {manynet}
function for calculating the density,
which always uses the equation appropriate for the type of network...
net_density(tasks)
Note that the various measures in {manynet} print results to three decimal places by default, but the underlying result retains full precision. So it is the same result as our manual calculation...
question("Is this network's density high or low in absolute terms?", answer("High", message = "The closer the value is to 1, the more dense the network and the more cohesive the network is as a whole."), answer("Low", correct = TRUE, message = 'The closer the value is to 0, the sparser the network and the less cohesive the network is as a whole. But this is still quite typical density for a relatively small, social network like this one. <br><br> <img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExOWs4Y292OXg5YmlvNTF3OWp4YWxmNjNpdm11c2kyMm55Z2U4eTJkeiZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/6agGMtS4rEVSo/giphy.gif" alt="gif of i knew it"/>'), allow_retry = TRUE )
Density offers an important baseline measure for characterising the network as a whole.
In this section we're going to move from generalised measures of cohesion,
like density, to more localised measures of cohesion.
These are not only measures of cohesion though,
but are also often associated with certain mechanisms of closure.
Closure involves ties being more likely because other ties are present.
There are two common examples of this in the literature:
gloss("reciprocity")
, where a directed tie is often likely to prompt a reciprocating tie,
and gloss("transitivity")
, where a directed two-path is likely to be shortened
by an additional arc connecting the first and third nodes on that path.
First, let's calculate reciprocity in the task network.
While one could do this by hand, it's more efficient to use the {manynet} package.
Can you guess the correct name of the function?
net_reciprocity(tasks) # this function calculates the amount of reciprocity in the whole network
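To see roughly what this function is doing, we can compute the same quantity by hand from the adjacency matrix. This is only a sketch, assuming the default measure is the share of arcs whose reverse arc is also present; since tasks is weighted, we binarise it first:

# binarise the weighted task network, then take the share of arcs
# whose reverse arc is also present
A <- as_matrix(to_unweighted(tasks))
sum(A * t(A)) / sum(A)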
Wow, this seems quite high based on what we observed visually!
But if we look closer, this makes sense.
We can use tie_is_reciprocated() to identify which ties are reciprocated and which are not.
tasks %>%
  mutate_ties(rec = tie_is_reciprocated(tasks)) %>%
  graphr(edge_color = "rec")
net_indegree(tasks)
So we can see that indeed there are very few asymmetric ties, and yet node 16 is both the sender and receiver of most of the task activity. So our reciprocity measure has taught us something about this network that might not have been obvious visually.
And let's calculate transitivity in the task network.
Again, can you guess the correct name of this function?
net_transitivity(tasks) # this function calculates the amount of transitivity in the whole network
question("What can we say about task closure in this network? Choose all that apply.", answer("Transitivity for the task network is 0.568", correct = TRUE, message = '<img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExZXBpNDlsZzNraTJ1bHN6MW1scjA4aDVzdXo1NW11bGI5bWF0dzYxMyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/Hg7gPa05K6bAc/giphy.gif" alt="gif of cool cool cool"/>'), answer("Transitivity for the task network is -0.568", message = "Transivitity must be between 0 and 1."), answer("Transitivity is quite low in this network", message = "Transitivity is often around 0.3 in most social networks."), answer("Transitivity is quite high in this network", correct = TRUE), answer("Transitivity is likely higher in the task network than the friendship network", correct = TRUE), random_answer_order = TRUE, allow_retry = TRUE )
The next dataset, 'ison_southern_women', is also available in {manynet}.
Let's load and graph the data.
# let's load the data and analyze it
data("ison_southern_women")
ison_southern_women
graphr(ison_southern_women, node_color = "type")
graphr(ison_southern_women, "railway", node_color = "type")
data("ison_southern_women") ison_southern_women graphr(ison_southern_women, node_color = "type")
Now what if we are only interested in one part of the network? For that, we can obtain a 'projection' of the two-mode network. There are two ways of doing this. The hard way...
twomode_matrix <- as_matrix(ison_southern_women)
women_matrix <- twomode_matrix %*% t(twomode_matrix)
event_matrix <- t(twomode_matrix) %*% twomode_matrix
Or the easy way:
# women-graph
# to_mode1(): Results in a weighted one-mode object that retains the row nodes from
# a two-mode object, and weights the ties between them on the basis of their joint
# ties to nodes in the second mode (columns)
women_graph <- to_mode1(ison_southern_women)
graphr(women_graph)
# note that projection `to_mode1` involves keeping one type of nodes;
# this is different from to_uniplex above, which keeps one type of ties in the network

# event-graph
# to_mode2(): Results in a weighted one-mode object that retains the column nodes from
# a two-mode object, and weights the ties between them on the basis of their joint ties
# to nodes in the first mode (rows)
event_graph <- to_mode2(ison_southern_women)
graphr(event_graph)
{manynet}
also includes several other options for how to construct the projection.
The default ("count") might be interpreted as indicating the degree or amount of
shared ties to the other mode.
"jaccard" divides this count by the number of nodes in the other mode
to which either of the nodes are tied.
It can thus be interpreted as opportunity weighted by participation.
"rand" instead counts both shared ties and shared absences,
and can thus be interpreted as the degree of behavioural mirroring between the
nodes.
Lastly, "pearson" (Pearson's coefficient) and "yule" (Yule's Q) produce
correlations in ties for valued and binary data respectively.
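To make the "jaccard" option concrete, here is a hand calculation for a single pair of row nodes, worked from the incidence matrix. A sketch only; the choice of rows 1 and 2 is arbitrary:

# incidence matrix: women in rows, events in columns
X <- as_matrix(ison_southern_women)
a <- X[1, ] > 0  # events attended by the first woman
b <- X[2, ] > 0  # events attended by the second woman
sum(a & b) / sum(a | b)  # shared events over events attended by either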
to_mode2(ison_southern_women, similarity = "jaccard")
to_mode2(ison_southern_women, similarity = "rand")
to_mode2(ison_southern_women, similarity = "pearson")
to_mode2(ison_southern_women, similarity = "yule")
Let's return to the question of closure.
First try one of the closure measures we have already treated that gives us
a sense of shared partners for one-mode networks.
Then compare this with net_equivalency(), which can be used on the original two-mode network.
# net_transitivity(): Calculate transitivity in a network
net_transitivity(women_graph)
net_transitivity(event_graph)
# net_equivalency(): Calculate equivalence or reinforcement in a (usually two-mode) network
net_equivalency(ison_southern_women)
net_transitivity(women_graph)
net_transitivity(event_graph)
net_equivalency(ison_southern_women)
question("What do we learn from this? Choose all that apply.", answer("Transitivity in the women projection is very high.", correct = TRUE), answer("Transitivity in the event projection is very high.", correct = TRUE), answer("Equivalence for the two-mode Southern Women dataset is moderate.", correct = TRUE, message = '<img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExZmdqZHFraGR5Ym9udTBrYzA1ejhwd3ljOXdma3ppNWp2cXozMTh0bCZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/xqjZLKH1iTBe0/giphy.gif" alt="gif of you just wrinkled my brain"/>'), answer("Transitivity will often be very high in projected networks.", correct = TRUE), answer("Projection obscures which women are members of which events and vice versa.", correct = TRUE), # random_answer_order = TRUE, allow_retry = TRUE )
Try to explain in no more than a paragraph why projection can lead to misleading transitivity measures and what some consequences of this might be.
Now let's look at the friendship network, 'friends'.
We're interested here in how many components there are.
By default, the net_components()
function will
return the number of strong components for directed networks.
For weak components, you will need to first make the network undirected.
Remember the difference between weak and strong components?
question("Weak components...", answer("don't care about tie direction when establishing components.", correct = TRUE), answer("care about tie direction when establishing components."), allow_retry = TRUE )
net_components(friends)
# note that friends is a directed network
# you can see this by calling the object 'friends'
# or by running `is_directed(friends)`
# Now let's look at the number of components when tie direction is removed
# Note: to_undirected() returns an object with all tie direction removed,
# so any pair of nodes with at least one directed edge
# will be connected by an undirected edge in the new network.
net_components(to_undirected(friends))
# note that friends is a directed network
net_components(friends)
net_components(to_undirected(friends))
question("How many components are there?", answer("2", message = "There are more than 2 components."), answer("3", message = "There are 3 _weak_ components.", correct = TRUE), answer("4", message = 'There are 4 _strong_ components. <img src="https://i.giphy.com/media/v1.Y2lkPTc5MGI3NjExMDB5bzhjcGo3NHozZm5jdHcyMWk5emhnajNyZnFxc3Z2cmJnbGN0ZyZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/YpS0oFFC2VJ84/giphy.gif" alt="gif of two thumbs up"/>', correct = TRUE), answer("5", message = "There are fewer than 5 components."), allow_retry = TRUE )
So we know how many components there are,
but maybe we're also interested in which nodes are members of which components?
node_components() returns a membership vector that can be used to color nodes in graphr():
friends <- friends %>%
  mutate(weak_comp = node_components(to_undirected(friends)),
         strong_comp = node_components(friends))
# node_components() returns a vector of nodes' memberships to components in the network
# here, we are adding the nodes' membership to components as an attribute in the network
# alternatively, we can also use the function `add_node_attribute()`,
# e.g. `add_node_attribute(friends, "weak_comp", node_components(to_undirected(friends)))`
graphr(friends, node_color = "weak_comp") + ggtitle("Weak components") +
  graphr(friends, node_color = "strong_comp") + ggtitle("Strong components")
# by using the 'node_color' argument, we are telling graphr() to colour
# the nodes in the graph according to the values of the 'weak_comp' attribute in the network
friends <- friends %>%
  mutate(weak_comp = node_components(to_undirected(friends)),
         strong_comp = node_components(friends))
graphr(friends, node_color = "weak_comp") + ggtitle("Weak components") +
  graphr(friends, node_color = "strong_comp") + ggtitle("Strong components")
question("Why is there a difference between the weak and strong components results?", answer("Because one node has only incoming ties.", correct = TRUE), answer("Because three nodes cannot reach any other nodes.", correct = TRUE), answer("Because there is an extra isolate."), answer("Because the tie strength matters."), random_answer_order = TRUE, allow_retry = TRUE )
Components offer a precise way of understanding groups in a network.
However, they can also ignore some 'groupiness' that is obvious to even a
cursory examination of the graph.
The irps_blogs network concerns the URL links between political blogs in the 2004 election. It is a big network (you can check below), and in our experience it can take a few seconds to plot.
# This is a large network
net_nodes(irps_blogs)
# Let's concentrate on just a sample of 240
# (by deleting a random sample of 1250)
blogs <- delete_nodes(irps_blogs, sample(1:1490, 1250))
graphr(blogs)
But are they all actually linked?
Even among the smaller sample, there seems to be a number of isolates.
We can calculate the number of isolates by simply summing node_is_isolate().
sum(node_is_isolate(blogs))
Since there are many isolates, there will be many components, even if we look at weak components and not just strong components.
net_components(blogs)
net_components(to_undirected(blogs))
So, it looks like most of the (weak) components are due to isolates!
How do we concentrate on the main component of this network?
Well, the main/largest component in a network is called the giant component.
blogs <- blogs %>% to_giant()
sum(node_is_isolate(blogs))
graphr(blogs)
Finally, we have a single 'giant' component to examine. However, now we have a different kind of challenge: everything is one big hairball. And yet, if we think about what we might expect of the structure of a network of political blogs, we might not think it is so undifferentiated. We might hypothesise that, despite the graphical presentation of a hairball, there is actually a reasonable partition of the network into two factions.
To find a partition in a network, we use the node_in_partition() function. All node_in_*() functions return a string vector the length of the number of nodes in the network; it is a string vector because membership is a categorical result. Since the vector has one entry per node, we can assign it to the nodes in the network and graph the network using this result.
blogs %>%
  mutate_nodes(part = node_in_partition()) %>%
  graphr(node_color = "part")
We see from this graph that indeed there seems to be an obvious separation between the left and right 'hemispheres' of the network.
But what is the 'fit' of this assignment of the blog nodes into two partitions? The most common measure of the fit of a community assignment in a network is modularity.
net_modularity(blogs, membership = node_in_partition(blogs))
Remember that modularity ranges between -1 and 1. How can we interpret this result?
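For reference, the standard Newman-Girvan definition of modularity that these functions report is:

$$Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$$

where $m$ is the number of ties, $A_{ij}$ is the adjacency matrix, $k_i$ is node $i$'s degree, and $\delta(c_i, c_j)$ equals 1 when nodes $i$ and $j$ are assigned to the same community and 0 otherwise. Positive values indicate more within-community ties than expected by chance; values near zero (or negative) indicate the partition does no better than chance.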
While the partition algorithm is useful for deriving a partition of the network into the number of factions assigned, it is still an algorithm that tries to maximise modularity. Other times we might instead have an empirically collected grouping, and we are keen to see how 'modular' the network is around this attribute. This only works on categorical attributes, of course, but is otherwise quite flexible.
graphr(blogs, node_color = "Leaning")
net_modularity(blogs, membership = node_attribute(blogs, "Leaning"))
How interesting. Perhaps the partitioning algorithm is not the algorithm that maximises modularity after all... Perhaps we need to look further and see whether there is another solution here that returns an even greater modularity criterion.
Ok, the friendship network has 3-4 components, but how many 'groups' are there? Visually, it looks like there are two denser clusters within the main component.
Today we'll use the 'friends' subgraph for exploring community detection methods. For clarity and simplicity, we will concentrate on the main component (the so-called 'giant' component) and consider friendship undirected.
# to_giant() returns an object that includes only the main component
# without any smaller components or isolates
(friends <- to_giant(friends))
(friends <- to_undirected(friends))
graphr(friends)
Comparing friends
before and after these operations,
you'll notice the number of ties decreases as reciprocated directed ties
are consolidated into single undirected ties,
and the number of nodes decreases as two isolates are removed.
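We can check both claims directly. A quick sketch, recreating the original uniplex network for comparison:

# ties in the original directed friendship network
net_ties(to_uniplex(ison_algebra, "friends"))
# fewer ties after reciprocated arcs are consolidated,
# and fewer nodes after isolates are dropped
net_ties(friends)
net_nodes(friends)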
There is no one single best community detection algorithm.
Instead there are several, each with their strengths and weaknesses.
Since this is a rather small network, we'll focus on the following methods:
walktrap, edge betweenness, and fast greedy.
(Others are included in {manynet}/{igraph}.)
As you use them, consider how they portray communities and consider which one(s)
afford a sensible view of the social world as cohesively organized.
This algorithm detects communities through a series of short random walks, with the idea that nodes encountered on any given random walk are more likely to be within a community than not. It was proposed by Pons and Latapy (2005).
The algorithm initially treats all nodes as communities of their own, then
merges them into larger communities, still larger communities, and so on.
In each step a new community is created from two other communities,
and its ID will be one larger than the largest community ID so far.
This means that before the first merge we have n communities
(the number of vertices in the graph) numbered from zero to n-1.
The first merge creates community n, the second community n+1, etc.
This merge history is returned by the function:
# ?igraph::cluster_walktrap
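If you want to inspect that merge history yourself, you can call {igraph} directly. A sketch, assuming friends coerces cleanly with as_igraph():

# run walktrap via {igraph} and inspect the merge matrix;
# each row records the two communities merged at that step
wt <- igraph::cluster_walktrap(as_igraph(friends), steps = 4)
head(igraph::merges(wt))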
Note the "steps=" argument that specifies the length of the random walks.
While {igraph}
sets this to 4 by default,
which is what is recommended by Pons and Latapy,
Waugh et al (2009) found that for many groups (Congresses),
these lengths did not provide the maximum modularity score.
To be thorough in their attempts to optimize modularity, they ran the walktrap
algorithm 50 times for each group (using random walks of lengths 1–50) and
selected the network partition with the highest modularity value from those 50.
They call this the "maximum modularity partition" and insert the parenthetical
"(though, strictly speaking, this cannot be proven to be the optimum without
computationally-prohibitive exhaustive enumeration (Brandes et al. 2008))."
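A sketch of that strategy using the {manynet} wrapper introduced just below: loop over walk lengths 1 to 50 and keep the partition with the highest modularity.

# try walk lengths 1..50 and record the modularity of each partition
mods <- sapply(1:50, function(s) {
  net_modularity(friends, node_in_walktrap(friends, times = s))
})
which.max(mods)  # the walk length giving the 'maximum modularity partition'
max(mods)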
So let's try and get a community classification using the walktrap algorithm, node_in_walktrap(), with the length of the random walks set to 50.
friend_wt <- node_in_walktrap(friends, times=50)
friend_wt
# note that it prints pretty, but underlying it's just a vector:
c(friend_wt)
# This says that dividing the graph into 2 communities maximises modularity,
# one with the nodes
which(friend_wt == 1)
# and the other
which(friend_wt == 2)
# resulting in a modularity of
net_modularity(friends, friend_wt)
friend_wt <- node_in_walktrap(friends, times = 50)
# results in a modularity of
net_modularity(friends, friend_wt)
We can also visualise the clusters on the original network. How does the following look? Plausible?
# plot 1: groups by node color
friends <- friends %>% mutate(walk_comm = friend_wt)
graphr(friends, node_color = "walk_comm")
# plot 2: groups by borders
# to be fancy, we could even draw the group borders around the nodes
# using the node_group argument
graphr(friends, node_group = "walk_comm")
# plot 3: group and node colors
# or both!
graphr(friends, node_color = "walk_comm", node_group = "walk_comm") +
  ggtitle("Walktrap", subtitle = round(net_modularity(friends, friend_wt), 3))
# the function `round()` rounds the values to a specified number of decimal places
# here, we are telling it to round the net_modularity score to 3 decimal places,
# but the score is exactly 0.27 so only two decimal places are printed.
friends <- friends %>% mutate(walk_comm = friend_wt)
graphr(friends, node_color = "walk_comm")
# to be fancy, we could even draw the group borders around the nodes
# using the node_group argument
graphr(friends, node_group = "walk_comm")
# or both!
graphr(friends, node_color = "walk_comm", node_group = "walk_comm") +
  ggtitle("Walktrap", subtitle = round(net_modularity(friends, friend_wt), 3))
Drawing borders can be helpful for identifying membership when polygons overlap. Or you can use node color and size to indicate other attributes...
Edge betweenness is like betweenness centrality, but for ties rather than nodes. The edge-betweenness score of an edge measures the number of shortest paths from one vertex to another that go through it.
The idea of the edge-betweenness based community structure detection is that it is likely that edges connecting separate clusters have high edge-betweenness, as all the shortest paths from one cluster to another must traverse through them. So if we iteratively remove the edge with the highest edge-betweenness score we will get a hierarchical map (dendrogram) of the communities in the graph.
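Before running the full algorithm, we can inspect the raw scores ourselves. A sketch, assuming tie_betweenness() in {manynet} computes edge betweenness:

# ties bridging separate clusters should have the highest scores
eb_scores <- tie_betweenness(friends)
head(sort(eb_scores, decreasing = TRUE))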
The following works similarly to walktrap, but there is no need to set a step length.
friend_eb <- node_in_betweenness(friends)
friend_eb
How does community membership differ here from that found by walktrap?
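One quick way to check is to cross-tabulate the two membership vectors (a sketch; as above, c() strips the pretty-printing):

# rows: walktrap communities; columns: edge-betweenness communities
table(walktrap = c(friend_wt), betweenness = c(friend_eb))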
We can see how the edge betweenness community detection method works here: http://jfaganuk.github.io/2015/01/24/basic-network-analysis/
To visualise the result:
# add the edge-betweenness membership as a node attribute
friends <- friends %>% mutate(eb_comm = friend_eb)
# create a graph with a title and subtitle returning the modularity score
graphr(friends, node_color = "eb_comm", node_group = "eb_comm") +
  ggtitle("Edge-betweenness",
          subtitle = round(net_modularity(friends, friend_eb), 3))
friends <- friends %>% mutate(eb_comm = friend_eb)
graphr(friends, node_color = "eb_comm", node_group = "eb_comm") +
  ggtitle("Edge-betweenness",
          subtitle = round(net_modularity(friends, friend_eb), 3))
For more on this algorithm, see M Newman and M Girvan: Finding and evaluating community structure in networks, Physical Review E 69, 026113 (2004), https://arxiv.org/abs/cond-mat/0308217.
This algorithm is the Clauset-Newman-Moore algorithm. Whereas edge betweenness was divisive (top-down), the fast greedy algorithm is agglomerative (bottom-up).
At each step, the algorithm seeks a merge that would most increase modularity. This is very fast, but has the disadvantage of being a greedy algorithm, so it might not produce the best overall community partitioning, although I personally find it both useful and in many cases quite "accurate".
friend_fg <- node_in_greedy(friends)
friend_fg
# Does this result in a different community partition?
net_modularity(friends, friend_fg)
# Compare this to the edge betweenness procedure
# Again, we can visualise these communities in different ways:
friends <- friends %>% mutate(fg_comm = friend_fg)
graphr(friends, node_color = "fg_comm", node_group = "fg_comm") +
  ggtitle("Fast-greedy",
          subtitle = round(net_modularity(friends, friend_fg), 3))
friend_fg <- node_in_greedy(friends)
friend_fg
# Does this result in a different community partition?
net_modularity(friends, friend_fg)
# Compare this to the edge betweenness procedure
# Again, we can visualise these communities in different ways:
friends <- friends %>% mutate(fg_comm = friend_fg)
graphr(friends, node_color = "fg_comm", node_group = "fg_comm") +
  ggtitle("Fast-greedy",
          subtitle = round(net_modularity(friends, friend_fg), 3))
See A Clauset, MEJ Newman, C Moore: Finding community structure in very large networks, https://arxiv.org/abs/cond-mat/0408187
question("What is the difference between communities and components?", answer("Communities and components are just different terms for the same thing"), answer("Communities are a stricter form of component"), answer("Components are about paths whereas communities are about the relationship between within-group and between-group ties", correct = TRUE), random_answer_order = TRUE, allow_retry = TRUE)
Lastly, {manynet}
includes a function to run through and find the membership
assignment that maximises modularity across any of the applicable community
detection procedures.
This is helpfully called node_in_community().
This function pays attention to the size and other salient network properties,
such as whether it is directed, weighted, or connected.
It either uses the algorithm that maximises modularity by default (node_in_optimal()) for smaller networks, or surveys the approximations provided by other algorithms for the result that maximises the modularity.
node_in_community(friends)
It is important, though, to name which algorithm produced the membership assignment that is subsequently analysed.
question("Which algorithm provides the 'best' membership assignment here?", answer("Walktrap"), answer("Edge betweenness"), answer("Leiden"), answer("Louvain"), answer("Infomap"), answer("Fluid"), answer("Fast Greedy"), answer("Eigenvector"), answer("Spinglass"), answer("Partition"), answer("Optimal", correct = TRUE), random_answer_order = TRUE, allow_retry = TRUE)
We've looked here at the irps_blogs
dataset.
Now have a go at the irps_books
dataset.
What is the density? Does it make sense to investigate reciprocity,
transitivity, or equivalence? How can we interpret the results?
How many components in the network? Is there a strong factional structure?
Which community detection algorithm returns the highest modularity score,
or corresponds best to what is in the data or what you see in the graph?
irps_books