community: Community partitioning algorithms

community R Documentation

Community partitioning algorithms

Description

These functions offer different algorithms useful for partitioning networks into sets of communities. The different algorithms offer various advantages in terms of computation time, availability on different types of networks, ability to maximise modularity, and their logic or domain of inspiration.

Usage

node_optimal(.data)

node_kernighanlin(.data)

node_edge_betweenness(.data)

node_fast_greedy(.data)

node_leading_eigen(.data)

node_walktrap(.data, times = 50)

node_infomap(.data, times = 50)

node_spinglass(.data, max_k = 200, resolution = 1)

node_louvain(.data, resolution = 1)

node_leiden(.data, resolution = 1)

Arguments

.data

An object of a {manynet}-consistent class:

  • matrix (adjacency or incidence) from {base} R

  • edgelist, a data frame from {base} R or tibble from {tibble}

  • igraph, from the {igraph} package

  • network, from the {network} package

  • tbl_graph, from the {tidygraph} package

times

Integer indicating number of simulations/walks used. By default, times=50.

max_k

Integer constant, the number of spins to use as an upper limit of communities to be found. Some sets can be empty at the end.

resolution

The Reichardt-Bornholdt “gamma” resolution parameter for modularity. By default 1, making existing and non-existing ties equally important. Smaller values make existing ties more important, and larger values make missing ties more important.

Functions

  • node_optimal(): A problem-solving algorithm that seeks to maximise modularity over all possible partitions.

  • node_kernighanlin(): A greedy, iterative, deterministic partitioning algorithm that divides the network into two equally-sized communities.

  • node_edge_betweenness(): A hierarchical, decomposition algorithm where edges are removed in decreasing order of the number of shortest paths passing through the edge, resulting in a hierarchical representation of group membership.

  • node_fast_greedy(): A hierarchical, agglomerative algorithm, that tries to optimize modularity in a greedy manner.

  • node_leading_eigen(): A top-down, hierarchical algorithm.

  • node_walktrap(): A hierarchical, agglomerative algorithm based on random walks.

  • node_infomap(): A hierarchical algorithm based on the information in random walks.

  • node_spinglass(): A greedy, iterative, probabilistic algorithm, based on an analogy to a model from statistical physics.

  • node_louvain(): An agglomerative multilevel algorithm that seeks to maximise modularity over all possible partitions.

  • node_leiden(): An agglomerative multilevel algorithm that seeks to maximise the Constant Potts Model over all possible partitions.

Optimal

The general idea is to calculate the modularity of all possible partitions and choose the community structure that maximises this measure. Note that modularity maximisation is an NP-complete problem, and exact solutions take exponential time. The guidance in the {igraph} package is that networks of fewer than about 50-200 nodes are probably feasible.
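To make the exhaustive search concrete, the following base R sketch scores every labelling of a six-node toy network (two triangles joined by a single tie) and keeps the best one. The toy network, the helper modularity_of(), and the brute-force enumeration are assumptions made for illustration and are not how node_optimal() is implemented; the search is only feasible here because the network is tiny.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1  # symmetrise, since ties are undirected

# Hypothetical helper: modularity of a membership vector for an
# undirected, unweighted adjacency matrix
modularity_of <- function(adj, membership) {
  two_m <- sum(adj)                          # twice the number of ties
  k <- rowSums(adj)                          # node degrees
  same <- outer(membership, membership, "==") # TRUE where i and j share a community
  sum((adj - outer(k, k) / two_m) * same) / two_m
}

# Enumerate labellings with node 1 fixed in community 1. Many labellings
# describe the same partition, which is wasteful but harmless at this size.
n <- nrow(adj)
labellings <- cbind(1L, as.matrix(expand.grid(rep(list(seq_len(n)), n - 1))))
scores <- apply(labellings, 1, function(mem) modularity_of(adj, mem))
best <- unname(labellings[which.max(scores), ])

best         # membership vector with the highest modularity
max(scores)  # the modularity of that partition
```

Even with node 1 fixed, six nodes already require 6^5 = 7776 evaluations, which illustrates why the approach does not scale.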

Edge-betweenness

This is motivated by the idea that edges connecting different groups are more likely to lie on multiple shortest paths, because they are the only way to get from one group to another. This method yields good results but is very slow, both because edge-betweenness calculations are computationally expensive and because the betweenness scores have to be re-calculated after every edge removal. Networks of ~700 nodes and ~3500 ties are around the upper limit of what is feasible with this approach.
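The intuition can be checked on a small case. The base R sketch below counts, for every tie, the share of shortest paths between node pairs that pass through it, and shows that the single tie bridging two triangles scores highest, so it would be the first to be removed. The toy network and the helpers shortest_path_counts() and edge_betweenness_of() are hypothetical names introduced here; the sketch assumes a connected, undirected, unweighted network and is far less efficient than the routines a package would use.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Hypothetical helper: geodesic distances and numbers of shortest paths.
# The (s, t) entry of adj^d counts walks of length d; the first d at which
# it becomes positive is the distance, and the count is the number of geodesics.
shortest_path_counts <- function(adj) {
  n <- nrow(adj)
  dist <- matrix(Inf, n, n)
  diag(dist) <- 0
  sigma <- diag(n)
  walks <- diag(n)
  for (d in seq_len(n - 1)) {
    walks <- walks %*% adj
    newly <- walks > 0 & is.infinite(dist)
    dist[newly] <- d
    sigma[newly] <- walks[newly]
  }
  list(dist = dist, sigma = sigma)
}

# Hypothetical helper: betweenness of every tie (connected network assumed)
edge_betweenness_of <- function(adj) {
  n <- nrow(adj)
  pc <- shortest_path_counts(adj)
  d <- pc$dist
  sigma <- pc$sigma
  ties <- which(adj > 0 & upper.tri(adj), arr.ind = TRUE)
  score <- numeric(nrow(ties))
  for (e in seq_len(nrow(ties))) {
    u <- ties[e, 1]
    v <- ties[e, 2]
    for (s in 1:(n - 1)) for (t in (s + 1):n) {
      through <- 0  # geodesics from s to t that use the tie u-v
      if (d[s, u] + 1 + d[v, t] == d[s, t]) through <- through + sigma[s, u] * sigma[v, t]
      if (d[s, v] + 1 + d[u, t] == d[s, t]) through <- through + sigma[s, v] * sigma[u, t]
      score[e] <- score[e] + through / sigma[s, t]
    }
  }
  cbind(ties, betweenness = score)
}

eb <- edge_betweenness_of(adj)
top <- eb[which.max(eb[, "betweenness"]), ]
top  # the bridging tie between nodes 3 and 4
```

After removing the top-scoring tie, the full algorithm would recompute all scores on the reduced network, which is the step that makes it slow.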

Fast-greedy

Initially, each node is assigned a separate community. Communities are then merged iteratively such that each merge yields the largest increase in the current value of modularity, until no further increases to the modularity are possible. The method is fast and recommended as a first approximation because it has no parameters to tune. However, it is known to suffer from a resolution limit.
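The merging rule described above can be sketched in a few lines of base R. This is a naive illustration under stated assumptions: the toy network and the helpers modularity_of() and greedy_merge() are hypothetical, every candidate merge is re-scored from scratch, and none of the efficient data structures of the published algorithm are used.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Hypothetical helper: modularity of a membership vector
modularity_of <- function(adj, membership) {
  two_m <- sum(adj)
  k <- rowSums(adj)
  same <- outer(membership, membership, "==")
  sum((adj - outer(k, k) / two_m) * same) / two_m
}

# Hypothetical helper: repeatedly apply the single merge with the largest
# modularity gain, stopping when no merge increases modularity
greedy_merge <- function(adj) {
  membership <- seq_len(nrow(adj))  # every node starts in its own community
  repeat {
    labels <- unique(membership)
    if (length(labels) < 2) break
    current <- modularity_of(adj, membership)
    best_gain <- 0
    best_pair <- NULL
    pairs <- utils::combn(labels, 2)
    for (p in seq_len(ncol(pairs))) {
      trial <- membership
      trial[trial == pairs[2, p]] <- pairs[1, p]
      gain <- modularity_of(adj, trial) - current
      if (gain > best_gain + 1e-12) {  # small tolerance against rounding noise
        best_gain <- gain
        best_pair <- pairs[, p]
      }
    }
    if (is.null(best_pair)) break      # no merge improves modularity
    membership[membership == best_pair[2]] <- best_pair[1]
  }
  membership
}

membership <- greedy_merge(adj)
membership
modularity_of(adj, membership)
```

On this network the merges recover the two triangles; on larger networks the same greedy rule can stop at a partition that is not the modularity optimum.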

Leading eigenvector

In each step, the network is bifurcated such that modularity increases most. The splits are determined according to the leading eigenvector of the modularity matrix. A stopping condition prevents tightly connected groups from being split further. Note that due to the eigenvector calculations involved, this algorithm will perform poorly on degenerate networks, but will likely obtain a higher modularity than fast-greedy (at some cost of speed).
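A single bifurcation step can be sketched in base R as follows. The toy network is an assumption for illustration, and the sketch covers only the first split: it builds the modularity matrix, takes its leading eigenvector, and assigns nodes to two groups by the sign of their entries. It is not a description of how node_leading_eigen() is implemented.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

k <- rowSums(adj)                       # node degrees
mod_matrix <- adj - outer(k, k) / sum(adj)  # modularity matrix: A - k k' / (2m)

eig <- eigen(mod_matrix, symmetric = TRUE)  # eigenvalues in decreasing order
leading_value <- eig$values[1]
leading_vector <- eig$vectors[, 1]

# Split by sign; which side is labelled 1 or 2 is arbitrary because the
# sign of an eigenvector is arbitrary
split <- ifelse(leading_vector > 0, 1L, 2L)

leading_value  # a positive value indicates that a split increases modularity
split
```

A leading eigenvalue that is not positive would be the signal to stop splitting, which corresponds to the stopping condition mentioned above.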

Walktrap

The general idea is that random walks on a network are more likely to stay within the same community because few edges lead outside a community. By repeating random walks of 4 steps many times, information about the hierarchical merging of communities is collected.
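The following base R sketch illustrates the idea on a toy network. It computes the 4-step random-walk transition probabilities, measures how differently each pair of nodes "sees" the network using the degree-weighted distance from Pons and Latapy, and then merges nodes hierarchically. The toy network is an assumption, and stats::hclust() with Ward linkage is used here only as a stand-in for the walktrap merging procedure, which recomputes distances between communities as they grow.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

n <- nrow(adj)
k <- rowSums(adj)
step <- adj / k          # one-step transition probabilities (rows sum to 1)

walk <- diag(n)
for (s in 1:4) walk <- walk %*% step   # probabilities after 4 steps

# Degree-weighted distance between the 4-step distributions of nodes i and j
walk_dist <- matrix(0, n, n)
for (i in 1:n) for (j in 1:n) {
  walk_dist[i, j] <- sqrt(sum((walk[i, ] - walk[j, ])^2 / k))
}

# Stand-in for the merging step: Ward linkage on the walk distances
tree <- stats::hclust(stats::as.dist(walk_dist), method = "ward.D2")
groups <- stats::cutree(tree, k = 2)

round(walk_dist, 3)
groups
```

Nodes in the same triangle end up with similar 4-step distributions and therefore small distances, so they are merged before any nodes from different triangles.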

Infomap

Motivated by information theoretic principles, this algorithm tries to build a grouping that provides the shortest description length for a random walk, where the description length is measured by the expected number of bits per node required to encode the path.

Spin-glass

This is motivated by analogy to the Potts model in statistical physics. Each node can be in one of k "spin states", and ties (particle interactions) provide information about which pairs of nodes want similar or different spin states. The final community definitions are represented by the nodes' spin states after a number of updates. A different implementation than the default is used in the case of signed networks, such that nodes connected by negative ties will be more likely found in separate communities.

Louvain

The general idea is to take a hierarchical approach to optimising the modularity criterion. Nodes begin in their own communities and are re-assigned in a local, greedy way: each node is moved to the community where it achieves the highest contribution to modularity. When no further modularity-increasing reassignments are possible, the resulting communities are considered nodes (like a reduced graph), and the process continues.

Leiden

The general idea is to optimise the Constant Potts Model, which does not suffer from the resolution limit, instead of modularity. As outlined in the {igraph} package, the Constant Potts Model objective function is:

\frac{1}{2m} \sum_{ij}(A_{ij}-\gamma n_i n_j)\delta(\sigma_i, \sigma_j)

where m is the total tie weight, A_{ij} is the tie weight between i and j, \gamma is the so-called resolution parameter, n_i is the node weight of node i, and \delta(\sigma_i, \sigma_j) = 1 if and only if i and j are in the same community and 0 otherwise.
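As a worked illustration, the base R sketch below evaluates this function exactly as written above (summing over all ordered pairs i, j, including i = j) for two candidate partitions of a toy network. The toy network, the helper cpm_quality(), unit node weights, and the choice of \gamma = 0.5 are all assumptions made for the example.

```r
# Toy network (an assumption for illustration): two triangles joined by tie 3-4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5), c(4, 6), c(5, 6))
adj <- matrix(0, 6, 6)
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Hypothetical helper: Constant Potts Model quality of a membership vector,
# following the formula in the text term by term
cpm_quality <- function(adj, membership, gamma = 1,
                        node_weight = rep(1, nrow(adj))) {
  m <- sum(adj) / 2                              # total tie weight
  same <- outer(membership, membership, "==")    # delta(sigma_i, sigma_j)
  sum((adj - gamma * outer(node_weight, node_weight)) * same) / (2 * m)
}

two_triangles <- cpm_quality(adj, c(1, 1, 1, 2, 2, 2), gamma = 0.5)
one_community <- cpm_quality(adj, rep(1, 6), gamma = 0.5)

two_triangles  # (12 - 9) / 14
one_community  # (14 - 18) / 14
```

With these settings the two-triangle partition scores higher than a single community. Raising \gamma penalises large communities more heavily, so higher values favour more, smaller communities.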

References

Brandes, Ulrik, Daniel Delling, Marco Gaertler, Robert Gorke, Martin Hoefer, Zoran Nikoloski, Dorothea Wagner. 2008. "On Modularity Clustering", IEEE Transactions on Knowledge and Data Engineering 20(2):172-188.

Kernighan, Brian W., and Shen Lin. 1970. "An efficient heuristic procedure for partitioning graphs." The Bell System Technical Journal 49(2): 291-307. doi:10.1002/j.1538-7305.1970.tb01770.x

Newman, M, and M Girvan. 2004. "Finding and evaluating community structure in networks." Physical Review E 69: 026113.

Clauset, A, MEJ Newman, and C Moore. "Finding community structure in very large networks."

Newman, MEJ. 2006. "Finding community structure using the eigenvectors of matrices" Physical Review E 74:036104.

Pons, Pascal, and Matthieu Latapy. "Computing communities in large networks using random walks".

Rosvall, M, and C. T. Bergstrom. 2008. "Maps of information flow reveal community structure in complex networks", PNAS 105:1118. doi:10.1073/pnas.0706851105

Rosvall, M., D. Axelsson, and C. T. Bergstrom. 2009. "The map equation", Eur. Phys. J. Special Topics 178: 13. doi:10.1140/epjst/e2010-01179-1

Reichardt, Jorg, and Stefan Bornholdt. 2006. "Statistical Mechanics of Community Detection" Physical Review E, 74(1): 016110–14. doi:10.1103/PhysRevE.74.016110

Traag, VA, and Jeroen Bruggeman. 2008. "Community detection in networks with positive and negative links".

Blondel, Vincent, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre. 2008. "Fast unfolding of communities in large networks", J. Stat. Mech. P10008.

Traag, V. A., L Waltman, and NJ van Eck. 2019. "From Louvain to Leiden: guaranteeing well-connected communities", Scientific Reports, 9(1):5233. doi:10.1038/s41598-019-41695-z

See Also

Other memberships: components(), core, equivalence

Examples

node_optimal(ison_adolescents)
node_kernighanlin(ison_adolescents)
node_kernighanlin(ison_southern_women)
node_edge_betweenness(ison_adolescents)
node_fast_greedy(ison_adolescents)
node_leading_eigen(ison_adolescents)
node_walktrap(ison_adolescents)
node_infomap(ison_adolescents)
node_spinglass(ison_adolescents)
node_louvain(ison_adolescents)
node_leiden(ison_adolescents)

migraph documentation built on Nov. 2, 2023, 5:47 p.m.