# greedy: Greedy algorithms In modMax: Community Structure Detection via Modularity Maximization

## Description

`greedy` executes the general CNM algorithm and its modifications for modularity maximization.

`rgplus` uses the randomized greedy approach to identify core groups (vertices which are always placed into the same community) and uses these core groups as initial partition for the randomized greedy approach to identify the community structure and maximize the modularity.

`msgvm` is a greedy algorithm which performs more than one merge at one step and applies fast greedy refinement at the end of the algorithm to improve the modularity value.

`cd` iteratively performs complete greedy refinement on a certain partition and then, moves vertices with a probability p to another community to avoid the greedy algorithm getting trapped in a local optimum.

`louvain` performs fast greedy refinement and uses the resulting community structure to build a new network where vertices in the new network are the communities in the original network. For this new network, all vertices are assigned to their own community, and the fast greedy refinement is applied again.

`vertexSim` uses a vertex similarity measure to identify the initial partition and further improves this community structure by merging neighbouring communities.

`mome` consists of the two phases of coarsening and uncoarsening with refinement. In the coarsening phase, two vertices are collapsed into one vertex for which the increase in modularity is maximal. In the uncoarsening phase, each intermediate graph of the coarsening phase is revisited and its community structure is refined by applying fast greedy refinement. After revisiting the different steps, the community structure for the original graph can be reconstructed from different coarsening levels.

## Usage

 ``` 1 2 3 4 5 6 7 8 9 10 11 12``` ```greedy(adjacency, numRandom = 0, q = c("general", "danon", "wakita1", "wakita2", "wakita3"), initial = c("general", "prior", "walkers", "subgraph", "adclust", "own"), randomized = 0, refine = c("none", "complete", "fast", "kernighan"), coarse = 0) rgplus(adjacency,numRandom=0,z,randomized) msgvm(adjacency,numRandom=0,initial=c("general","own"), parL) cd(adjacency, numRandom=0,initial=c("general","own"),maxC=length(adjacency[,1]), iter,p) louvain(adjacency, numRandom=0, initial=c("general","own")) vertexSim(adjacency, numRandom=0, frac=0.5) mome(adjacency, numRandom=0) ```

## Arguments

 `adjacency` A nonnegative symmetric adjacency matrix of the network whose community structur will be analyzed `numRandom` The number of random networks with which the modularity of the resulting community structure should be compared (default: no comparison). see details below for further explanation of the used null model. `q` Specify whether the general ΔQ value or a modification should be used. See details below. `initial` Specify the community structure to be used as initial partition in the algorithm. See details below. `z` The number of executions of the randomized greedy approach to identify the core groups. `randomized` The number of rows to use for the randomized greedy approach. Ignored when set to `0` (default) `refine` specifies which refinement algorithm should be used. See details below. `coarse` Define the percentage by which the number of communities has to be decreased since the last coarsening level to consider the current clustering as a new coarsening level and apply refinement on this clustering `parL` The number of merges at one step in the `msgvm` algorithm `maxC` The maximum number of communities for the initial partition used in the `cd` algorithm `iter` The number of iterations in the `cd` algorithm `p` The probability with which a vertex is moved into another community in the dilation step of the `cd` algorithm `frac` The fraction of iteration steps for which "pairwise" merging is performed in the `vertexSim` algorithm. Remaining iteration steps are "single neighbour" merges.

## Details

The used random networks have the same number of vertices and the same degree distribution as the original network.

For the identification of the best merging event leading to a maximum increase in modularity, different values of the modularity were proposed. Which modularity value to use is specified by the parameter `q`. The options are `general` where the normal value for ΔQ is used, `danon` where ΔQ is normalized by the number of overall edges of vertices in a community and `wakita1`, `wakita2` and `wakita3` where ΔQ is multiplied by the consolidation ratio.

The greedy algorithms can be run on different initial partitions. The used initial partition is specified by parameter `initial`. The options are `general` where all vertices are assigned to their own community, `prior` where the initial community structure is identified by using prior knowledge, `walkers` where the initial community structure is identified by using random walkers, `subgraph` where the initial community structure is identified by using subgraph similarity, `adclust` where the general initial partition is refined using fast greedy refinement and `own` where the user can specify an initial partition to use with the greedy approach. In this case, the user needs to add a last column to the adjacency matrix indicating the initial partition. Hence, the adjacency matrix has to have one column more than the network has vertices.

The community structure identified by the CNM algorithm can be refined by applying a refinement step at the end of the algorithm. The used refinement algorithm is specified by the parameter `refine`. The options are `none` where no refinement algorithm is applied, `complete` where the complete greedy refinement is applied, `fast` where the fast greedy refinement is applied, `kernighan` where the adapted Kernighan-Lin refinement is applied. Besides, if `initial` is set to `adclust`, fast greedy refinement is applied to the community structure after each merging event. If `coarse != 0`, the refinement algorithm specified by `refine` is not only applied at the end of the algorithm, but at each coarsening level where coarsening levels are defined according to `coarse`.

## Value

The result of the greedy algorithms is a list with the following components

 `number of communities` The number of communities detected by the algorithm `modularity` The modularity of the detected community structure `mean` The mean of the modularity values for random networks, only computed if `numRandom>0` `standard deviation` The standard deviation of the modularity values for random networks, only computed if `numRandom>0` `community structure` The community structure of the examined network given by a vector assigning each vertex its community number `random modularity values` The list of the modularity values for random networks, only computed if `numRandom>0`

## Author(s)

Maria Schelling, Cang Hui

## References

Clauset, A., Newman, M. and Moore, C. Finding community structure in very large networks. Phys. Rev. E, 70:066111, Dec 2004.

Danon, L., Daz-Guilera, A. and Arenas, A. The effect of size heterogeneity on community identifcation in complex networks. Journal of Statistical Mechanics: Theory and Experiment, 2006(11):P11010, 2006.

Wakita, K. and Tsurumi, T. Finding community structure in mega-scale social networks: [extended abstract]. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 1275- 1276, New York, NY, USA, 2007. ACM.

Ovelgonne, M. and Geyer-Schulz, A. Cluster cores and modularity maximization. In Data Mining Workshops (ICDMW), 2010 IEEE International Conference on, pages 1204-1213, Dec 2010.

Du, H., Feldman, M. W., Li, S. and Jin, X. An algorithm for detecting community structure of social networks based on prior knowledge and modularity. Complexity, 12(3):53-60, 2007.

Pujol, J., Bejar, J. and Delgado, J. Clustering algorithm for determining community structure in large networks. Phys. Rev. E, 74:016107, Jul 2006.

Xiang, B., Chen, E.-H. and Zhou, T. Finding community structure based on subgraph similarity. In Santo Fortunato, Giuseppe Mangioni, Ronaldo Menezes, and Vincenzo Nicosia, editors, Complex Networks, volume 207 of Studies in Computational Intelligence, pages 73-81. Springer Berlin Heidelberg, 2009.

Noack, A. and Rotta, R. Multi-level algorithms for modularity clustering. Technical report, 2008.

Ye, Z., Hu, S. and Yu, J. Adaptive clustering algorithm for community detection in complex networks. Phys. Rev. E, 78:046115, Oct 2008.

Schuetz, P. and Caflisch, A. Efficient modularity optimization by multistep greedy algorithm and vertex mover refinement. Phys. Rev. E, 77:046112, Apr 2008.

Mei, J., He, S., Shi, G., Wang, Z., and Li, W. Revealing network communities through modularity maximization by a contractiondilation method. New Journal of Physics, 11(4):043025, 2009.

Blondel, V. D., Guillaume. J.-L., Lambiotte, R. and Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.

Arab, M. and Afsharchi, M. A modularity maximization algorithm for community detection in social networks with low time complexity. In Web Intelligence and Intelligent Agent Technology (WI-IAT), 2012 IEEE/WIC/ACM International Conferences on, volume 1, pages 480-487, Dec 2012.

Zhu, Z., Wang, C., Ma, L., Pan, Y. and Ding, Z. Scalable community discovery of large networks. In Web-Age Information Management, 2008. WAIM '08. The Ninth International Conference on, pages 381-388, July 2008.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20``` ```#unweighted network randomgraph1 <- erdos.renyi.game(10, 0.3, type="gnp",directed = FALSE, loops = FALSE) #to ensure that the graph is connected vertices1 <- which(clusters(randomgraph1)\$membership==1) graph1 <- induced.subgraph(randomgraph1,vertices1) adj1 <- get.adjacency(graph1) result1 <- greedy(adj1, refine = "fast") #weighted network randomgraph2 <- erdos.renyi.game(10, 0.3, type="gnp",directed = FALSE, loops = FALSE) #to ensure that the graph is connected vertices2 <- which(clusters(randomgraph2)\$membership==1) graph2 <- induced.subgraph(randomgraph2,vertices2) graph2 <- set.edge.attribute(graph2, "weight", value=runif(ecount(graph2),0,1)) adj2 <- get.adjacency(graph2, attr="weight") result2 <- louvain(adj2) ```

### Example output

```Loading required package: gtools

Attaching package: 'igraph'

The following object is masked from 'package:gtools':

permute

The following objects are masked from 'package:stats':

decompose, spectrum

The following object is masked from 'package:base':

union
```

modMax documentation built on May 2, 2019, 12:20 p.m.