add_clusters | R Documentation |
This function takes as input a tibble graph (from tidygraph) or a list of tibble graphs, and then runs a cluster detection algorithm chosen by the user (see the details for information on the different methods). The function associates each node with its corresponding cluster identifier. It also creates a cluster attribute for edges: each edge receives the cluster identifier of its two nodes if both nodes belong to the same cluster. If the nodes belong to different clusters, the edge takes "00" as its cluster attribute.
add_clusters(
  graphs,
  weights = NULL,
  clustering_method = c("leiden", "louvain", "fast_greedy", "infomap", "walktrap"),
  objective_function = c("modularity", "CPM"),
  resolution = 1,
  n_iterations = 1000,
  n_groups = NULL,
  node_weights = NULL,
  trials = 10,
  steps = 4,
  verbose = TRUE,
  seed = NA
)
graphs |
A tibble graph from tidygraph, a list of tibble graphs or a data frame. |
weights |
The weights of the edges. It must be a positive numeric vector, |
clustering_method |
The different clustering algorithms implemented in the function (see details). The parameters of the function depend on the clustering method chosen. |
objective_function |
The objective function to maximize for the Leiden algorithm.
Whether to use the Constant Potts Model (CPM) or modularity. Must be either "CPM"
or "modularity" (see |
resolution |
The resolution parameter to use for the Leiden algorithm
(see |
n_iterations |
The number of iterations to run the Leiden algorithm.
Each iteration may improve the partition further (see |
n_groups |
May be used by the fast greedy or the walktrap algorithm. Integer scalar, the desired number of communities. If too low or too high, then an error message is given. |
node_weights |
May be used for both the Leiden and infomap algorithms.
For Leiden, if this is not provided, it will be automatically determined on the
basis of the objective_function (see |
trials |
The number of attempts to partition the network
(can be any integer value equal to or larger than 1) for the infomap algorithm
(see |
steps |
The length of the random walks to perform for the walktrap algorithm
(see |
verbose |
Set to |
seed |
Enter a number to set the seed within the function. Some algorithms use heuristics and random processes that might result in different clusters each time the function is run. Setting the seed is useful for reproducibility: it ensures that you find the same clusters each time the function is run with the same graphs. |
The function can be run either on a single tidygraph object or on a list
of tidygraph objects, as created by build_dynamic_networks().
The function implements five different algorithms. Four exist in
igraph and are used in this package through their implementation
in tidygraph (see group_graph()). The function also implements the
Leiden algorithm (Traag et al., 2019), which is in igraph but not yet
in tidygraph (see cluster_leiden()).
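To illustrate what the Leiden step looks like when called directly in igraph, here is a minimal sketch. It is not the internal code of add_clusters(): the example graph (Zachary's karate club), the seed value and the number of iterations are all assumptions chosen for illustration. It also shows why setting a seed matters for reproducibility.

```r
library(igraph)

# Zachary's karate club: a small example graph shipped with igraph (34 nodes)
g <- make_graph("Zachary")

# Fix the random number generator so the partition is repeatable
set.seed(1234)
first_run <- cluster_leiden(g, objective_function = "modularity", n_iterations = 100)

# Same seed, same graph, same settings
set.seed(1234)
second_run <- cluster_leiden(g, objective_function = "modularity", n_iterations = 100)

# One cluster identifier per node
node_clusters <- as.integer(membership(first_run))
length(node_clusters)

# The two runs should give the same partition
identical(node_clusters, as.integer(membership(second_run)))
```

Without the calls to set.seed(), two runs may assign nodes to clusters differently, which is the situation the seed argument is meant to avoid.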
The newly created columns with the cluster identifier for nodes and edges
are named according to the method used. If you use the Leiden algorithm, the
function creates a column called cluster_leiden for nodes, and three columns
for the edges, called cluster_leiden_from, cluster_leiden_to and cluster_leiden.
The function also automatically calculates the percentage of total nodes that
are gathered in each cluster, in the column size_com.
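As a rough illustration of the size_com computation, the sketch below works on a hypothetical node table in base R. The data are invented, and the value is stored here as a proportion between 0 and 1, which is an assumption: the text above does not say whether the package scales it to 100.

```r
# Hypothetical node table with zero-padded cluster identifiers
nodes <- data.frame(
  id = 1:8,
  cluster_leiden = c("01", "01", "01", "02", "02", "03", "03", "03"),
  stringsAsFactors = FALSE
)

# Share of all nodes that falls in each cluster
cluster_share <- table(nodes$cluster_leiden) / nrow(nodes)

# Attach to every node the share of the cluster it belongs to
nodes$size_com <- as.numeric(cluster_share[nodes$cluster_leiden])
nodes
```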
To make plotting easier later, a zero is put before one-digit cluster identifiers (cluster 5 becomes "05"; cluster 10 stays "10"). Attributing a cluster identifier to edges allows for giving an edge the same color as the nodes it connects if the two nodes belong to the same cluster, or a color different from both nodes if they belong to different clusters.
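The padding and edge-labelling rules can be sketched in base R as follows. The node and edge tables are hypothetical, and this is only an illustration of the rule described here, not the package's own implementation; the column names follow the Leiden naming given in the details.

```r
# Hypothetical raw cluster numbers for five nodes
nodes <- data.frame(
  id = c("a", "b", "c", "d", "e"),
  raw_cluster = c(5, 5, 10, 10, 5),
  stringsAsFactors = FALSE
)

# Zero-pad to two digits: 5 becomes "05", 10 stays "10"
nodes$cluster_leiden <- sprintf("%02d", as.integer(nodes$raw_cluster))

# Hypothetical edge list
edges <- data.frame(
  from = c("a", "a", "c", "b"),
  to   = c("b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Look up the cluster of each endpoint
edges$cluster_leiden_from <- nodes$cluster_leiden[match(edges$from, nodes$id)]
edges$cluster_leiden_to   <- nodes$cluster_leiden[match(edges$to, nodes$id)]

# Shared cluster if both endpoints agree, "00" otherwise
edges$cluster_leiden <- ifelse(
  edges$cluster_leiden_from == edges$cluster_leiden_to,
  edges$cluster_leiden_from,
  "00"
)
edges
```

Because "00" never collides with a real padded identifier, a single color scale keyed on cluster_leiden can then cover both nodes and edges.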
The same tidygraph graph or list of tidygraph graphs as the input, with a new cluster column for nodes, a column giving the size of these clusters, and three cluster columns for edges (see the details).
library(networkflow)

nodes <- Nodes_stagflation |>
  dplyr::rename(ID_Art = ItemID_Ref) |>
  dplyr::filter(Type == "Stagflation")

references <- Ref_stagflation |>
  dplyr::rename(ID_Art = Citing_ItemID_Ref)

temporal_networks <- build_dynamic_networks(nodes = nodes,
                                            directed_edges = references,
                                            source_id = "ID_Art",
                                            target_id = "ItemID_Ref",
                                            time_variable = "Year",
                                            cooccurrence_method = "coupling_similarity",
                                            time_window = 20,
                                            edges_threshold = 1,
                                            overlapping_window = TRUE,
                                            filter_components = TRUE)

temporal_networks <- add_clusters(temporal_networks,
                                  objective_function = "modularity",
                                  clustering_method = "leiden")

temporal_networks[[1]]