netemd_one_to_one: NetEMD Network Earth Mover's Distance between a pair of...

View source: R/measures_net_emd.R

netemd_one_to_oneR Documentation

NetEMD Network Earth Mover's Distance between a pair of networks.

Description

Calculates the network Earth Mover's Distance (EMD) between two sets of network features. This is done by individually normalising the distribution of each feature so that they have unit mass and unit variance. Then the minimun EMD between the same pair of features (one for each corresponding graph) is calculated by considering all possible translations of the feature distributions. Finally the average over all features is reported. This is calculated as follows: 1. Normalise each feature histogram to have unit mass and unit variance. 2. For each feature, find the minimum EMD between each pair of histograms considering all possible histogram translations. 3. Take the average minimum EMD across all features.

Usage

netemd_one_to_one(
  graph_1 = NULL,
  graph_2 = NULL,
  dhists_1 = NULL,
  dhists_2 = NULL,
  method = "optimise",
  return_details = FALSE,
  smoothing_window_width = 0,
  feature_type = "orbit",
  max_graphlet_size = 5,
  ego_neighbourhood_size = 0
)

Arguments

graph_1

A network/graph object from the igraph package. graph_1 can be set to NULL (default) if dhists_1 is provided.

graph_2

A network/graph object from the igraph package. graph_2 can be set to NULL (default) if dhists_2 is provided.

dhists_1

Either, a dhist discrete histogram object, or list of such objects, or a matrix of network features (each column representing a feature). dhists_1 can be set to NULL (default) if graph_1 is provided. A dhist object can be obtained from graph_features_to_histograms.

dhists_2

Same as dhists_1.

method

The method to be used to find the minimum EMD across all potential offsets for each pair of histograms. Default is "optimise" to use R's built-in stats::optimise method to efficiently find the offset with the minimal EMD. However, this is not guaranteed to find the global minimum if multiple local minima EMDs exist. You can alternatively specify the "exhaustive" method, which will exhaustively evaluate the EMD between the histograms at all offsets that are candidates for the minimal EMD at the cost of computational time.

return_details

Logical indicating whether to return the individual minimal EMDs and associated offsets for all pairs of histograms.

smoothing_window_width

Width of "top-hat" smoothing window to apply to "smear" point masses across a finite width in the real domain. Default is 0, which results in no smoothing. Care should be taken to select a smoothing_window_width that is appropriate for the discrete domain (e.g.for the integer domain a width of 1 is the natural choice).

feature_type

Type of graphlet-based feature to count: "graphlet" counts the number of graphlets each node participates in; "orbit" (default) calculates the number of graphlet orbits each node participates in.

max_graphlet_size

Determines the maximum size of graphlets to count. Only graphlets containing up to max_graphlet_size nodes will be counted. Possible values are 4, and 5 (default).

ego_neighbourhood_size

The number of steps from the source node to include nodes for each ego-network. NetEmd was proposed for individual nodes alone, hence the default value is 0.

Value

NetEMD measure for the two sets of discrete histograms (or graphs). If (return_details = FALSE) then a list with the following named elements is returned net_emd: the NetEMD for the set of histogram pairs (or graphs), min_emds: the minimal EMD for each pair of histograms, min_offsets: the associated offsets giving the minimal EMD for each pair of histograms

Examples

 require(igraph)
 graph_1 <- graph.lattice(c(8,8)) 
 graph_2 <- graph.lattice(c(44,44)) 
 netemd_one_to_one(graph_1=graph_1,graph_2=graph_2,feature_type="orbit",max_graphlet_size=5)
 
 #Providing a matrix of network features
 props_a= count_orbits_per_node(graph = graph_1,max_graphlet_size = 5)
 props_b= count_orbits_per_node(graph = graph_2,max_graphlet_size = 5)
 
 netemd_one_to_one(dhists_1=props_a, dhists_2=props_b,smoothing_window_width = 1)
 
 #Providing the network features as lists of dhist objects
 dhists_1<- graph_features_to_histograms(props_a)
 dhists_2<- graph_features_to_histograms(props_b)
 
 netemd_one_to_one(dhists_1=dhists_1, dhists_2=dhists_2)
 
 
 # A variation of NetEmd: Using the Laplacian spectrum 
 #Laplacian
 Lapg_1 <- igraph::laplacian_matrix(graph = graph_1,normalized = FALSE,sparse = FALSE)
 Lapg_2 <- igraph::laplacian_matrix(graph = graph_2,normalized = FALSE,sparse = FALSE)
 
 #Normalized Laplacian
 NLapg_1 <- igraph::laplacian_matrix(graph = graph_1,normalized = TRUE,sparse = FALSE)
 NLapg_2 <- igraph::laplacian_matrix(graph = graph_2,normalized = TRUE,sparse = FALSE)
 
 #Spectra (This may take a couple of minutes).
 props_1 <- cbind(L.Spectra= eigen(Lapg_1)$values, NL.Spectra= eigen(NLapg_1)$values) 
 props_2 <- cbind(L.Spectra= eigen(Lapg_2)$values, NL.Spectra= eigen(NLapg_2)$values) 
 
 netemd_one_to_one(dhists_1 = props_1,dhists_2 = props_2,smoothing_window_width = 0)#Use of smoothing window 1 is given for discrete integer distributions. If the network features are considered continuous variables smoothing_window_width equal to zero is recommended.
 

alan-turing-institute/network-comparison documentation built on June 7, 2022, 10:41 p.m.