require(graphstats)
require(igraph)
require(ggplot2)

Background

Given two networks that we believe to have some correspondence, we may want to match the nodes of one network to those of the other; that is, we want to find the bijection between the vertex sets that minimizes the number of edge disagreements across the graph. An edge disagreement occurs when two vertices are adjacent on in one graph but their corresponding vertices in the other are not.

Example 1: A Known Correspondence

Here, we produce a a simple random graph (adjacency matrix) A with n = 10 and p = 0.5. We then produce B by maintaining the first m = 3 rows (the seeds), and permuting the remaining n - m rows. The sgm function can be called in two ways.

# Number of seeds, total nubmer of vertices, and edge probability.
m <- 3
n <- 10
p <- 0.5
set.seed(12345)

# Sample matrix A, and permute to .
A <- matrix(as_adj(sample_sbm(n, p, n)), nrow = n)
B <- A[c(1:m, sample(n-m)+m),]

# P is the permutation matrix that will turn match B to A.

# Unordered call.
seeds <- matrix(cbind(1:m, 1:m), nrow = m)
P <- sgm(A, B, seeds)

# Ordered call.
start <- diag(n-m)[sample(n-m),]
P <- sgm.ordered(A, B, m, start)

# Display.
gs.plot.plot_matrix(P, title="Permutation Matrix", legend.name="Entries", ffactor = TRUE)

Here, we have the first m nodes along the diagonal, unpermuted. (NOTE: The matrix heatmaps in this demo have the diagonal ranging from the bottom left to the rop right.) By applying the returned matrix to B, we can assess the quality of the matching.

# Apply permutation matrix.
B_matched <- P %*% B %*% t(P)

# Visualize matched graphs.
gs.plot.plot_matrix(A, title="Original A Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_matched, title="Matched B Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B, title="Original B Matrix", legend.name="Entries", ffactor = TRUE)

We can see the similarity between A and the matched B, and the dissimilarity between the original and matched B. Below, we mark the edge disagreements of our approximate solution.

gs.plot.plot_matrix(abs(A-B_matched), title="Disagreements", legend.name="Disagreement", ffactor = TRUE)

The overall connectivity of A is very closely recovered.

Example 2: r-Correlated Stochastic Blockmodel

Here, we sample (A, B) from an r-Correlated SBM via a random dot-product graph, permute the matrix like before, and observe the results of Seeded Graph Matching.

# Define latent vectors for 2-SBM RDPG in X.
n <- 10
X1 <- matrix(rep(c(0.85, 0), n/2), nrow = n/2, byrow = TRUE)
X2 <- matrix(rep(c(0.3,0.8), n/2), nrow = n/2, byrow = TRUE)
X <- rbind(X1, X2)
set.seed(6789)

# Pearson correlation coefficient.
r <- 0.75

# Sample r-SBM.
sampled_graphs <- rdpg.sample.correlated(X, r)
A <- as_adj(sampled_graphs[[1]], sparse = FALSE)
B <- as_adj(sampled_graphs[[2]], sparse = FALSE)

# Display overlap.
gs.plot.plot_matrix(A + B, title="A + B (Overlap)", legend.name="A_ij + B_ij")

These graphs are not exactly equal, but have high correlation. Their matrix addition, pictured above, has many 2 and 0 entries, and few 1 entries. We now choose the first m/2 = 1 nodes from each block of each graph to be the seeds.

# Identify seeds.
m <- 2
seed_indices <- c(1:(m/2), 1:(m/2)+n/2)
seeds <- matrix(c(seed_indices, seed_indices), nrow = m)

# Set up permutation.
block1_permutation <- sample(n/2 - m/2) + m/2
block2_permutation <- sample(n/2 - m/2) + m/2 + n/2
B_permutation <- c(1:(m/2), block1_permutation, 1:(m/2)+n/2, block2_permutation)

# Redefine B matrix and match using unordered sgm.
B_p <- B[B_permutation,]
P <- sgm(A, B_p, seeds)

# Apply permutation matrix.
B_matched <- P %*% B_p %*% t(P)

# Display.
gs.plot.plot_matrix(A, title="Original A Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_matched, title="Matched B Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_p, title="Original B Matrix", legend.name="Entries", ffactor = TRUE)

As before, we mark the edge disagreements below.

gs.plot.plot_matrix(abs(A-B_matched), title="Disagreements", legend.name="Disagreement", ffactor = TRUE)

While we allow more disagreements between A and matched B due to the random nature of the SBM, the within and between block structure of A is mirrored in B after seeded graph matching.



neurodata/graphstats documentation built on May 14, 2019, 5:19 p.m.