In neurodata/graphstats: Graph Statistics

require(graphstats)
require(igraph)
require(ggplot2)

Background

Given two networks that we believe to have some correspondence, we may want to match the nodes of one network to those of the other; that is, we want to find the bijection between the vertex sets that minimizes the number of edge disagreements across the graph. An edge disagreement occurs when two vertices are adjacent on in one graph but their corresponding vertices in the other are not.

Example 1: A Known Correspondence

Here, we produce a a simple random graph (adjacency matrix) A with n = 10 and p = 0.5. We then produce B by maintaining the first m = 3 rows (the seeds), and permuting the remaining n - m rows. The sgm function can be called in two ways.

Unordered: Call sgm and supply seeds - an m by 2 matrix that contains the indices of the seeds in A and the indices of their corresponding nodes in B.
Ordered: Call sgm.ordered and supply m - the number of seeds. It is assumed that the first m nodes in each adjacency matrix are the matched seeds, in order. We also supply start - a square n-m by n-m matrix that is the initial value of the permutation matrix for the non-seeds.

# Number of seeds, total nubmer of vertices, and edge probability.
m <- 3
n <- 10
p <- 0.5
set.seed(12345)

# Sample matrix A, and permute to .
A <- matrix(as_adj(sample_sbm(n, p, n)), nrow = n)
B <- A[c(1:m, sample(n-m)+m),]

# P is the permutation matrix that will turn match B to A.

# Unordered call.
seeds <- matrix(cbind(1:m, 1:m), nrow = m)
P <- sgm(A, B, seeds)

# Ordered call.
start <- diag(n-m)[sample(n-m),]
P <- sgm.ordered(A, B, m, start)

# Display.
gs.plot.plot_matrix(P, title="Permutation Matrix", legend.name="Entries", ffactor = TRUE)

Here, we have the first m nodes along the diagonal, unpermuted. (NOTE: The matrix heatmaps in this demo have the diagonal ranging from the bottom left to the rop right.) By applying the returned matrix to B, we can assess the quality of the matching.

# Apply permutation matrix.
B_matched <- P %*% B %*% t(P)

# Visualize matched graphs.
gs.plot.plot_matrix(A, title="Original A Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_matched, title="Matched B Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B, title="Original B Matrix", legend.name="Entries", ffactor = TRUE)

We can see the similarity between A and the matched B, and the dissimilarity between the original and matched B. Below, we mark the edge disagreements of our approximate solution.

gs.plot.plot_matrix(abs(A-B_matched), title="Disagreements", legend.name="Disagreement", ffactor = TRUE)

The overall connectivity of A is very closely recovered.

Example 2: r-Correlated Stochastic Blockmodel

Here, we sample (A, B) from an r-Correlated SBM via a random dot-product graph, permute the matrix like before, and observe the results of Seeded Graph Matching.

# Define latent vectors for 2-SBM RDPG in X.
n <- 10
X1 <- matrix(rep(c(0.85, 0), n/2), nrow = n/2, byrow = TRUE)
X2 <- matrix(rep(c(0.3,0.8), n/2), nrow = n/2, byrow = TRUE)
X <- rbind(X1, X2)
set.seed(6789)

# Pearson correlation coefficient.
r <- 0.75

# Sample r-SBM.
sampled_graphs <- rdpg.sample.correlated(X, r)
A <- as_adj(sampled_graphs[[1]], sparse = FALSE)
B <- as_adj(sampled_graphs[[2]], sparse = FALSE)

# Display overlap.
gs.plot.plot_matrix(A + B, title="A + B (Overlap)", legend.name="A_ij + B_ij")

These graphs are not exactly equal, but have high correlation. Their matrix addition, pictured above, has many 2 and 0 entries, and few 1 entries. We now choose the first m/2 = 1 nodes from each block of each graph to be the seeds.

# Identify seeds.
m <- 2
seed_indices <- c(1:(m/2), 1:(m/2)+n/2)
seeds <- matrix(c(seed_indices, seed_indices), nrow = m)

# Set up permutation.
block1_permutation <- sample(n/2 - m/2) + m/2
block2_permutation <- sample(n/2 - m/2) + m/2 + n/2
B_permutation <- c(1:(m/2), block1_permutation, 1:(m/2)+n/2, block2_permutation)

# Redefine B matrix and match using unordered sgm.
B_p <- B[B_permutation,]
P <- sgm(A, B_p, seeds)

# Apply permutation matrix.
B_matched <- P %*% B_p %*% t(P)

# Display.
gs.plot.plot_matrix(A, title="Original A Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_matched, title="Matched B Matrix", legend.name="Entries", ffactor = TRUE)
gs.plot.plot_matrix(B_p, title="Original B Matrix", legend.name="Entries", ffactor = TRUE)

As before, we mark the edge disagreements below.

gs.plot.plot_matrix(abs(A-B_matched), title="Disagreements", legend.name="Disagreement", ffactor = TRUE)

While we allow more disagreements between A and matched B due to the random nature of the SBM, the within and between block structure of A is mirrored in B after seeded graph matching.

neurodata/graphstats documentation built on May 14, 2019, 5:19 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com