    knitr::opts_chunk$set(collapse = TRUE, comment = "#>")

    library(keras)
    library(rspektral)
Here we will compare several graph convolutional network methods for classifying documents using information about their citation network. This is a node classification task: documents are nodes in a citation network, and the node features record the presence of different words (a bag-of-words).
First we will load a citation dataset provided by the spektral package. It comes as a list containing A, the adjacency matrix of the citation network; X, a matrix of node-level features (the bag-of-words); y, the labels we are trying to predict (corresponding to "topics"); and three masks, which assign zero weight to some samples in order to effectively remove them from training.
    ## loads 'cora' dataset by default
    c(A, X, y, train_mask, val_mask, test_mask) %<-% dataset_citations()
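The effect of a mask can be written as a weighted loss. This is only a sketch: it assumes the masks are passed to the model as per-sample weights, with $m_i = 1$ for nodes selected by train_mask and $m_i = 0$ otherwise. The per-node loss $\ell$ (for example, a categorical cross-entropy) and the predictions $\hat{y}_i$ are notation introduced here for illustration, and the exact normalization constant depends on how the loss is reduced.

```latex
\mathcal{L}_{\text{train}} \;\propto\; \sum_{i=1}^{N} m_i \, \ell\left(y_i, \hat{y}_i\right),
\qquad m_i \in \{0, 1\}
```

Nodes with $m_i = 0$ still take part in the graph convolutions as neighbors, but their labels contribute nothing to the training loss.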
Next we will set up some parameters that we will use in our deep graph models:
    channels <- 16           # Number of channels in the first layer
    N <- dim(X)[1]           # Number of nodes in the graph
    F <- dim(X)[2]           # Original size of node features
    n_classes <- dim(y)[2]   # Number of classes
    dropout <- 0.5           # Dropout rate for the features
    l2_reg <- 5e-4 / 2       # L2 regularization rate
    learning_rate <- 1e-2    # Learning rate
    epochs <- 300            # Number of training epochs
    es_patience <- 10        # Patience for early stopping
First we will set up one of the simpler graph neural networks, to show how this all works.
The first thing we generally need to do to get a graph neural network up and running is to preprocess our data into the form expected by the graph neural network layers we want to use. In this case we will use a few graph convolution layers. These accept the graph in the form of a Laplacian matrix, which is a transformation of the adjacency matrix. We can generate this easily using a preprocess function, which performs the necessary calculations for us. For layer_graph_conv() we use the preprocess_graph_conv() function. We will also convert our X matrix, which is currently a sparse matrix, into a dense matrix for this analysis.
    fltr <- preprocess_graph_conv(A)
    X <- as.matrix(X)
    class(fltr)
    dim(fltr)
    dim(X)
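For reference, the transformation applied to A can be sketched as follows. This assumes preprocess_graph_conv() mirrors the preprocessing of the underlying Spektral graph convolution layer, which this vignette does not confirm. Under that assumption, fltr would hold the symmetrically normalized adjacency matrix with added self-loops, as in Kipf and Welling's GCN. The symbols $\tilde{A}$, $\tilde{D}$ and $\hat{A}$ are notation introduced here.

```latex
\tilde{A} = A + I_N, \qquad
\tilde{D}_{ii} = \sum_{j} \tilde{A}_{ij}, \qquad
\hat{A} = \tilde{D}^{-1/2} \, \tilde{A} \, \tilde{D}^{-1/2}
```

Here $I_N$ is the $N \times N$ identity matrix and $\tilde{D}$ is the diagonal degree matrix of $\tilde{A}$. The result $\hat{A}$ has the same $N \times N$ shape as A, which is what dim(fltr) should report.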
Next we set up our model using keras and the new layers provided by spektral, accessed in R through rspektral.
    X_in <- layer_input(shape = c(F))
    fltr_in <- layer_input(shape = c(N), sparse = TRUE)

    dropout_1 <- X_in %>%
      layer_dropout(rate = dropout)
    graph_conv_1 <- list(dropout_1, fltr_in) %>%
      layer_graph_conv(channels,
                       activation = 'relu',
                       kernel_regularizer = regularizer_l2(l2_reg),
                       use_bias = FALSE)

    dropout_2 <- graph_conv_1 %>%
      layer_dropout(dropout)
    graph_conv_2 <- list(dropout_2, fltr_in) %>%
      layer_graph_conv(n_classes,
                       activation = "softmax",
                       use_bias = FALSE)

    model <- keras_model(inputs = list(X_in, fltr_in), outputs = graph_conv_2)
    summary(model)
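Ignoring dropout, the forward pass of this two-layer model can be summarized as below. This is a sketch under one assumption: each graph convolution layer multiplies the preprocessed matrix fltr, written $\hat{A}$ here, by the node features and a learned weight matrix. The names $W^{(0)}$, $W^{(1)}$, $H$ and $Z$ are notation introduced for illustration. There are no bias terms because both layers set use_bias = FALSE.

```latex
H = \mathrm{ReLU}\left( \hat{A} \, X \, W^{(0)} \right), \qquad
Z = \mathrm{softmax}\left( \hat{A} \, H \, W^{(1)} \right)
```

With the parameters defined earlier, $W^{(0)}$ has shape $F \times 16$ and $W^{(1)}$ has shape $16 \times n_{\text{classes}}$. The softmax is applied row-wise, so each row of $Z$ gives the predicted class probabilities for one document.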