Mesa High Vignette
In fergm: Estimation and Fit Assessment of Frailty Exponential Random Graph Models

Vignette Info and Introduction

This vignette uses the faux.mesa.high network from the ergm package [@hunter2008ergm] to provide an example of the basic functionality of the fergm package [@box2018modeling]. In particular, it walks through the estimation of a well-fitting Exponential Random Graph (ERGM) model on the simulated "Mesa High" network. Once estimated, the corresponding Frailty Exponential Random Graph (FERGM) model is estimated and compared to the comparable ERGM. The documentation for each individual function includes chunks of code taken from this vignette.

It is worth noting that many of these functions are built upon the rstan package [@guo2016rstan]. Given that the output of the fergm function is a stanfit object, one may use much of the built-in functionality of the rstan package should they be familiar with the package.

This vignette proceeds by walking through an example step-by-step to illustrate a representative workflow for using the FERGM package.

Note: The code presented here is not built given the runtime for the FERGM. As such, the code presented is purely illustrative. It will successfully execute, however.

Importing Mesa High Network

The following chunk of code imports the Mesa High network generously provided in the ergm package [@hunter2008ergm].

# Load statnet which contains the ergm package and the faux.mesa.high network.
library(statnet)

# Set seed for replication
set.seed(1)

# Load faux.mea.high daa
data("faux.mesa.high")

# Rename the object
mesa <- faux.mesa.high

ERGM and FERGM Estimation

The second step, once the data is imported, is to estimate a well-fitting ERGM and its FERGM counterpart. The fergm function takes at least two arguments, the network object to have an FERGM fit on and a character string formula containing undirected ergm-terms. A variety of other function arguments can be specified to override defaults, including the seed, the number of chains, number of warmup iterations, total number of iterations, and the number of cores used. The fergm function returns a named list of two objects: fergm$stan.dta returns the data object handed to Stan and fergm$stan.fit is the stanfit object returned. The following chunk of code handles this.

# ERGM fit
ergm.fit <- ergm(mesa ~ edges +
                   nodematch('Sex') +
                   nodematch('Grade', diff = FALSE) +
                   nodematch('Race', diff = FALSE) +
                   gwesp(decay = 0.2, fixed = TRUE) +
                   altkstar(lambda = 0.6, fixed = TRUE))

# FERGM fit
library(fergm)
form <- c("edges + nodematch('Sex') + nodematch('Grade', diff = FALSE) +
        nodematch('Race', diff = FALSE) + gwesp(decay = 0.2, fixed = TRUE) + 
        altkstar(lambda = 0.6, fixed = TRUE)")

fergm.fit <- fergm(net = mesa, form = form, chains = 1)

Summarizing FERGM Output

While the fergm package does not contain a built in summary function for the FERGM, there are several means to summarize fergm output. One way to do so cleanly is to use the built-in clean_summary function. This function takes two objects: the output of the fergm function and a vector of custom character names for each coefficient. If a vector of custom chaaracter names is not provided then the function inherits the formula used in the fergm call, which holds true for other functions including the custom_var_names argument.

In addition, we provide a built-in function to create coefficient plots: coef_plot. This function takes either an fergm object to plot the FERGM coefficients or both an fergm and ergm object to plot and compare the coefficients.

We also include a built-in function to plot the densities for each coefficient of interest, coef_posterior_density. The code is as follows:

# Conventional rstan approach to extracting posterior summary
stan.smry <- summary(fergm.fit$stan.fit)$summary
beta_df <- stan.smry[grep("beta", rownames(stan.smry)),]
est <- round(beta_df[,c(1,4,8)], 3)
est # in order of "form"

# fergm built-in function to summarize posteior
est <- clean_summary(fergm.fit)
est <- clean_summary(fergm.fit, 
                     custom_var_names = c("Edges", "Sex Homophily",
                                          "GradeHomophily", "Race Homophily",
                                          "GWESP", "Alternating K-Stars"))
est
# Compare substantive implications via coef plot, these are with 95% credible intervals
coef_plot(fergm.fit = fergm.fit)

coef_plot(fergm.fit = fergm.fit, 
          custom_var_names =  c("Edges", "Sex Homophily", "Grade Homophily", 
                                "Race Homophily", "GWESP", "Alternating K-Stars"))
coef_plot(fergm.fit = fergm.fit, 
          ergm.fit = ergm.fit, 
          custom_var_names =  c("Edges", "Sex Homophily", "Grade Homophily", 
                                "Race Homophily", "GWESP", "Alternating K-Stars"))


# You can also look at the density of particular variables using the following:

densities <- coef_posterior_density(fergm.fit = fergm.fit)

densities <- coef_posterior_density(fergm.fit = fergm.fit, 
                                    custom_var_names = c("Edges", "Sex Homophily", 
                                                         "Grade Homophily", "Race Homophily", 
                                                         "GWESP", "Alternating K-Stars"))
densities[[1]]
densities[[2]]

There are also a series of rstan functions to summarize posterior distributions, and we would refer the reader to the rstan manual should they be interested in learning their alternatives.

FERGM Diagnostics

To visually check whether there is evidence that the chains used to estimate the FERGM have converged, traceplots may be used. While rstan has a built-in traceplot function that could easily be used, changes may be desired to produce publication-quality graphics. As such, building upon the rstan::traceplot() function, we provide a cleaner alternative. The code is as follows:

# Use rstan functions to assess whether chains have evidence of converging
trace <- rstan::traceplot(fergm.fit$stan.fit, pars = "beta")
trace

# We have our own version that includes variable names and tidies it up a bit
fergm_beta_traceplot(fergm.fit)

fergm_beta_traceplot(fergm.fit,
                     custom_var_names =  c("Edges", "Sex Homophily", 
                                           "Grade Homophily", "Race Homophily", 
                                           "GWESP", "Alternating K-Stars"))

Additional diagnostics are available using the rstan package, and we would refer the reader to their well-written manual for further diagnostics or details on their built-in diagnostic plots.

Compare ERGM and FERGM Fit

One might be interested in the relative difference in predictive performance between an ERGM and an FERGM. While the coef_plot() function offers an opportunity to compare the coefficients of these two models, it does not provide a means to assess the relative fit of each model. Relative fit is examined through simulating a number of networks based upon the model results and examining the average number of correctly predicted ties across all simulated networks. This routine is described by the manuscript presenting the model [@box2018modeling]. Note, specifying an object named net prior to estimating the compare_predictions function is necessary to produce a valid network on the left-hand side of the simulate method for ERGMs. The code to perform this routine is as follows:

# Use fergm built in compare_predictions function to compare predictions of ERGM and FERGM
net <- ergm.fit$network
predict_out <- compare_predictions(ergm.fit = ergm.fit, fergm.fit = fergm.fit, replications = 100)

# Use the built in compare_predictions_plot function to examine the densities of correctly predicted
  # ties from the compare_predictions simulations
compare_predictions_plot(predict_out)

# We can also conduct a KS test to determine if the FERGM fit it statistically disginguishable 
  # from the ERGM fit
compare_predictions_test(predict_out)