gof: Posterior predictive checks using structural network...

gof R Documentation

Posterior predictive checks using structural network characteristics

Description

The function generates a variety of plots that serve as posterior predictive checks on the goodness of fit of a fitted mmsbm object.

Usage

gof(x, ...)

## S3 method for class 'mmsbm'
gof(
  x,
  gof_stat = c("Geodesics", "Degree"),
  level = 0.95,
  samples = 50,
  new.data.dyad = NULL,
  new.data.monad = NULL,
  seed = NULL,
  ...
)

Arguments

x

An object of class mmsbm, a result of a call to mmsbm.

...

Currently ignored.

gof_stat

Character vector. Accepts any subset of "Geodesics", "Degree", "Indegree", "Outdegree", "3-Motifs", "Dyad Shared Partners", "Edge Shared Partners", and "Incoming K-stars". See Details.

level

Double. Level of the credible interval for the posterior predictive distribution of the structural quantities of interest.

samples

Integer. Number of networks sampled from the model's posterior predictive distribution using simulate.mmsbm.

new.data.dyad

See simulate.mmsbm. Enables out-of-sample checking.

new.data.monad

See simulate.mmsbm. Enables out-of-sample checking.

seed

See simulate.mmsbm.

Details

Goodness of fit of network models has typically been established by evaluating how the structural characteristics of predicted networks compare to those of the observed network. When the model is estimated in a Bayesian framework, this approach is equivalent to conducting posterior predictive checks on these structural quantities of interest. When the new.data.dyad and/or new.data.monad passed differ from the data used in estimation, this is equivalent to conducting posterior predictive checks out-of-sample.
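The logic of such a check can be sketched in base R. This is an illustration only, not the package's implementation: a toy Bernoulli model stands in for the posterior predictive distribution, and the helpers draw_network and indegree_dist, as well as the values of n and p_hat, are hypothetical names and numbers introduced here. The variables samples and level play the same role as the arguments of the same name.

```r
# Illustrative sketch only: a toy Bernoulli model stands in for the
# posterior predictive distribution of a fitted network model.
set.seed(1)
n <- 20         # number of nodes (hypothetical)
p_hat <- 0.15   # tie probability (hypothetical)

# Hypothetical helper: draw one directed network without self-loops
draw_network <- function(n, p) {
  a <- matrix(rbinom(n * n, 1, p), n, n)
  diag(a) <- 0
  a
}

# Hypothetical helper: proportion of nodes with indegree 0, 1, ..., n - 1
# (a[i, j] = 1 is a tie from i to j, so column sums are indegrees)
indegree_dist <- function(a) {
  tabulate(colSums(a) + 1, nbins = nrow(a)) / nrow(a)
}

observed <- draw_network(n, p_hat)  # stand-in for the observed network
samples <- 50                       # number of simulated networks
level <- 0.95                       # level of the pointwise interval

# One row per simulated network, one column per indegree value
sims <- t(replicate(samples, indegree_dist(draw_network(n, p_hat))))

# Pointwise interval at the chosen level for each indegree value
alpha <- (1 - level) / 2
bounds <- apply(sims, 2, quantile, probs = c(alpha, 1 - alpha))

# Observed proportions next to the simulated interval
ppc <- data.frame(indegree = 0:(n - 1),
                  observed = indegree_dist(observed),
                  lower = bounds[1, ],
                  upper = bounds[2, ])
head(ppc)
```

Observed proportions that fall outside the simulated interval for many indegree values would suggest that the model fails to reproduce that feature of the network.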

The set of structural features used to determine goodness of fit is somewhat arbitrary, chosen mostly to incorporate various first-order, second-order, and (to the extent possible) third-order characteristics of the network. "Geodesics" focuses on the distribution over observed and predicted geodesic distances between nodes; "Indegree" and "Outdegree" focus on the distribution over incoming and outgoing connections per node; "3-Motifs" focuses on the distribution over possible connectivity patterns among triads (i.e., the triadic census); "Dyad Shared Partners" focuses on the distribution over the number of partners shared by the two nodes of a dyad; "Edge Shared Partners" is defined similarly, but with respect to edges rather than dyads; and finally, "Incoming K-stars" focuses on the frequency distribution over stars with k=1,... spokes.

Obtaining samples of the last three structural features can be very computationally expensive, and is discouraged on networks with more than 50 nodes.
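For intuition, the simpler statistics described above (in- and outdegree, geodesic distances) can be computed from an adjacency matrix in base R. The sketch below uses a hypothetical four-node network and a hypothetical helper, geodesics, based on the Floyd-Warshall algorithm; it is not necessarily how the package computes these quantities.

```r
# Toy directed network (hypothetical): a[i, j] = 1 is a tie from i to j
a <- matrix(0, 4, 4)
a[1, 2] <- 1
a[2, 3] <- 1
a[3, 1] <- 1
a[3, 4] <- 1

outdegree <- rowSums(a)  # ties sent by each node
indegree <- colSums(a)   # ties received by each node

# Hypothetical helper: all-pairs geodesic distances via Floyd-Warshall.
# Inf marks pairs with no directed path between them.
geodesics <- function(a) {
  n <- nrow(a)
  d <- ifelse(a > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(n)) {
    # Allow paths that pass through node k
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  d
}

d <- geodesics(a)

# Frequency distribution of geodesic distances over ordered pairs of
# distinct nodes
table(d[row(d) != col(d)])
```

Comparing such frequency distributions between the observed network and each simulated network is what a check on "Geodesics", "Indegree", or "Outdegree" amounts to.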

Value

A ggplot object.

Author(s)

Santiago Olivella (olivella@unc.edu), Adeline Lo (aylo@wisc.edu), Tyler Pratt (tyler.pratt@yale.edu), Kosuke Imai (imai@harvard.edu)

Examples

library(NetMix)
## Load datasets
data("lazega_dyadic")
data("lazega_monadic")

## Estimate model with 2 groups
lazega_mmsbm <- mmsbm(SocializeWith ~ Coworkers,
                      senderID = "Lawyer1",
                      receiverID = "Lawyer2",
                      nodeID = "Lawyer",
                      data.dyad = lazega_dyadic,
                      data.monad = lazega_monadic,
                      n.blocks = 2,
                      mmsbm.control = list(seed = 123,
                                           conv_tol = 1e-2,
                                           hessian = FALSE))

## Plot observed (red) and simulated (gray) distributions over 
## indegrees
## (typically a larger number of samples would be taken) 
## (strictly requires ggplot2)


gof(lazega_mmsbm, gof_stat = "Indegree", samples = 2)



NetMix documentation built on May 29, 2024, 6:39 a.m.