rgbn: Draw from a Skvoretz-Fararo Biased Net Process

View source: R/randomgraph.R

rgbnR Documentation

Draw from a Skvoretz-Fararo Biased Net Process

Description

Produces a series of draws from a Skvoretz-Fararo biased net process using a Markov chain Monte Carlo or exact sampling procedure.

Usage

rgbn(n, nv, param = list(pi=0, sigma=0, rho=0, d=0.5, delta=0, 
    epsilon=0), burn = nv*nv*5*100, thin = nv*nv*5, maxiter = 1e7,
    method = c("mcmc","cftp"), dichotomize.sib.effects = FALSE,
    return.as.edgelist = FALSE, seed.graph = NULL, max.density = 1)

Arguments

n

number of draws to take.

nv

number of vertices in the graph to be simulated.

param

a list containing the biased net parameters (as described below); d and epsilon may be given as scalars or as nv x nv matrices of edgewise event probabilities.

burn

for the MCMC, the number of burn-in draws to take (and discard).

thin

the thinning parameter for the MCMC algorithm.

maxiter

for the CFTP method, the number of iterations to try before giving up.

method

"mcmc" for the MCMC, or "cftp" for the exact sampling procedure.

dichotomize.sib.effects

logical; should sibling and double role effects be dichotomized?

return.as.edgelist

logical; should the simulated draws be returned in edgelist format?

seed.graph

optionally, an initial state to use for MCMC.

max.density

optional maximum density threshold for MCMC; if the chain encounters a graph of higher than max density, the chain is terminated (and the result flagged).

Details

The biased net model stems from early work by Rapoport, who attempted to model networks via a hypothetical “tracing” process. This process may be described loosely as follows. One begins with a small “seed” set of vertices, each member of which is assumed to nominate (generate ties to) other members of the population with some fixed probability. These members, in turn, may nominate new members of the population, as well as members who have already been reached. Such nominations may be “biased” in one fashion or another, leading to a non-uniform growth process.

While the original biased net model depends upon the tracing process, a local (conditional) interpretation was put forward by Skvoretz and colleagues (2004). Using a four-parameter model, they propose approximating the conditional probability of an (i,j) edge given all other edges in a random graph G by

\Pr(i \to j|G_{-ij}) \approx 1 - (1-\rho)^z (1-\sigma)^y (1-\pi)^x (1-d_{ij})

where x=1 iff j \to i (and 0 otherwise), y is the number of vertices k \neq i,j such that k \to i, k \to j, and z=1 iff x=1 and y>0 (and 0 otherwise). Thus, x is the number of potential parent bias events, y is the number of potential sibling bias events, and z is the number of potential double role bias events. d_{ij} is the probability of the baseline edge event; note that an edge arises if the baseline event or any bias event occurs, and all events are assumed conditionally independent. Written in this way, it is clear that the edges of G are conditionally independent if they share no endpoint. Thus, a model with the above structure should be a subfamily of the Markov graphs.

One problem with the above structure is that the hypothetical probabilities implied by the model are not in general consistent - that is, there exist conditions under which there is no joint pmf for G with the implied full conditionals. The interpretation of the above as exact conditional probabilities is thus potentially problematic. However, a well-defined process can be constructed by interpreting the above as transition probabilities for a Markov chain that evolves by updating a randomly selected edge variable at each time point; this is a Gibbs sampler for the implied joint pmf where it exists, and otherwise an irreducible and aperiodic Markov chain with a well-defined equilibrium distribution (Butts, 2018).

In the above process, all events act to promote the formation of edges; it is also possible to define events that inhibit them (Butts, 2024). Let an inhibition event be one that, if it occurs, forbids the creation of an i \to j. As with d, we may specify a total probability \epsilon_{ij} that such an event occurs exogenously for the i,j edge. We may also specify endogenous inhibition events. For instance, consider a satiation event, which has the potential to occur every time i emits an edge to some other vertex; each existing edge has a chance of triggering “satiation,” in which case the focal edge is inhibited. The associated approximate conditional (i.e., transition probability) with these effects is then

\Pr(i \to j|G_{-ij}) \approx (1-\epsilon_{ij}) (1-\delta)^w\left(1 - (1-\rho)^z (1-\sigma)^y (1-\pi)^x (1-d_{ij})\right)

where w is the outdegree of i in G_{-ij} and \delta is the probability of the satiation event. The net effect of satiation is to suppress edge formation (in roughly geometric fashion) on high degree nodes. This may be useful in preventing degeneracy when using sigma and rho effects. Degeneracy can also be reduced by employing the dichotomize.sib.effects argument, which counts only the first shared partner's contribution towards sibling and double role effects.

It should be noted that the above process is not entirely consistent with the tracing-based model, which is itself not uniformly well-specified in the literature. For this reason, the local model is referred to here as a Skvoretz-Fararo or Markovian biased net graph process. One significant advantage of this process is that it is well-defined, and easily simulated: the above equation can be used to form the transition rule for a Markov chain Monte Carlo algorithm, which is used by rgbn to take draws from the (local) biased net model. (Note that while the underlying Markov chain is only a Gibbs sampler in the special cases for which the putative conditional distributions are jointly satisfiable, it always can be interpreted as simulating draws from the equilibrium distribution of a SF/MBN graph process.) Burn-in and thinning are controlled by the corresponding arguments; since degeneracy is common with models of this type, it is advisable to check for adequate mixing. An alternative simulation strategy is the exact sampling procedure of Butts (2018), which employs a form of coupling from the past (CFTP). The CFTP method generates exact, independent draws from the equilibrium distribution of the biased net process (up to numerical limits), but can be slow to attain coalescence (and does not currently support satiation events or other inhibition events). Setting maxiter to smaller values limits the search depth employed, at the possible cost of biasing the resulting sample. An initial condition may be specified for the MCMC using the seed.graph; if not specified, the empty graph is used.

For some applications (e.g., ABC rejection sampling), it can be useful to terminate simulation if the density is obviously too high for the draw to be useful. (Compare to similar functionality in the ergm “density guard” feature.) This can be invoked for the MCMC algorithm by setting the max.density less than 1. In this case, the chain is terminated as soon as the threshold density is reached. The resulting object is marked with an attribute called early.termination with a value of TRUE, which should obviously be checked if this feature is used (since the terminated draws are not from the target distribution - especially if n>1!). This feature cannot be used with CFTP, and is ignored when CFTP is selected.

Value

An adjacency array or list of sna edgelists containing the simulated graphs.

Author(s)

Carter T. Butts buttsc@uci.edu

References

Butts, C.T. (2018). “A Perfect Sampling Method for Exponential Family Random Graph Models.” Journal of Mathematical Sociology, 42(1), 17-36.

Butts, C.T. (2024). “A Return to Biased Nets: New Specifications and Approximate Bayesian Inference.” Journal of Mathematical Sociology.

Rapoport, A. (1957). “A Contribution to the Theory of Random and Biased Nets.” Bulletin of Mathematical Biophysics, 15, 523-533.

Skvoretz, J.; Fararo, T.J.; and Agneessens, F. (2004). “Advances in Biased Net Theory: Definitions, Derivations, and Estimations.” Social Networks, 26, 113-139.

See Also

bn

Examples

#Generate draws with low density and no biases
g1<-rgbn(50,10,param=list(pi=0, sigma=0, rho=0, d=0.17))
apply(dyad.census(g1),2,mean) #Examine the dyad census

#Add a reciprocity bias
g2<-rgbn(50,10,param=list(pi=0.5, sigma=0, rho=0, d=0.17))
apply(dyad.census(g2),2,mean) #Compare with g1

#Alternately, add a sibling bias
g3<-rgbn(50,10,param=list(pi=0.0, sigma=0.3, rho=0, d=0.17))
mean(gtrans(g3))              #Compare transitivity scores
mean(gtrans(g1))

#Create a two-group model with homophily
x<-rbinom(30,1,0.5)           #Generate group labels
d<-0.02+outer(x,x,"==")*0.2   #Set base tie probability
g4<-rgbn(1,30,param=list(pi=0.25, sigma=0.02, rho=0, d=d))
gplot(g4, vertex.col=1+x)     #Note the group structure

#Create a two-group model where cross-group ties are inhibited
x<-rbinom(30,1,0.5)           #Generate group labels
ep<-outer(x,x,"!=")*0.75      #Set inhibition probability
g5<-rgbn(1,30,param=list(pi=0.5, sigma=0.05, rho=0, d=0.1, 
    epsilon=ep))
gplot(g5, vertex.col=1+x)     #Note the group structure


sna documentation built on Sept. 11, 2024, 8:45 p.m.

Related to rgbn in sna...