dnc: Dynamic Network Clustering

View source: R/DNCRPackageRCode.R

Performs dynamic network clustering using either variational Bayes (VB) or a Gibbs sampler. Setting M=0 performs variational Bayes with no clustering. Returns posterior parameters (if method="VB") or approximate posterior samples (if method="Gibbs"), as well as the MAP estimates, which may be extracted through dncObj$pm.

dnc(Y,M,p=3,method="VB",init=NULL,hyperparms=NULL,Missing=NULL,
    controls=list(MaxIt=500,epsilon=1e-5,MaxItStg2=100,
                  epsilonStg2=1e-15,nDraws=10000,burnin=1000))

`Y`	Dynamic network data. This should be in the form of an n x n x T array of 1's and 0's. Each slice corresponds to a single time point.
`M`	Number of communities (may be zero).
`p`	Dimension of the latent space.
`method`	Method of estimation, either "VB" for variational Bayes, or "Gibbs" for a Gibbs sampler.
`init`	(Use of this argument is not recommended) Initial values of the parameters. A named list containing `EOm`, `mu`, `Sig`, `Bi0g`, `Bitbar`, `Bithk`, `Er`, `Er2`, `ai2`, `bi2`, `nu`, `a3`, `b3`, `Es`, `Es2`, and `Gam`.
`hyperparms`	Hyperparameters. A named list with `cc`, `a0Star`, `b0Star`, `a2Star`, `b2Star`, `b3Star`, `GamStar`.
`Missing`	A matrix whose rows correspond to missing dyads. `Missing` should have three columns: row, column, and time (i.e., the indices for the NA's in `Y`). May be left as `NULL` if the missing dyads in `Y` are `NA`'s.
`controls`	A list of values to control the algorithm, with the following elements.
    MaxIt: The total number of iterations for the VB algorithm. Ignored if `method="Gibbs"` unless `M=0`.
    epsilon: Relative tolerance criterion for evaluating convergence.
    MaxItStg2: The total number of iterations for the second stage initialization of the VB algorithm/Gibbs sampler. Ignored if `M=0`.
    epsilonStg2: Relative tolerance criterion for evaluating convergence of the second stage initialization of the VB algorithm/Gibbs sampler. Ignored if `M=0`.
    nDraws: Total number of post-burn-in samples to be drawn via the Gibbs sampler. Ignored if `method="VB"`.
    burnin: The number of burn-in samples. Ignored if `method="VB"`.
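
As a rough illustration of the input layout described for `Y` and `Missing`, the following base R sketch builds an n x n x T binary array from made-up toy data and derives a three-column matrix of missing dyads from its NA entries. The toy values, the zeroed diagonal, and the column names are assumptions made for this example only; nothing here is taken from the package itself.

```r
# Hypothetical toy data: n = 5 actors observed at T = 3 time points.
set.seed(1)
n  <- 5
TT <- 3   # number of time points (named TT to avoid masking T/TRUE)

# n x n x T array of 0/1 edge indicators; each slice Y[, , t] is one time point.
Y <- array(rbinom(n * n * TT, size = 1, prob = 0.3), dim = c(n, n, TT))

# Self-ties are set to 0 here purely for illustration (assumption).
for (tt in seq_len(TT)) diag(Y[, , tt]) <- 0

# Mark two dyads as unobserved.
Y[2, 4, 1] <- NA
Y[5, 1, 3] <- NA

# One row per missing dyad: row index, column index, time index.
Missing <- which(is.na(Y), arr.ind = TRUE)
colnames(Missing) <- c("row", "column", "time")  # illustrative names
print(Missing)
```

Because the missing dyads are already coded as NA in `Y`, the argument description indicates that `Missing` could equally be left as `NULL` in this situation.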

This function performs community detection according to the model

logit(P(Y_{ijt}=1)) = α + s_j X_{it}'X_{jt},

π(X_{it}|Z_{it}=m) = N(r_i u_m, τ_i^{-1} I_p).

While the latent positions, the X_{it}'s, live in a p-dimensional Euclidean space, it is more natural to conceptualize them as living on a (hyper-)sphere, with the magnitudes of the X_{it}'s as attached attributes that reflect the actors' individual tendencies to send and receive edges.
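
To make the likelihood and the direction/magnitude interpretation concrete, here is a minimal base R sketch for a single time point. All parameter values are made up, and the variable names (`eta`, `P`, `magnitude`, `direction`) are chosen for this example; it is not the package's internal code.

```r
set.seed(2)
n <- 4   # actors
p <- 3   # latent dimension

alpha <- -1                       # intercept (hypothetical value)
s <- runif(n, 0.5, 1.5)           # positive scalings s_j (hypothetical values)
X <- matrix(rnorm(n * p), n, p)   # row i holds X_it for one fixed t

# Linear predictor: eta[i, j] = alpha + s_j * X_i' X_j.
# sweep() multiplies column j of the inner-product matrix by s[j].
eta <- alpha + sweep(X %*% t(X), 2, s, `*`)

# Edge probabilities via the inverse logit; self-ties are not modelled here.
P <- plogis(eta)
diag(P) <- NA

# Split each position into a magnitude and a unit-length direction,
# i.e. a point on the (hyper-)sphere plus an attached attribute.
magnitude <- sqrt(rowSums(X^2))
direction <- X / magnitude        # row i is divided by magnitude[i]
```

Actors with larger `magnitude` have inner products that are larger in absolute value, which pushes their edge probabilities further from `plogis(alpha)`.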

If M=0, then the prior on X_{it} is given by

π(X_{i1}) = N(0,σ^2 I_p)

π(X_{it}|X_{i(t-1)}) = N(X_{i(t-1)},τ_i^{-1} I_p)
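
The M=0 prior is a Gaussian random walk for each actor. A short base R sketch that simulates from it, using made-up values for σ^2 and the τ_i's and storing the result in the same p x T x n layout used for `mu`, might look like this:

```r
set.seed(3)
n  <- 4   # actors
p  <- 2   # latent dimension
TT <- 5   # time points

sigma2 <- 1                              # initial variance (hypothetical value)
tau <- rgamma(n, shape = 2, rate = 1)    # actor precisions (hypothetical values)

X <- array(NA_real_, dim = c(p, TT, n))
for (i in seq_len(n)) {
  # X_i1 ~ N(0, sigma2 * I_p)
  X[, 1, i] <- rnorm(p, mean = 0, sd = sqrt(sigma2))
  for (tt in 2:TT) {
    # X_it | X_i(t-1) ~ N(X_i(t-1), tau_i^{-1} * I_p)
    X[, tt, i] <- rnorm(p, mean = X[, tt - 1, i], sd = sqrt(1 / tau[i]))
  }
}
```

Actors with a large τ_i move little between time points, while a small τ_i allows large jumps.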

The variational Bayes approach is typically faster than the Gibbs sampler, but tends to underestimate the spread of the posterior.

Currently, only VB is implemented when M=0 (no clustering), hence method will be ignored if M=0.

Ignorable missing data can be handled within the Gibbs sampler (but not within the VB algorithm) by adding an extra step at each iteration that draws the missing edges given the latent positions and the model parameters.
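
The extra imputation step can be sketched in base R as follows. This is only an illustration under the model stated above: `draw_missing_edges` is a hypothetical helper written for this example, not a function exported by the package, and the parameter values are made up.

```r
# One imputation step under logit(P(Y_ijt = 1)) = alpha + s_j * X_it' X_jt.
# X is a p x T x n array of current latent positions.
draw_missing_edges <- function(Y, Missing, X, alpha, s) {
  for (k in seq_len(nrow(Missing))) {
    i  <- Missing[k, 1]
    j  <- Missing[k, 2]
    tt <- Missing[k, 3]
    eta <- alpha + s[j] * sum(X[, tt, i] * X[, tt, j])
    # Draw the unobserved edge from its Bernoulli conditional distribution.
    Y[i, j, tt] <- rbinom(1, size = 1, prob = plogis(eta))
  }
  Y
}

set.seed(6)
n <- 4; p <- 2; TT <- 3
Y <- array(rbinom(n * n * TT, 1, 0.3), dim = c(n, n, TT))
Y[1, 2, 1] <- NA
Y[3, 4, 2] <- NA
Missing <- which(is.na(Y), arr.ind = TRUE)

X <- array(rnorm(p * TT * n), dim = c(p, TT, n))  # stand-in for current draws
alpha <- -1
s <- rep(1, n)

Y_complete <- draw_missing_edges(Y, Missing, X, alpha, s)
```

In a full sampler this draw would be repeated at every iteration with the current parameter values, and averaging the imputed values over iterations would estimate the probability that each missing dyad equals one.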

Using the `init` argument is, in general, strongly discouraged, as it may have a non-negligible negative effect on the performance of the VB algorithm or on the length of the chain needed to reach convergence. Unless otherwise specified, both the initialization scheme and the hyperparameters are chosen according to Sewell and Chen (2016).

An object of class dnc, for which other methods exist (e.g., methods(class="dnc")).

If method="VB" or M=0,

method: The estimation algorithm
Y: The original data
mu: A p x T x n array: Posterior mean of the latent positions
Sig: A (Tp) x p x n array: Posterior covariance matrices of the latent positions. The covariance matrix for X_{it} is the p x p block dncObj$Sig[(t-1)*p + 1:p,,i]
a0: Scalar: Posterior shape parameter for σ^2 in inverse gamma distribution (if M=0).
b0: Scalar: Posterior scale parameter for σ^2 in inverse gamma distribution (if M=0).
ai1: An n x 1 vector: Posterior mean parameter for the r_i's in truncated normal distribution (if M>0).
bi1: An n x 1 vector: Posterior variance parameter for the r_i's in truncated normal distribution (if M>0).
Er: An n x 1 vector: Posterior first moment for the r_i's (if M>0).
Er2: An n x 1 vector: Posterior second moment for the r_i's (if M>0).
ai2: An n x 1 vector: Posterior shape parameter for the τ_i's in gamma distribution.
bi2: An n x 1 vector: Posterior scale parameter for the τ_i's in gamma distribution.
a3: Scalar: Posterior mean for α.
b3: Scalar: Posterior variance for α.
ai4: An n x 1 vector: Posterior mean parameter for the s_j's in truncated normal distribution.
bi4: An n x 1 vector: Posterior variance parameter for the s_j's in truncated normal distribution.
Es: An n x 1 vector: Posterior first moment for the s_j's.
Es2: An n x 1 vector: Posterior second moment for the s_j's.
nu: An M x p matrix: Posterior mean directions for the M clusters/communities, i.e., for the u_m's (if M>0).
kappa: An M x 1 vector: Posterior concentration parameters for the M clusters/communities, i.e., for the u_m's (if M>0).
Z: An n x T matrix: Cluster assignments based on the maximum posterior probabilities, computed marginally at each time point (if M>0).
Bi0g: An n x M matrix: Posterior probabilities of community assignment for each actor at the first observed time point (if M>0).
Bithk: A (MT) x M x n array: Posterior transition probability matrices; π(Z_{itk}=1|Z_{i(t-1)h}=1,Y)= dncObj$Bithk[(t-1)*M+h,k,i]. The first M rows are for internal use only and should be ignored (if M>0).
Bitbar: A T x M x n array: Marginal posterior probabilities of community assignments, i.e., π(Z_{itk}=1|Y)= dncObj$Bitbar[t,k,i] (if M>0).
Gam: A (M+1) x M matrix: Posterior concentration parameters for β_0 (row 1) and for β_m, m≥1 (rows 2 to M+1) in Dirichlet distribution (if M>0).
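
The stacked array layouts above can be awkward to index by hand. The base R sketch below uses mock arrays with the documented dimensions (no model is fitted) to show one way of pulling out per-actor, per-time quantities. It assumes that the p x p covariance blocks in `Sig` are stacked row-wise by time point, so that the block for time t occupies rows (t-1)p+1 through tp; `cov_block` is a hypothetical helper written for this example.

```r
set.seed(4)
p <- 3; TT <- 4; n <- 5; M <- 2

# Mock (Tp) x p x n array standing in for dncObj$Sig.
Sig <- array(rnorm(TT * p * p * n), dim = c(TT * p, p, n))

# p x p block for actor i at time t (assumes row-wise stacking by time).
cov_block <- function(Sig, t, i, p) Sig[(t - 1) * p + seq_len(p), , i]
S_21 <- cov_block(Sig, t = 2, i = 1, p = p)

# Mock T x M x n array standing in for dncObj$Bitbar, normalized so the
# M membership probabilities sum to one for every (t, i) pair.
Bitbar <- array(runif(TT * M * n), dim = c(TT, M, n))
Bitbar <- sweep(Bitbar, c(1, 3), apply(Bitbar, c(1, 3), sum), `/`)

# Marginal hard assignments: most probable community per actor and time,
# arranged as an n x T matrix like the documented Z.
Z <- apply(Bitbar, c(3, 1), which.max)
```

With a fitted object, the mock arrays would be replaced by the corresponding elements of the returned list.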

If method="Gibbs",

method: The estimation algorithm
Y: The original data
X: A p x T x n x nDraws array: Posterior samples for the latent positions.
r: An n x nDraws matrix: Posterior samples for the r_i's.
tau: An n x nDraws matrix: Posterior samples for the τ_i's.
alpha: An nDraws x 1 vector: Posterior samples for α.
s: An n x nDraws matrix: Posterior samples for the s_j's.
u: An M x p x nDraws array: Posterior draws for the communities, i.e., the u_m's.
Z: An n x T x nDraws array: Posterior draws for the community assignments for each actor at each time point.
beta: A (M+1) x M x n array: Posterior draws for β_0 (row 1) and β_m, m≥1 (rows 2 to M+1).
posterior: A (burnin+nDraws) x 1 vector: Posterior values for all iterations of the Gibbs sampler.
Missing: A matrix of four columns: The row, column, and time for each missing dyad, as well as the posterior probability that the dyad equals one.

Additionally, each dnc class object comes with a $pm value, which is a list of the MAP estimates for alpha, X, s, tau, r, u, Z, and beta.
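
Beyond the MAP estimates, the stored draws can be summarized directly. The base R sketch below works on mock draws with the documented shapes (an nDraws-length vector for α and an n x T x nDraws array for Z) and computes a posterior mean, a 95% interval, and a modal community assignment. These summaries are not the MAP estimates stored in $pm, and the modal assignment ignores any label switching across draws; both points are limitations of this illustration.

```r
set.seed(5)
n <- 3; TT <- 2; M <- 3; nDraws <- 200

# Mock posterior draws standing in for dncObj$alpha and dncObj$Z.
alpha_draws <- rnorm(nDraws, mean = -1, sd = 0.1)
Z_draws <- array(sample(M, n * TT * nDraws, replace = TRUE),
                 dim = c(n, TT, nDraws))

# Posterior mean and equal-tailed 95% interval for alpha.
alpha_mean <- mean(alpha_draws)
alpha_ci   <- quantile(alpha_draws, c(0.025, 0.975))

# Most frequently sampled community for each actor at each time point.
Z_mode <- apply(Z_draws, c(1, 2),
                function(z) which.max(tabulate(z, nbins = M)))
```

Unlike variational Bayes, which tends to understate posterior spread, intervals computed from Gibbs draws in this way reflect the sampled posterior directly.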

Sewell, D. K., and Chen, Y. (2016). Latent Space Approaches to Community Detection in Dynamic Networks. Bayesian Analysis. doi: 10.1214/16-BA1000. http://projecteuclid.org/euclid.ba/1461603847

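  library(dnc)
  # Load the example dynamic network data
  data(friendship)
  set.seed(123)
  # Short Gibbs run with 4 communities; the small nDraws and burnin
  # values keep the example fast and are not meant for real analyses
  dncObj <- dnc(friendship,M=4,p=3,method="Gibbs",
                controls=list(nDraws=250,burnin=50,
                              MaxItStg2=25,epsilonStg2=1e-15))
  print(dncObj)
  BIC(dncObj)
  # Shrink the plot margins before plotting the fitted object
  par(mar=rep(0,4)+0.05)
  plot(dncObj,plotRGL=FALSE,pch=16,phi=60,lwd=2,cex=1.5)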