mcmcARD: Estimate network model using ARD


View source: R/mcmcARD.R

Description

mcmcARD estimates the network model proposed by McCormick and Zheng (2015).

Usage

mcmcARD(
  Y,
  traitARD,
  start,
  fixv,
  consb,
  iteration = 2000L,
  sim.d = TRUE,
  sim.zeta = TRUE,
  hyperparms = NULL,
  ctrl.mcmc = list()
)

Arguments

Y

is a matrix of ARD. The entry (i, k) is the number of i's friends having the trait k.

traitARD

is the matrix of traits for individuals with ARD. The entry (i, k) is equal to 1 if i has the trait k and 0 otherwise.

start

is a list containing starting values of 'z' (matrix of dimension N x p), 'v' (matrix of dimension K x p), 'd' (vector of dimension N), 'b' (vector of dimension K), 'eta' (vector of dimension K) and 'zeta' (scalar).

fixv

is a vector indicating which location parameters are fixed for identifiability. These fixed positions are used to rotate the latent surface back to a common orientation at each iteration using a Procrustes transformation (see McCormick and Zheng, 2015; Breza et al., 2020 and details).

consb

is a vector indicating the subset of bk that is constrained to the total size (see McCormick and Zheng, 2015; Breza et al., 2020 and details).

iteration

is the number of MCMC steps to be performed.

sim.d

is a logical indicating whether the degree 'd' will be updated in the MCMC. If 'sim.d = FALSE', the starting value of 'd' in the argument 'start' is held fixed throughout the MCMC.

sim.zeta

is a logical indicating whether the parameter 'zeta' will be updated in the MCMC. If 'sim.zeta = FALSE', the starting value of 'zeta' in the argument 'start' is held fixed throughout the MCMC.

hyperparms

is an 8-dimensional vector of the hyperparameters mud, sigmad, mub, sigmab, alphaeta, betaeta, alphazeta and betazeta (see details).

ctrl.mcmc

is a list of MCMC controls (See details).

Details

The linking probability is given by

Pij is proportional to exp(nui + nuj + zeta * zi' zj).
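As a rough illustration of the linking rule, the sketch below computes unnormalised link weights in base R, assuming the exponential form exp(nui + nuj + zeta * zi' zj) of McCormick and Zheng (2015). It does not call any PartialNetwork function; the sample size, the value of zeta and the way positions are drawn are all hypothetical choices made for the example, and the normalising constant is left out.

```r
# Illustrative sketch only: base R, no PartialNetwork functions.
set.seed(1)
N    <- 5                        # number of individuals (hypothetical)
p    <- 3                        # dimension of the space containing the sphere
zeta <- 1.5                      # hypothetical value of zeta
nu   <- rnorm(N, -1.35, 0.37)    # gregariousness parameters

# Uniform positions on the unit sphere: normalise Gaussian vectors
z <- matrix(rnorm(N * p), N, p)
z <- z / sqrt(rowSums(z^2))

# Unnormalised link weights exp(nu_i + nu_j + zeta * z_i' z_j)
W <- exp(outer(nu, nu, "+") + zeta * tcrossprod(z))
diag(W) <- 0                     # no self-links
```

Pairs of individuals that are close on the sphere (large zi' zj) or that are both gregarious (large nui and nuj) receive larger weights.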

McCormick and Zheng (2015) write the likelihood of the model in terms of the spherical coordinates zi, the trait locations vk, the degrees di, the fractions bk of ties in the network that are made with members of group k, the trait intensity parameters etak, and zeta. The following prior distributions are defined.

zi ~ Uniform von Mises Fisher

vk ~ Uniform von Mises Fisher

di ~ log-Normal(mud, sigmad)

bk ~ log-Normal(mub, sigmab)

etak ~ Gamma(alphaeta, betaeta)

zeta ~ Gamma(alphazeta, betazeta)
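To make the priors concrete, here is a minimal base R sketch that draws one set of parameters from them. Several points are assumptions rather than documented behaviour: the hyperparameter values are arbitrary, the vector is named and ordered as the hyperparameters are listed under 'hyperparms', sigmad and sigmab are read as standard deviations on the log scale, and Gamma(alpha, beta) is read as shape and rate. The uniform von Mises Fisher distribution is the uniform distribution on the sphere, drawn here by normalising Gaussian vectors.

```r
# Illustrative sketch only: hyperparameter values are hypothetical, and the
# sdlog and shape/rate readings of the priors are assumptions.
set.seed(2)
N <- 100; K <- 12; p <- 3

hyperparms <- c(mud = 0, sigmad = 1, mub = 0, sigmab = 1,
                alphaeta = 1, betaeta = 1, alphazeta = 1, betazeta = 1)

# Uniform draws on the unit sphere (normalised Gaussian vectors)
runif_sphere <- function(n, p) {
  x <- matrix(rnorm(n * p), n, p)
  x / sqrt(rowSums(x^2))
}

prior_draw <- list(
  z    = runif_sphere(N, p),
  v    = runif_sphere(K, p),
  d    = rlnorm(N, hyperparms[["mud"]], hyperparms[["sigmad"]]),
  b    = rlnorm(K, hyperparms[["mub"]], hyperparms[["sigmab"]]),
  eta  = rgamma(K, shape = hyperparms[["alphaeta"]], rate = hyperparms[["betaeta"]]),
  zeta = rgamma(1, shape = hyperparms[["alphazeta"]], rate = hyperparms[["betazeta"]])
)
```

The resulting list has the same component names and dimensions as the 'start' argument, so a draw of this kind could in principle serve as a set of starting values.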


For identification, some vk and bk need to be exogenously fixed around their given starting value (see McCormick and Zheng, 2015 for more details). The argument 'fixv' can be used to select the vk that are fixed, while 'consb' can be used to select the bk that are constrained.
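The description of 'fixv' mentions a Procrustes transformation that rotates the latent surface back to a common orientation. The sketch below shows the standard orthogonal Procrustes solution in base R; 'procrustes_rotation' is a hypothetical helper written for this example and is not claimed to be the routine the package uses internally.

```r
# Illustrative sketch only: `procrustes_rotation` is a hypothetical helper,
# not a PartialNetwork function.
procrustes_rotation <- function(current, reference) {
  s <- svd(crossprod(current, reference))   # SVD of t(current) %*% reference
  s$u %*% t(s$v)                            # orthogonal matrix (may include a reflection)
}

set.seed(3)
p <- 3
# Reference positions of the fixed traits (rows are unit vectors)
v_ref <- diag(p)

# A random orthogonal matrix plays the role of a drift in orientation
Q     <- qr.Q(qr(matrix(rnorm(p * p), p, p)))
v_cur <- v_ref %*% Q

# Rotate the current positions back onto the reference
R         <- procrustes_rotation(v_cur, v_ref)
v_aligned <- v_cur %*% R
```

In a sampler, the same matrix R would presumably be applied to every latent position (all zi and vk), so that distances on the sphere are preserved while the orientation is pinned down by the fixed traits.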

During the MCMC, the jumping scales are updated following Atchadé and Rosenthal (2005) so that the acceptance rate of each parameter approaches its 'target' value. This requires setting minimal and maximal jumping scales through the argument 'ctrl.mcmc'. The argument 'ctrl.mcmc' is a list which can contain the following named components.
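The idea behind the adaptive jumping scale can be sketched with a one-dimensional random-walk Metropolis sampler in base R. This is a toy version in the spirit of Atchadé and Rosenthal (2005), not the package's code: the names 'target', 'jumpmin' and 'jumpmax', their values, the standard normal target density and the step size 1/iter^0.6 are all assumptions made for the illustration.

```r
# Illustrative sketch only: names, values and step size are assumptions,
# not the package's internals.
set.seed(4)
target  <- 0.44     # desired acceptance rate
jumpmin <- 1e-6     # smallest allowed jumping scale
jumpmax <- 100      # largest allowed jumping scale
n_iter  <- 5000

x      <- 0         # current state; the target density is N(0, 1)
jump   <- 10        # deliberately poor initial jumping scale
accept <- logical(n_iter)

for (iter in seq_len(n_iter)) {
  prop         <- x + jump * rnorm(1)                 # random-walk proposal
  log_ratio    <- dnorm(prop, log = TRUE) - dnorm(x, log = TRUE)
  accept[iter] <- log(runif(1)) < log_ratio
  if (accept[iter]) x <- prop

  # Grow the scale after an acceptance, shrink it after a rejection,
  # with a step that decays over time
  jump <- jump + (accept[iter] - target) / iter^0.6
  jump <- min(max(jump, jumpmin), jumpmax)            # keep within bounds
}

mean(accept[-(1:1000)])   # empirical acceptance rate after warm-up
```

Clamping the scale to the interval between the minimal and maximal values keeps the adaptation from collapsing to zero or diverging, which is why bounds of this kind have to be supplied.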

Value

A list consisting of:

n

dimension of the sample with ARD.

K

number of traits.

p

hypersphere dimension.

time

elapsed time in seconds.

iteration

number of MCMC steps performed.

simulations

simulations from the posterior distribution.

hyperparms

return value of hyperparameters (updated and non updated).

accept.rate

list of acceptance rates.

start

starting values.

ctrl.mcmc

return value of 'ctrl.mcmc'.
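Judging from how the Examples index the output, 'simulations$d' is a matrix with one row per MCMC step and 'simulations$z' is a list with one N x p matrix per step. Under that assumption, posterior summaries can be sketched as follows. The object 'sims' is simulated stand-in data with those shapes, not real output of mcmcARD, and the burn-in length is arbitrary.

```r
# Illustrative sketch only: `sims` is simulated stand-in data with the shapes
# used in the Examples (d: iterations x N matrix; z: list of N x p matrices).
set.seed(5)
n_iter <- 200; N <- 10; p <- 3
sims <- list(
  d = matrix(rlnorm(n_iter * N), n_iter, N),
  z = replicate(n_iter, {
    x <- matrix(rnorm(N * p), N, p)
    x / sqrt(rowSums(x^2))
  }, simplify = FALSE)
)

burn <- 100                       # hypothetical burn-in length
keep <- (burn + 1):n_iter

# Posterior mean of each degree
d_hat <- colMeans(sims$d[keep, , drop = FALSE])

# Mean direction of each position: sum the draws, then project the
# result back onto the unit sphere
z_sum <- Reduce(`+`, sims$z[keep])
z_hat <- z_sum / sqrt(rowSums(z_sum^2))
```

Projecting back onto the sphere matters because a plain average of unit vectors generally has norm below one.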

References

Atchadé, Y. F., & Rosenthal, J. S. (2005). On adaptive Markov chain Monte Carlo algorithms. Bernoulli, 11(5), 815-828. https://projecteuclid.org/euclid.bj/1130077595.

Breza, E., Chandrasekhar, A. G., McCormick, T. H., & Pan, M. (2020). Using aggregated relational data to feasibly identify network structure without network data. American Economic Review, forthcoming. https://arxiv.org/abs/1703.04157

McCormick, T. H., & Zheng, T. (2015). Latent surface models for networks using Aggregated Relational Data. Journal of the American Statistical Association, 110(512), 1684-1695. https://amstat.tandfonline.com/doi/full/10.1080/01621459.2014.991395.

Examples

## Not run: 
library(PartialNetwork)

# Sample size
N       <- 500 

# ARD parameters
genzeta <- 1
mu      <- -1.35
sigma   <- 0.37
K       <- 12    # number of traits
P       <- 3     # Sphere dimension 


# Generate z (spherical coordinates)
genz    <- rvMF(N,rep(0,P))

# Generate nu  from a Normal distribution with parameters mu and sigma (The gregariousness)
gennu   <- rnorm(N,mu,sigma)

# compute degrees
gend <- N*exp(gennu)*exp(mu+0.5*sigma^2)*exp(logCpvMF(P,0) - logCpvMF(P,genzeta))

# Link probabilities
Probabilities <- sim.dnetwork(gennu,gend,genzeta,genz) 

# Adjacency matrix
G <- sim.network(Probabilities)

# Generate vk, the trait location
genv <- rvMF(K,rep(0,P))

# Fix some vk at mutually distant positions
genv[1,] <- c(1,0,0)
genv[2,] <- c(0,1,0)
genv[3,] <- c(0,0,1)

# eta, the intensity parameter
geneta   <-abs(rnorm(K,2,1))

# Build traits matrix
densityatz       <- matrix(0,N,K)
for(k in 1:K){
  densityatz[,k] <- dvMF(genz,genv[k,]*geneta[k])
}
trait       <- matrix(0,N,K)

for(k in 1:K){
  trait[,k] <- densityatz[,k]>sort(densityatz[,k],decreasing = TRUE)[runif(1,0.05*N,0.25*N)]
}
# percentage of people having each trait
colSums(trait)*100/N
  
# Build ARD
ARD         <- G %*% trait
  
# generate b
genb        <- numeric(K)
for(k in 1:K){
   genb[k]   <- sum(G[,trait[,k]==1])/sum(G)
}
  
############ ARD Posterior distribution ################### 
# initialization 
d0     <- exp(rnorm(N)); b0 <- exp(rnorm(K)); eta0 <- rep(1,K)
zeta0  <- 5; z0 <- matrix(rvMF(N,rep(0,P)),N); v0 <- matrix(rvMF(K,rep(0,P)),K)
  
# We need to fix some of the vk and bk for identification (see Breza et al. (2020) for details).
vfixcolumn      <- 1:5
bfixcolumn      <- c(3, 5)
b0[bfixcolumn]  <- genb[bfixcolumn]
v0[vfixcolumn,] <- genv[vfixcolumn,]
start  <- list("z" = z0, "v" = v0, "d" = d0, "b" = b0, "eta" = eta0, "zeta" = zeta0)
  
# MCMC
out   <- mcmcARD(Y = ARD, traitARD = trait, start = start, fixv = vfixcolumn,
                consb = bfixcolumn, iteration = 5000)
                   
# plot simulations
# plot d
plot(out$simulations$d[,4], type = "l", col = "blue", ylab = "")
abline(h = gend[4], col = "red")
  
# plot coordinates of individuals
i <- 123 # individual 123
{
  par(mfrow = c(3, 1))
  lapply(1:3, function(x) {
      plot(unlist(lapply(1:5000, function(w)
            out$simulations$z[[w]][i, x])) , type = "l", ylab = "", col = "blue", ylim = c(-1, 1))
                 abline(h = genz[i, x], col = "red")
  })
  par(mfrow = c(1, 1))
}

# plot coordinates of traits
k <- 8
{
  par(mfrow = c(3, 1))
    lapply(1:3, function(x) {
        plot(unlist(lapply(1:5000, function(w)
              out$simulations$v[[w]][k, x])) , type = "l", ylab = "", col = "blue", ylim = c(-1, 1))
                  abline(h = genv[k, x], col = "red")
    })
  par(mfrow = c(1, 1))
}
## End(Not run) 

ahoundetoungan/PartialNetwork documentation built on Oct. 6, 2020, 1:51 a.m.