multiNet: Latent Space Models for Multivariate Networks
In spaceNet: Latent Space Models for Multidimensional Networks



Implements latent space models for multivariate networks (multiplex) via an MCMC algorithm.

multiNet(Y, niter = 1000, D = 2,
         muA = 0, tauA = NULL, nuA = 3,
         muB = 0, tauB = NULL, nuB = 3,
         muL = 0, tauL = NULL, nuL = 3,
         alphaRef = NULL,
         sender = c("const", "var"),
         receiver = c("const", "var"),
         covariates = NULL,
         DIC = FALSE, WAIC = FALSE,
         burnIn = round(niter*0.3),
         trace = TRUE,
         allChains = FALSE,
         refSpace = NULL)

`Y`	A three-dimensional array or list of n x n adjacency matrices composing the multidimensional network. A list will be converted to an array. If an array, the dimension of `Y` must be `(n,n,K)`, where `n` is the number of nodes and `K` the number of networks. Missing values (`NA`) are allowed; see details.
`niter`	The number of MCMC iterations. The default value is `niter = 1000`.
`D`	The dimension of the latent space, with `D > 1`. The default value is `D = 2`.
`muA, muB, muL`	Mean hyperparameters, see details.
`tauA, tauB, tauL`	Scale hyperparameters for the variances of the mean parameters, see details.
`nuA, nuB, nuL`	Variance hyperparameters, see details.
`alphaRef`	The value for the intercept in the first network (the reference network). This value can be specified by the user on the basis of prior knowledge. By default it is computed using the function `alphaRef`, see details.
`sender, receiver`	The type of node-specific sender and receiver effects to be included in the model. If specified, each effect can be set to either constant (`"const"`) or variable (`"var"`). By default, node-specific effects are not included in the model (`NULL`).
`covariates`	An array or a list with edge-covariates matrices. A list is automatically converted to an array. Covariates can be either continuous or discrete and must be constant throughout the views of the multiplex. The dimension of `covariates` is `(n,n,L)`, where `n` is the number of nodes and `L` the number of covariates, that is, the number of covariates matrices. Missing values (`NA`) are not allowed.
`DIC`	A logical value indicating whether the DIC (Deviance Information Criterion) should be computed. The default is `DIC = FALSE`.
`WAIC`	A logical value indicating whether the WAIC (Widely Applicable Information Criterion) should be computed. The default is `WAIC = FALSE`.
`burnIn`	A numerical value, the number of iterations of the chain to be discarded when computing the posterior estimates. The default value is `burnIn = round(niter*0.3)`.
`trace`	A logical value indicating if a progress bar should be printed.
`allChains`	A logical value indicating if the full parameter chains should also be returned in output. The default value is `allChains = FALSE`.
`refSpace`	Optional. A matrix containing a set of reference values for the latent coordinates of the nodes. Its dimension must be `(n, D)`, where `n` is the number of nodes and `D` the number of dimensions of the latent space. The coordinates stored in the matrix `refSpace` are compared with the estimated ones at each iteration via Procrustes correlation. High values of the correlation index indicate that the estimated coordinates are a translation and/or a rotation of the coordinates in `refSpace`.
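
As an illustration of the input shapes described above, the following minimal sketch builds an `(n,n,K)` array from a list of adjacency matrices and an `(n,n,L)` covariate array. The toy data and the names `netList` and `x` are hypothetical and are not part of the package; the conversion shown is one way to do it in base R, not necessarily what `multiNet` does internally.

```r
# Hypothetical toy data: n = 4 nodes, K = 2 networks (views).
set.seed(1)
n <- 4
netList <- list(
  matrix(rbinom(n * n, 1, 0.5), n, n),
  matrix(rbinom(n * n, 1, 0.5), n, n)
)

# Stack the list of n x n matrices into an (n, n, K) array.
Y <- simplify2array(netList)
dim(Y)

# One edge covariate, constant across views, stored as an (n, n, L) array
# with L = 1. Covariates must not contain NA.
x <- matrix(rnorm(n * n), n, n)
covariates <- array(x, dim = c(n, n, 1))
dim(covariates)
```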

The function estimates a latent space model for multidimensional networks (multiplex) via MCMC. The model assumes that the probability of observing an arc between any two nodes is inversely related to their distance in a low-dimensional latent space. Hence, nodes close in the latent space have a higher probability of being connected across the views of the multiplex than nodes far apart. The model allows the inclusion of node-specific sender and receiver effects and edge-specific covariates.

The probability of an edge between nodes i and j in the k_{th} network is defined as:

P ( y_{ijk} = 1 | Ω_k , d_{ij} , λ ) = C_{ijk} / ( 1 + C_{ijk} ),

with C_{ijk} = exp( α_k - β_k * d_{ij} - λ * x_{ij} ) when node-specific effects are not present and C_{ijk} = exp( α_k φ_{ijk} - β_k * d_{ij} - λ * x_{ij} ) when they are included in the model.
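
The edge probability without node-specific effects can be sketched in a few lines of base R. The function name `edgeProb` and the numeric values are hypothetical, chosen only for illustration; d_{ij} is taken to be the squared Euclidean distance between latent positions, as in the model description.

```r
# Edge probability without node-specific effects:
# P = C / (1 + C), with C = exp(alpha_k - beta_k * d_ij - lambda * x_ij).
# 'edgeProb' is a hypothetical helper, not a function of the package.
edgeProb <- function(alpha, beta, d, lambda = 0, x = 0) {
  # plogis(z) computes exp(z) / (1 + exp(z))
  plogis(alpha - beta * d - lambda * x)
}

# Two hypothetical latent positions in a 2-dimensional space.
z1 <- c(0, 0)
z2 <- c(1, 1)
d12 <- sum((z1 - z2)^2)  # squared Euclidean distance

pFar  <- edgeProb(alpha = 1, beta = 1, d = d12)
pNear <- edgeProb(alpha = 1, beta = 1, d = 0)
pNear > pFar  # closer nodes are more likely to be connected
```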

The arguments of C_{ijk} are:

The squared Euclidean distance between nodes i and j in the latent space, d_{ij}.
A coefficient λ to scale the edge-specific covariate x_{ij}. If more than one covariate is introduced in the model, their sum is considered, with each covariate being rescaled by a specific coefficient λ_l. Edge-specific covariates are assumed to be inversely related to edge probabilities, hence λ ≥ 0.
A vector of network-specific parameters, Ω_k = ( α_k, β_k ). These parameters are:
- A rescaling coefficient β_k, which weights the importance of the latent space in the k_{th} network, with β_k ≥ 0. In the first network (that is, the reference network), the coefficient is fixed to β_1 = 1 for identifiability reasons.
- An intercept parameter α_k, which corresponds to the largest edge probability allowed in the k_{th} network. Indeed, when β_k = 0 and when no covariate is included, the probability of having a link between a pair of nodes is that of the random graph:
  
  P ( y_{ijk} = 1 | α_k ) = exp( α_k ) / ( 1 + exp( α_k ) ).
  
  The intercepts have a lower bound corresponding to log ( log( n ) / ( n - log( n ) ) ). For identifiability reasons, the intercept of the first network needs to be fixed. Its value can be either specified by the user on the basis of prior knowledge or computed with the function alphaRef.
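
The lower bound on the intercepts can be checked numerically. Plugging it into the random graph formula above gives an edge probability of log(n) / n, which follows directly from the two expressions. The helper names below are hypothetical, and `n = 29` is an arbitrary value used for illustration.

```r
# Hypothetical helpers, written only to illustrate the formulas above.
alphaLowerBound <- function(n) log(log(n) / (n - log(n)))
randomGraphProb <- function(alpha) exp(alpha) / (1 + exp(alpha))

n <- 29                    # arbitrary number of nodes
lb <- alphaLowerBound(n)

# At the lower bound, the random graph edge probability equals log(n) / n.
pLower <- randomGraphProb(lb)
all.equal(pLower, log(n) / n)
```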
When node-specific effects are included in the model,

φ_{ijk} = g*(θ_{ik} + γ_{jk})

with:
- θ_{ik} the sender effect of node i in network k.
- γ_{jk} the receiver effect of node j in network k.
- g a scalar. When both sender and receiver effects are present, g=0.5; when only one type of effect is included in the model, g=1.
When the sender and/or receiver effects are set to constant ("const"), each node i is assumed to have a constant effect across the different networks: θ_{ik} = θ_{i} and/or γ_{ik} = γ_{i}. Instead, when they are set to variable ("var"), each node has a different effect across the networks: θ_{ik} and/or γ_{ik}.
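
A minimal sketch of the term φ_{ijk} for a single network follows. The function `nodeEffect` is hypothetical, and treating an absent effect as contributing zero is an assumption made here so that g = 1 reproduces the single-effect case; the package may implement this differently.

```r
# phi_ij = g * (theta_i + gamma_j) for one network k.
# 'nodeEffect' is a hypothetical helper, not part of the package.
nodeEffect <- function(theta = NULL, gamma = NULL) {
  if (is.null(theta) && is.null(gamma)) {
    stop("at least one of 'theta' and 'gamma' must be supplied")
  }
  n <- max(length(theta), length(gamma))
  # g = 0.5 when both effects are present, g = 1 when only one is.
  g <- if (!is.null(theta) && !is.null(gamma)) 0.5 else 1
  # Assumption: an absent effect contributes zero to the sum.
  th <- if (is.null(theta)) rep(0, n) else theta
  ga <- if (is.null(gamma)) rep(0, n) else gamma
  g * outer(th, ga, "+")  # entry [i, j] is g * (theta_i + gamma_j)
}

theta <- c(0.2, 0.8)  # hypothetical sender effects
gamma <- c(0.4, 0.6)  # hypothetical receiver effects

phiBoth   <- nodeEffect(theta, gamma)   # both effects, g = 0.5
phiSender <- nodeEffect(theta = theta)  # sender effect only, g = 1
```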

Inference on the model parameters is carried out via an MCMC algorithm. A hierarchical framework is adopted for estimation, where the parameters of the distributions of α, β and λ are considered nuisance parameters and assumed to follow hyper-prior distributions. The parameters of these hyperpriors need to be fixed and are the following:

tauA, tauB and tauL are the scale factors for the variances of the hyperprior distributions for the mean parameters of α_k , β_k and λ_l. If not specified by the user, tauA and tauB are computed as ( K - 1 ) / K , if K > 1, otherwise they are set to 0.5. Parameter tauL is calculated as ( L - 1 ) / K , if L > 1, otherwise it is set to 0.5.
muA, muB and muL are the means of the hyperprior distributions for the mean parameters of α_k , β_k and λ_l. If not specified by the user, they are all set to 0.
nuA, nuB and nuL are the degrees of freedom of the hyperprior distributions for the variance parameters of α_k , β_k and λ_l. If not specified by the user, they are all set to 3.
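
The default values of the scale factors described above can be written out as a short function. `defaultTau` is a hypothetical name; the sketch simply transcribes the rules stated in this section and is not taken from the package source.

```r
# Default scale factors, transcribed from the rules stated above.
# 'defaultTau' is a hypothetical helper, not a function of the package.
defaultTau <- function(K, L = 0) {
  tauAB <- if (K > 1) (K - 1) / K else 0.5
  tauL  <- if (L > 1) (L - 1) / K else 0.5
  list(tauA = tauAB, tauB = tauAB, tauL = tauL)
}

tauMulti  <- defaultTau(K = 3, L = 2)  # three networks, two covariates
tauSingle <- defaultTau(K = 1)         # single network, no covariates
```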

Missing data are considered structural: they correspond to edges that are missing because one or more nodes are not observable in some of the networks of the multiplex. No imputation is performed; instead, the term corresponding to the missing edge is discarded when computing the likelihood function. For example, if either node i or j is not observable in network k, the edge (i,j) is missing and the likelihood function for network k is calculated discarding the corresponding (i,j) term. Notice that the model assumes a single common generative latent space for the whole multidimensional network. Thus, discarding the (i,j) term in the k_{th} network does not prevent the coordinates of nodes i and j from being recovered in the latent space.
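
The treatment of missing edges can be illustrated with a log-likelihood for one network in which `NA` terms are simply dropped. This is a sketch under stated assumptions: `logLikNetwork` is a hypothetical helper, covariates and node-specific effects are omitted, self-loops are excluded, and the data are simulated.

```r
# Bernoulli log-likelihood of one network, discarding missing edges.
# 'logLikNetwork' is a hypothetical helper, not part of the package.
logLikNetwork <- function(Yk, alpha, beta, D2) {
  eta <- alpha - beta * D2           # log of C_ijk, no covariates or effects
  ll <- Yk * eta - log1p(exp(eta))   # log P(y_ijk); NA where Yk is NA
  diag(ll) <- NA                     # assumption: self-loops are not modelled
  sum(ll, na.rm = TRUE)              # missing terms are dropped, not imputed
}

set.seed(42)
n <- 5
Z  <- matrix(rnorm(n * 2), n, 2)     # hypothetical latent coordinates
D2 <- as.matrix(dist(Z))^2           # squared Euclidean distances
Yk <- matrix(rbinom(n * n, 1, 0.4), n, n)

# Node 3 is not observable in this network: all its edges are missing.
Yk[3, ] <- NA
Yk[, 3] <- NA

llObserved <- logLikNetwork(Yk, alpha = 0.5, beta = 1, D2 = D2)
```

Dropping the `NA` terms gives the same value as evaluating the likelihood on the sub-network without node 3, while the node's coordinates can still be informed by the other networks in which it is observed.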

An object of class 'multiNet' containing the following components:

`n`	The number of nodes in the multidimensional network.
`K`	The number of networks in the multidimensional network.
`D`	The number of dimensions of the estimated latent space.
`parameters`	A list with components `alpha` (intercept parameters), `beta` (latent space coefficient parameters), `theta` (sender effect parameters), `gamma` (receiver effect parameters) and `lambda` (covariate coefficient parameters). Each component is itself a list with two components: the means and the standard deviations of the corresponding posterior distributions.
`latPos`	A list with posterior estimates of means and standard deviations of the latent coordinates.
`accRates`	A list with the following components: `alpha` is a vector with the acceptance rates for the intercept parameters; `beta` is a vector with the acceptance rates for the latent space coefficient parameters; `theta` is a matrix with the acceptance rates for the sender effect parameters; `gamma` is a matrix with the acceptance rates for the receiver effect parameters; `lambda` is a vector with the acceptance rates for the covariate coefficient parameters; `latPos` is a vector with the acceptance rates for the latent coordinates of the nodes.
`DIC`	The Deviance Information Criterion of the estimated model. Computed only if `DIC = TRUE` in input.
`WAIC`	The Widely Applicable Information Criterion of the estimated model. Computed only if `WAIC = TRUE` in input.
`allChains`	If `allChains = TRUE`, a list with the following components is returned: `parameters` is a list with the estimated posterior distributions of the model parameters: α, β, θ, γ and λ; `latPos` is an array with the posterior distributions of the latent coordinates of each node; `priorParameters` is a list with the estimated posterior distributions of the parameters of the prior distributions of α, β and λ.
`corrRefSpace`	A numerical vector containing the values of the Procrustes correlation between the reference space and the estimated one, computed at each MCMC iteration. Only returned when `refSpace` is given, otherwise `NULL`.
`info`	A list with some information on the estimated model: `call` contains the function call; `niter` is the number of MCMC iterations; `burnIn` is the number of initial iterations discarded when computing the estimates; `sender` is the node-specific sender effect type; `receiver` is the node-specific receiver effect type; `covariates` is the covariates array, if present; `L` is the number of covariates.
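
To give an idea of what the values in `corrRefSpace` measure, the sketch below computes a symmetric Procrustes correlation in base R. This is an assumption about the general technique, not the package's own code: `procrustesCorr` is a hypothetical helper, and this variant is invariant to translation, rotation, reflection and rescaling, which may differ in detail from what `multiNet` computes.

```r
# Symmetric Procrustes correlation between two n x D configurations.
# 'procrustesCorr' is a hypothetical helper, not part of the package.
procrustesCorr <- function(X, Y) {
  # Remove translation by centring each configuration.
  X <- scale(X, center = TRUE, scale = FALSE)
  Y <- scale(Y, center = TRUE, scale = FALSE)
  # Remove scale by normalising to unit Frobenius norm.
  X <- X / sqrt(sum(X^2))
  Y <- Y / sqrt(sum(Y^2))
  # The sum of singular values of X'Y is the correlation after the
  # optimal orthogonal alignment; it lies between 0 and 1.
  sum(svd(crossprod(X, Y))$d)
}

set.seed(7)
Zref <- matrix(rnorm(20), 10, 2)     # hypothetical reference coordinates

# A rotated and translated copy of the reference space.
angle <- pi / 3
R <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), 2, 2)
Zest <- sweep(Zref %*% R, 2, c(3, -1), "+")

corrSame  <- procrustesCorr(Zref, Zest)                       # close to 1
corrOther <- procrustesCorr(Zref, matrix(rnorm(20), 10, 2))   # unrelated
```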

D'Angelo, S., Murphy, T. B. and Alfò, M. (2018). Latent space modeling of multidimensional networks with application to the exchange of votes in the Eurovision Song Contest. arXiv.

D'Angelo, S., Alfò, M. and Murphy, T. B. (2018). Node-specific effects in latent space modelling of multidimensional networks. arXiv.

alphaRef

library(spaceNet)
data(vickers)

it <- 10     # small number of iterations just for example

# 2-dimensional latent space model, no covariates
mod <- multiNet(vickers, niter = it, D = 2)

# 2-dimensional latent space model, sex as covariate
mod <- multiNet(vickers, niter = it, D = 2,
                covariates = sex)

# 2-dimensional latent space model, with constant sender
# effect and variable receiver effect
mod <- multiNet(vickers, niter = it, D = 2,
                sender = "const", receiver = "var")