specify_priors: Specify prior hyperparameters for EM algorithm

specify_priors R Documentation

Description

A function that allows the user to specify the prior hyperparameters for the EM algorithm in a structure accepted by JANE.

Usage

specify_priors(
  D = 2,
  K = 2,
  model,
  family = "bernoulli",
  noise_weights = FALSE,
  n_interior_knots = NULL,
  a,
  b,
  c,
  G,
  nu,
  e,
  f,
  h,
  l,
  e_2,
  f_2,
  m_1,
  o_1,
  m_2,
  o_2
)

Arguments

D

An integer specifying the dimension of the latent positions (default is 2).

K

An integer specifying the total number of clusters (default is 2).

model

A character string specifying the model:

  • 'NDH': undirected network with no degree heterogeneity (or connection strength heterogeneity when working with a weighted network)

  • 'RS': undirected network with degree heterogeneity (and connection strength heterogeneity when working with a weighted network)

  • 'RSR': directed network with degree heterogeneity (and connection strength heterogeneity when working with a weighted network)

family

A character string specifying the distribution of the edge weights.

  • 'bernoulli': for unweighted networks; utilizes a Bernoulli distribution with a logit link (default)

  • 'lognormal': for weighted networks with positive, non-zero, continuous edge weights; utilizes a log-normal distribution with an identity link

  • 'poisson': for weighted networks with edge weights representing non-zero counts; utilizes a zero-truncated Poisson distribution with a log link

noise_weights

A logical. If TRUE, a Hurdle model is used to account for noise weights. If FALSE, the supplied network is used directly (a weighted network is first converted to an unweighted binary network, i.e., (A > 0.0)*1.0) and a latent space cluster model is fit (default is FALSE).

n_interior_knots

An integer specifying the number of interior knots of the natural cubic spline used in models with degree heterogeneity (and connection strength heterogeneity when working with a weighted network), i.e., 'RS' and 'RSR' only (default is NULL).

a

A numeric vector of length D specifying the mean of the multivariate normal prior on \mu_k for k = 1,\ldots,K, where \mu_k represents the mean of the multivariate normal distribution for the latent positions of the k^{th} cluster.

b

A positive numeric scalar specifying the scaling factor on the precision of the multivariate normal prior on \mu_k for k = 1,\ldots,K, where \mu_k represents the mean of the multivariate normal distribution for the latent positions of the k^{th} cluster.

c

A numeric scalar \ge D specifying the degrees of freedom of the Wishart prior on \Omega_k for k = 1,\ldots,K, where \Omega_k represents the precision of the multivariate normal distribution for the latent positions of the k^{th} cluster.

G

A numeric D \times D matrix specifying the inverse of the scale matrix of the Wishart prior on \Omega_k for k = 1,\ldots,K, where \Omega_k represents the precision of the multivariate normal distribution for the latent positions of the k^{th} cluster.

nu

A positive numeric vector of length K specifying the concentration parameters of the Dirichlet prior on p, where p represents the mixture weights of the finite multivariate normal mixture distribution for the latent positions.

e

A numeric vector of length 1 + (model =='RS')*(n_interior_knots + 1) + (model =='RSR')*2*(n_interior_knots + 1) specifying the mean of the multivariate normal prior on \beta_{LR}, where \beta_{LR} represents the coefficients of the logistic regression model.

f

A numeric positive semi-definite (p.s.d.) square matrix of dimension 1 + (model =='RS')*(n_interior_knots + 1) + (model =='RSR')*2*(n_interior_knots + 1) specifying the precision of the multivariate normal prior on \beta_{LR}, where \beta_{LR} represents the coefficients of the logistic regression model.

h

A positive numeric scalar specifying the first shape parameter for the Beta prior on q, where q is the proportion of non-edges in the "true" underlying network converted to noise edges. Only relevant when noise_weights = TRUE.

l

A positive numeric scalar specifying the second shape parameter for the Beta prior on q, where q is the proportion of non-edges in the "true" underlying network converted to noise edges. Only relevant when noise_weights = TRUE.

e_2

A numeric vector of length 1 + (model =='RS')*(n_interior_knots + 1) + (model =='RSR')*2*(n_interior_knots + 1) specifying the mean of the multivariate normal prior on \beta_{GLM}, where \beta_{GLM} represents the coefficients of the zero-truncated Poisson or log-normal GLM. Only relevant when noise_weights = TRUE & family != 'bernoulli'.

f_2

A numeric positive semi-definite (p.s.d.) square matrix of dimension 1 + (model =='RS')*(n_interior_knots + 1) + (model =='RSR')*2*(n_interior_knots + 1) specifying the precision of the multivariate normal prior on \beta_{GLM}, where \beta_{GLM} represents the coefficients of the zero-truncated Poisson or log-normal GLM. Only relevant when noise_weights = TRUE & family != 'bernoulli'.

m_1

A positive numeric scalar specifying the shape parameter for the Gamma prior on \tau^2_{weights}, where \tau^2_{weights} is the precision (on the log scale) of the log-normal weight distribution. Note that this value is scaled by 0.5; see 'Details'. Only relevant when noise_weights = TRUE & family = 'lognormal'.

o_1

A positive numeric scalar specifying the rate parameter for the Gamma prior on \tau^2_{weights}, where \tau^2_{weights} is the precision (on the log scale) of the log-normal weight distribution. Note that this value is scaled by 0.5; see 'Details'. Only relevant when noise_weights = TRUE & family = 'lognormal'.

m_2

A positive numeric scalar specifying the shape parameter for the Gamma prior on \tau^2_{noise \ weights}, where \tau^2_{noise \ weights} is the precision (on the log scale) of the log-normal noise weight distribution. Note that this value is scaled by 0.5; see 'Details'. Only relevant when noise_weights = TRUE & family = 'lognormal'.

o_2

A positive numeric scalar specifying the rate parameter for the Gamma prior on \tau^2_{noise \ weights}, where \tau^2_{noise \ weights} is the precision (on the log scale) of the log-normal noise weight distribution. Note that this value is scaled by 0.5; see 'Details'. Only relevant when noise_weights = TRUE & family = 'lognormal'.
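
The length required for e and e_2 (and the dimension of f and f_2) follows from the expression given in the argument descriptions above. A minimal base R sketch of that arithmetic is shown here. The helper name n_coef is hypothetical and not part of JANE, and defaulting n_interior_knots to 0 is an assumption made only so the 'NDH' case evaluates cleanly.

```r
# Hypothetical helper (not part of JANE): number of regression coefficients
# implied by `model` and `n_interior_knots`, following
# 1 + (model == 'RS')*(n_interior_knots + 1) + (model == 'RSR')*2*(n_interior_knots + 1)
n_coef <- function(model, n_interior_knots = 0L) {
  model <- match.arg(model, c("NDH", "RS", "RSR"))
  1L + (model == "RS") * (n_interior_knots + 1L) +
    (model == "RSR") * 2L * (n_interior_knots + 1L)
}

n_coef("NDH")                         # 1
n_coef("RS",  n_interior_knots = 5L)  # 7
n_coef("RSR", n_interior_knots = 5L)  # 13
```

With model = "RS" and n_interior_knots = 5, the mean vector therefore needs 7 elements and the precision matrix must be 7 x 7.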

Details

Prior on \boldsymbol{\mu}_k and \boldsymbol{\Omega}_k (note: the same prior is used for k = 1,\ldots,K):

\boldsymbol{\Omega}_k \sim Wishart(c, \boldsymbol{G}^{-1})

\boldsymbol{\mu}_k | \boldsymbol{\Omega}_k \sim MVN(\boldsymbol{a}, (b\boldsymbol{\Omega}_k)^{-1})
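
To get a feel for what a given choice of a, b, c, and G implies, one can draw from this prior with base R. The sketch below is illustrative only: the hyperparameter values are arbitrary assumptions, and c_df is a stand-in name for the argument c, chosen here to avoid masking base::c.

```r
set.seed(1)
D    <- 2
a    <- rep(0, D)   # illustrative prior mean for mu_k
b    <- 1           # illustrative scaling factor on the precision
c_df <- D + 1       # degrees of freedom (stand-in for argument `c`), must be >= D
G    <- diag(D)     # inverse of the Wishart scale matrix

# Omega_k ~ Wishart(c, G^{-1}); stats::rWishart returns a D x D x n array
Omega_k <- stats::rWishart(1, df = c_df, Sigma = solve(G))[, , 1]

# mu_k | Omega_k ~ MVN(a, (b * Omega_k)^{-1}), drawn via a Cholesky factor:
# chol() returns upper-triangular R with t(R) %*% R = Sigma_mu
Sigma_mu <- solve(b * Omega_k)
mu_k <- a + drop(t(chol(Sigma_mu)) %*% rnorm(D))
```

Under this parameterization the prior mean of \boldsymbol{\Omega}_k is c\boldsymbol{G}^{-1}, so a larger G corresponds to lower expected precision, i.e., more diffuse clusters.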

Prior on \boldsymbol{p}:

For the current implementation we require that all elements of the nu vector be \ge 1 to prevent negative mixture weights for empty clusters.

\boldsymbol{p} \sim Dirichlet(\nu_1 ,\ldots,\nu_K)
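
The requirement that every element of nu be at least 1 can be motivated with the textbook MAP update for mixture weights under a Dirichlet prior. The sketch below assumes that generic form; it is not taken from JANE's source, and map_weights is a hypothetical helper name.

```r
# Generic MAP estimate of mixture weights under a Dirichlet(nu) prior
# (textbook form, assumed here; not copied from JANE):
#   p_k = (n_k + nu_k - 1) / (sum(n) + sum(nu) - K)
map_weights <- function(n_k, nu) {
  (n_k + nu - 1) / (sum(n_k) + sum(nu) - length(nu))
}

n_k <- c(60, 40, 0)                         # expected cluster sizes; third cluster is empty

w_ok  <- map_weights(n_k, nu = rep(2, 3))   # nu >= 1: all weights are non-negative
w_bad <- map_weights(n_k, nu = rep(0.5, 3)) # nu < 1: the empty cluster gets a negative weight
```

With nu_k = 1 an empty cluster receives weight exactly 0, which is why 1 is the boundary value.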

Prior on \boldsymbol{\beta}_{LR}:

\boldsymbol{\beta}_{LR} \sim MVN(\boldsymbol{e}, \boldsymbol{F}^{-1})

Prior on q:

q \sim Beta(h, l)

Zero-truncated Poisson

Prior on \boldsymbol{\beta}_{GLM}:

\boldsymbol{\beta}_{GLM} \sim MVN(\boldsymbol{e}_{2}, \boldsymbol{F}_{2}^{-1})

Log-normal

Prior on \tau^2_{weights}:

\tau^2_{weights} \sim Gamma(\frac{m_1}{2}, \frac{o_1}{2})

Prior on \boldsymbol{\beta}_{GLM}:

\boldsymbol{\beta}_{GLM}|\tau^2_{weights} \sim MVN(\boldsymbol{e}_{2}, (\tau^2_{weights}\boldsymbol{F}_{2})^{-1})

Prior on \tau^2_{noise \ weights}:

\tau^2_{noise \ weights} \sim Gamma(\frac{m_2}{2}, \frac{o_2}{2})
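
Because the supplied shape and rate are each scaled by 0.5, it can help to check the implied prior moments before choosing m_1, o_1, m_2, and o_2. The following base R sketch uses arbitrary illustrative values and shows the arithmetic for m_1 and o_1; the same applies to m_2 and o_2.

```r
m_1 <- 2   # illustrative value, as it would be passed to specify_priors
o_1 <- 2   # illustrative value, as it would be passed to specify_priors

# The Gamma prior uses half of each supplied value
shape <- m_1 / 2
rate  <- o_1 / 2

prior_mean <- shape / rate     # equals m_1 / o_1
prior_var  <- shape / rate^2   # equals 2 * m_1 / o_1^2

# Draws of the precision tau^2_{weights} under this prior
set.seed(1)
tau2_draws <- rgamma(1e4, shape = shape, rate = rate)
```

The scaling leaves the prior mean at m_1 / o_1 but doubles the variance relative to an unscaled Gamma(m_1, o_1).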

Unevaluated calls (e.g., created with quote()) can be supplied as values for specific hyperparameters. This is particularly useful when running JANE for multiple combinations of K and D. See the 'Examples' section below for implementation examples.
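
The idea behind unevaluated calls can be illustrated with base R alone. This sketch only shows how a quoted expression yields a different value for each D; how JANE evaluates such calls internally is not shown here, and the object names are illustrative.

```r
# quote() stores the expression itself, not its value
a_call <- quote(rep(1, D))
G_call <- quote(10 * diag(D))

# Evaluate the same calls for several candidate dimensions
priors_by_D <- lapply(2:4, function(D) {
  list(a = eval(a_call, list(D = D)),
       G = eval(G_call, list(D = D)))
})

length(priors_by_D[[1]]$a)  # 2 (D = 2)
dim(priors_by_D[[3]]$G)     # 4 4 (D = 4)
```

A fixed value such as a = rep(1, 3) would only be valid for D = 3, whereas the quoted form adapts to each dimension considered.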

Value

A list of S3 class "JANE.priors" representing prior hyperparameters for the EM algorithm, in a structure accepted by JANE.

Examples


# Simulate network
mus <- matrix(c(-1,-1,1,-1,1,1), 
              nrow = 3,
              ncol = 2, 
              byrow = TRUE)
omegas <- array(c(diag(rep(7,2)),
                  diag(rep(7,2)), 
                  diag(rep(7,2))), 
                  dim = c(2,2,3))
p <- rep(1/3, 3)
beta0 <- 1.0
sim_data <- JANE::sim_A(N = 100L, 
                        model = "RS",
                        mus = mus, 
                        omegas = omegas, 
                        p = p, 
                        params_LR = list(beta0 = beta0), 
                        remove_isolates = TRUE)

# Specify prior hyperparameters
D <- 3L
K <- 5L
n_interior_knots <- 5L

a <- rep(1, D)
b <- 3
c <- 4
G <- 10*diag(D)
nu <- rep(2, K)
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))

my_prior_hyperparameters <- specify_priors(D = D,
                                           K = K,
                                           model = "RS",
                                           n_interior_knots = n_interior_knots,
                                           a = a,
                                           b = b,
                                           c = c,
                                           G = G,
                                           nu = nu,
                                           e = e,
                                           f = f)

# Run JANE on simulated data using supplied prior hyperparameters
res <- JANE::JANE(A = sim_data$A,
                  D = D,
                  K = K,
                  initialization = "GNN",
                  model = "RS",
                  case_control = FALSE,
                  DA_type = "none",
                  control = list(priors = my_prior_hyperparameters))

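# Specify prior hyperparameters as unevaluated calls: quote() stores the
# expression so that it can be evaluated later for each value of D and K
# (b is reused from the example above)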
n_interior_knots <- 5L
e <- rep(0.5, 1 + (n_interior_knots + 1))
f <- diag(c(0.1, rep(0.5, n_interior_knots + 1)))

my_prior_hyperparameters <- specify_priors(model = "RS",
                                           n_interior_knots = n_interior_knots,
                                           a = quote(rep(1, D)),
                                           b = b,
                                           c = quote(D + 1),
                                           G = quote(10*diag(D)),
                                           nu = quote(rep(2, K)),
                                           e = e,
                                           f = f)

# Run JANE on simulated data using supplied prior hyperparameters (NOT RUN)
# future::plan(future::multisession, workers = 5)
# res <- JANE::JANE(A = sim_data$A,
#                    D = 2:5,
#                    K = 2:10,
#                    initialization = "GNN",
#                    model = "RS",
#                    case_control = FALSE,
#                    DA_type = "none",
#                    control = list(priors = my_prior_hyperparameters))
# future::plan(future::sequential)

JANE documentation built on Aug. 12, 2025, 1:08 a.m.