lologVariational: Fits a latent ordered network model using Monte Carlo...

View source: R/lolog-variational.R

Description

Fits a latent ordered network model using Monte Carlo variational inference

Usage

lologVariational(
  formula,
  nReplicates = 5L,
  dyadInclusionRate = NULL,
  edgeInclusionRate = NULL,
  targetFrameSize = 5e+05
)

Arguments

formula

A lolog formula. See lolog.

nReplicates

An integer controlling how many dyad orderings to draw.

dyadInclusionRate

Controls what proportion of non-edges in each ordering should be included; the remainder are dropped.

edgeInclusionRate

Controls what proportion of edges in each ordering should be included; the remainder are dropped.

targetFrameSize

Sets dyadInclusionRate so that the model frame for the logistic regression has, on average, this number of observations.

Details

This function approximates the maximum likelihood solution via variational inference on the graph (y) over the latent edge variable inclusion order (s). Specifically, it replaces the conditional probability p(s | y) by p(s). If the LOLOG model contains only dyad-independent terms, then these two probabilities are identical, and variational inference is exactly maximum likelihood inference. The objective function is

E_{p(s)}\bigg(\log p(y \mid s, \theta)\bigg)

This expectation can be approximated by averaging over samples drawn from p(s). The number of samples is controlled by the nReplicates parameter. The memory required is on the order of nReplicates * (number of dyads). For large networks this can be impractical, so adjusting dyadInclusionRate and edgeInclusionRate allows one to downsample the number of dyads in each replicate. By default these are set to achieve as equal a number of edges and non-edges as possible while targeting a model frame with targetFrameSize rows.
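The trade-off between frame size and sampling rates can be sketched in a few lines of base R. The helper sketchInclusionRates below is hypothetical and is not part of lolog. Its balancing rule (spend half of each replicate's row budget on edges, and whatever remains on non-edges) is an assumption made for illustration, not the package's actual default computation.

```r
# Hypothetical helper, not part of lolog: picks inclusion rates so that the
# expected model frame has about `targetFrameSize` rows in total.
sketchInclusionRates <- function(nNodes, nEdges, nReplicates = 5L,
                                 targetFrameSize = 5e5, directed = FALSE) {
  nDyads <- if (directed) nNodes * (nNodes - 1) else nNodes * (nNodes - 1) / 2
  nNonEdges <- nDyads - nEdges
  rowsPerReplicate <- targetFrameSize / nReplicates

  # Assumed rule: aim for half of the rows to be edges, capped at rate 1.
  edgeRate <- min(1, (rowsPerReplicate / 2) / nEdges)
  # Spend the remaining row budget on non-edges, again capped at rate 1.
  remaining <- rowsPerReplicate - edgeRate * nEdges
  dyadRate <- min(1, remaining / nNonEdges)

  list(
    edgeInclusionRate = edgeRate,
    dyadInclusionRate = dyadRate,
    expectedRows = nReplicates * (edgeRate * nEdges + dyadRate * nNonEdges)
  )
}

# Undirected network with 1000 nodes and 5000 edges, 5 replicates.
rates <- sketchInclusionRates(nNodes = 1000, nEdges = 5000)
print(rates)
```

In this sparse example every edge is kept and only the non-edges are thinned, which is the usual situation for large, sparse networks.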

If the model is dyad independent, replicates are redundant, and so nReplicates is set to 1 with a note.

The functional form of the objective function is equivalent to logistic regression, and so the glm function is used to maximize it. The asymptotic covariance of the parameter estimates is calculated using the methods of Westling and McCormick (2015).
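The logistic regression equivalence is easiest to see in the dyad-independent case. The following is a minimal sketch on simulated data, using only base R: the edge indicator of each dyad is regressed on its change statistics with glm. The variable names, the group labels, and the coefficient values are all illustrative assumptions, and the sketch does not reproduce the package's internal model frame or its covariance correction.

```r
# Simulated toy data; all names and numbers here are illustrative assumptions.
set.seed(1)
nNodes <- 30
group <- sample(c("a", "b"), nNodes, replace = TRUE)

# One row per undirected dyad (i < j).
dyads <- t(combn(nNodes, 2))

# Change statistic for a homophily term: 1 if both endpoints share a group.
sameGroup <- as.integer(group[dyads[, 1]] == group[dyads[, 2]])

# Dyad-independent edge probabilities: logit(p) = -2 + 1.5 * sameGroup.
y <- rbinom(nrow(dyads), size = 1, prob = plogis(-2 + 1.5 * sameGroup))

# The intercept plays the role of an edges term, sameGroup of a node-match term.
toyFit <- glm(y ~ sameGroup, family = binomial())
print(coef(toyFit))
```

With a single binary predictor the fitted coefficients are just the log-odds of an edge among mismatched dyads and the log-odds difference for matched dyads, which makes the sketch easy to check by hand.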

Value

An object of class c('lologVariationalFit','lolog','list') consisting of the following items:

formula

The model formula

method

"variational"

theta

The fitted parameter values

vcov

The asymptotic covariance matrix for the parameter values.

nReplicates

The number of replicates

dyadInclusionRate

The rate at which non-edges are included

edgeInclusionRate

The rate at which edges are included

allDyadIndependent

Logical indicating whether the model is dyad-independent

likelihoodModel

An object of class *LatentOrderLikelihood at the fit parameters

outcome

The outcome vector for the logistic regression

predictors

The change statistic predictor matrix for the logistic regression
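As a sketch of how the theta and vcov items can be used together, Wald-type standard errors and z-statistics follow directly from them. The list below is a stand-in with made-up placeholder values, not output from a real fit, and the names on theta are an assumption. With an actual fitted object the same two fields would be used in its place.

```r
# Stand-in for a fitted object: placeholder numbers, not results from lolog.
mockFit <- list(
  theta = c(edges = -2.0, nodeMatch = 1.0),
  vcov  = matrix(c(0.04, 0, 0, 0.09), nrow = 2)
)

se <- sqrt(diag(mockFit$vcov))   # asymptotic standard errors
z  <- mockFit$theta / se         # Wald z-statistics
p  <- 2 * pnorm(-abs(z))         # two-sided p-values

print(data.frame(theta = mockFit$theta, se = se, z = z, p = p))
```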

References

Westling, T., & McCormick, T. H. (2015). Beyond prediction: A framework for inference with variational approximations in mixture models. arXiv preprint arXiv:1510.08151.

Examples

library(lolog)    # provides lologVariational and the ukFaculty data
library(network)
data(ukFaculty)

# Delete vertices missing group
delete.vertices(ukFaculty, which(is.na(ukFaculty %v% "Group")))

fit <- lologVariational(ukFaculty ~ edges() + nodeMatch("GroupC"),
                       nReplicates=1L, dyadInclusionRate=1)
summary(fit)

lolog documentation built on July 1, 2021, 9:09 a.m.