ergm | R Documentation |
ergm() is used to fit exponential-family random graph models (ERGMs), in which the probability of a given network, $y$, on a set of nodes is $h(y) \exp\{\eta(\theta) \cdot g(y)\}/c(\theta)$, where $h(y)$ is the reference measure (usually $h(y)=1$), $g(y)$ is a vector of network statistics for $y$, $\eta(\theta)$ is a natural parameter vector of the same length (with $\eta(\theta)=\theta$ for most terms), and $c(\theta)$ is the normalizing constant for the distribution.
ergm() can return a maximum pseudo-likelihood estimate, an approximate maximum likelihood estimate based on a Monte Carlo scheme, or an approximate contrastive divergence estimate based on a similar scheme.
(For an overview of the package (Hunter et al. 2008b; Krivitsky et al. 2023), see ergm.)
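As a hedged illustration of the probability formula above, the following base R sketch enumerates all eight undirected graphs on three nodes and computes their probabilities directly, assuming $h(y)=1$ and $\eta(\theta)=\theta$. The names `graphs`, `g`, and `ergm_probs` are made up for this illustration and are not part of the ergm package; brute-force enumeration like this is only feasible for very small node sets.

```r
# All 2^3 = 8 undirected graphs on 3 nodes: one row per graph,
# one column per dyad (1-2, 1-3, 2-3).
graphs <- as.matrix(expand.grid(d12 = 0:1, d13 = 0:1, d23 = 0:1))

# g(y): number of edges and number of triangles of each graph.
g <- cbind(edges = rowSums(graphs),
           triangle = as.numeric(rowSums(graphs) == 3))

# Probability of every graph under theta (illustrative helper).
ergm_probs <- function(theta) {
  w <- exp(drop(g %*% theta))  # h(y) * exp{eta(theta) . g(y)} with h(y) = 1
  w / sum(w)                   # sum(w) plays the role of c(theta)
}

p <- ergm_probs(c(-1, 0.5))
cbind(g, prob = round(p, 4))
```

With `theta = c(0, 0)` every graph is equally likely, which is a quick sanity check on the normalizing constant.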
ergm(
formula,
response = NULL,
reference = ~Bernoulli,
constraints = ~.,
obs.constraints = ~. - observed,
offset.coef = NULL,
target.stats = NULL,
eval.loglik = getOption("ergm.eval.loglik"),
estimate = c("MLE", "MPLE", "CD"),
control = control.ergm(),
verbose = FALSE,
...,
basis = ergm.getnetwork(formula),
newnetwork = c("one", "all", "none")
)
is.ergm(object)
## S3 method for class 'ergm'
is.na(x)
## S3 method for class 'ergm'
anyNA(x, ...)
## S3 method for class 'ergm'
nobs(object, ...)
## S3 method for class 'ergm'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'ergm'
vcov(object, sources = c("all", "model", "estimation"), ...)
formula |
An R |
response |
Either a character string, a formula, or
|
reference |
A one-sided formula specifying
the reference measure ( |
constraints |
A formula specifying one or more constraints
on the support of the distribution of the networks being modeled. Multiple constraints
may be given, separated by “+” and “-” operators. See
The default is to have no constraints except those provided through
the Together with the model terms in the formula and the reference measure, the constraints define the distribution of networks being modeled. It is also possible to specify a proposal function directly either
by passing a string with the function's name (in which case,
arguments to the proposal should be specified through the
Note that not all possible combinations of constraints and reference measures are supported. However, for relatively simple constraints (i.e., those that simply permit or forbid specific dyads or sets of dyads from changing), arbitrary combinations should be possible. |
obs.constraints |
A one-sided formula specifying one or more
constraints or other modification in addition to those
specified by constraints. This allows the domain of the integral in the numerator of the partially observed network face-value likelihoods of Handcock and Gile (2010) and Karwa et al. (2017) to be specified explicitly. The default is to constrain the integral to only integrate over
the missing dyads (if present), after incorporating constraints
provided through the It is also possible to specify a proposal function directly by
passing a string with the function's name of the |
offset.coef |
A vector of coefficients for the offset terms. |
target.stats |
A vector of "observed network statistics,"
if these statistics are for some reason different than the
actual statistics of the network on the left-hand side of
|
eval.loglik |
Logical: For dyad-dependent models, if TRUE, use bridge
sampling to evaluate the log-likelihood associated with the
fit. Has no effect for dyad-independent models.
Since bridge sampling takes additional time, setting to FALSE may
speed performance if likelihood values (and likelihood-based
values like AIC and BIC) are not needed. Can be set globally via |
estimate |
If "MPLE," then the maximum pseudolikelihood estimator
is returned. If "MLE" (the default), then an approximate maximum likelihood
estimator is returned. For certain models, the MPLE and MLE are equivalent,
in which case this argument is ignored. (To force MCMC-based approximate
likelihood calculation even when the MLE and MPLE are the same, see the
|
control |
A list of control parameters for algorithm tuning,
typically constructed with |
verbose |
A logical or an integer to control the amount of
progress and diagnostic information to be printed. |
... |
Additional arguments, to be passed to lower-level functions. |
basis |
a value (usually a |
newnetwork |
One of |
object |
an |
x , digits |
See |
sources |
For the |
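A minimal sketch of how a few of the arguments above might be combined, assuming the ergm package and its bundled florentine data are installed. The model formula and the seed value are arbitrary choices for illustration, and the MCMC-based estimates will vary with the seed and package version.

```r
library(ergm)
data(florentine)  # provides the flomarriage network

# Maximum pseudo-likelihood estimate only (no MCMC).
fit_mple <- ergm(flomarriage ~ edges + triangle, estimate = "MPLE")

# Approximate MLE; skip bridge sampling of the log-likelihood to save time.
fit_mle <- ergm(flomarriage ~ edges + triangle,
                estimate = "MLE",
                eval.loglik = FALSE,
                control = control.ergm(seed = 1),
                verbose = FALSE)

# Compare the two sets of coefficients side by side.
rbind(MPLE = coef(fit_mple), MLE = coef(fit_mle))
```

Because `eval.loglik = FALSE`, likelihood-based summaries such as AIC and BIC are not available for `fit_mle` until the log-likelihood is evaluated separately.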
ergm() returns an object of class ergm that is a list consisting of the following elements:
coef |
The Monte Carlo maximum likelihood estimate
of |
sample |
The |
sample.obs |
As |
iterations |
The number of Newton-Raphson iterations required before convergence. |
MCMCtheta |
The value of |
loglikelihood |
The approximate change in log-likelihood in the last iteration. The value is only approximate because it is estimated based on the MCMC random sample. |
gradient |
The value of the gradient vector of the approximated loglikelihood function, evaluated at the maximizer. This vector should be very close to zero. |
covar |
Approximate covariance matrix for the MLE, based on the inverse Hessian of the approximated loglikelihood evaluated at the maximizer. |
failure |
Logical: Did the MCMC estimation fail? |
network |
Network passed on the left-hand side of |
newnetworks |
If argument |
newnetwork |
If argument |
coef.init |
The initial value of |
est.cov |
The covariance matrix of the model statistics in the final MCMC sample. |
coef.hist , steplen.hist , stats.hist , stats.obs.hist |
For the MCMLE method, the history of coefficients, Hummel step lengths, and average model statistics for each iteration. |
control |
The control list passed to the call. |
etamap |
The set of functions mapping the true parameter theta to the canonical parameter eta (irrelevant except in a curved exponential family model). |
formula |
The original |
target.stats |
The target.stats used during estimation (passed through from the Arguments) |
target.esteq |
Used for curved models to preserve the target mean values of the curved terms. It is identical to target.stats for non-curved models. |
constraints |
Constraints used during estimation (passed through from the Arguments) |
reference |
The reference measure used during estimation (passed through from the Arguments) |
estimate |
The estimation method used (passed through from the Arguments). |
offset |
A logical vector indicating which model parameters are to be set at a fixed value (i.e., not estimated). |
drop |
If
|
estimable |
A logical vector indicating which terms could not be
estimated due to a |
info |
A list with miscellaneous information that would typically be accessed by the user via methods; in general, it should not be accessed directly. Current elements include:
|
null.lik |
Log-likelihood of the null model. Valid only for unconstrained models. |
mle.lik |
The approximate log-likelihood for the MLE. The value is only approximate because it is estimated based on the MCMC random sample. |
is.na(ergm): Return TRUE if the ERGM was fit to a partially observed network and/or an observational process, such as missing (NA) dyads.
anyNA(ergm): Alias to the is.na() method.
nobs(ergm): Return the number of informative dyads of a model fit.
print(ergm): Print the call, the estimate, and the method used to obtain it.
vcov(ergm): Extract the variance-covariance matrix of parameter estimates.
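The methods listed above can be tried on a small fit. This is a sketch that assumes the ergm package and its florentine data are available; the edges-only model is chosen here only because it is dyad-independent and fits quickly.

```r
library(ergm)
data(florentine)

# A dyad-independent model, chosen only to keep the example fast.
fit <- ergm(flomarriage ~ edges)

is.ergm(fit)                   # is this an object of class ergm?
anyNA(fit)                     # was the model fit to a partially observed network?
nobs(fit)                      # number of informative dyads
print(fit, digits = 3)         # call, estimate, and estimation method
vcov(fit, sources = "model")   # variance-covariance matrix of the estimate
```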
Although each of the statistics in a given model is a summary statistic for the entire network, it is rarely necessary to calculate statistics for an entire network in a proposed Metropolis-Hastings step. Thus, for example, if the triangle term is included in the model, a census of all triangles in the observed network is never taken; instead, only the change in the number of triangles is recorded for each edge toggle.
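The idea of a change statistic can be sketched in base R. This is an illustration of the principle only, not the package's C implementation: the helper names `triangle_change` and `count_triangles` are hypothetical, and the network is a plain symmetric adjacency matrix with a zero diagonal.

```r
# Change in the triangle count if the dyad (i, j) is toggled: each
# common neighbour of i and j closes (or opens) exactly one triangle.
triangle_change <- function(adj, i, j) {
  shared <- sum(adj[i, ] * adj[j, ])
  if (adj[i, j] == 1) -shared else shared
}

# Full census, used here only to check the shortcut above.
count_triangles <- function(adj) sum(diag(adj %*% adj %*% adj)) / 6

# A 4-cycle: 1-2, 2-3, 3-4, 4-1.
adj <- matrix(0, 4, 4)
adj[cbind(c(1, 2, 3, 4), c(2, 3, 4, 1))] <- 1
adj <- adj + t(adj)

# Proposed toggle: add the edge 1-3.
delta <- triangle_change(adj, 1, 3)

toggled <- adj
toggled[1, 3] <- toggled[3, 1] <- 1
count_triangles(toggled) - count_triangles(adj) == delta
```

The shortcut touches only two rows of the adjacency matrix, whereas the census multiplies the whole matrix, which is why per-toggle change statistics matter for long Markov chains.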
In the implementation of ergm(), the model is initialized in R, then all the model information is passed to a C program that generates the sample of network statistics using MCMC. This sample is then returned to R, which then uses one of several algorithms, selected by the main.method= parameter of control.ergm(), to update the estimate.
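A short sketch of selecting the estimation algorithm through control.ergm(), assuming the ergm package and its florentine data are installed. The values given for `MCMLE.maxit` and `seed` are arbitrary illustrative choices, not recommendations.

```r
library(ergm)
data(florentine)

# Choose the updating algorithm and cap the number of MCMLE iterations.
ctrl <- control.ergm(main.method = "MCMLE",
                     MCMLE.maxit = 10,
                     seed = 42)

fit <- ergm(flomarriage ~ edges + kstar(2), control = ctrl)

# Number of iterations the fit reports having used.
fit$iterations
```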
The mechanism for proposing new networks for the MCMC sampling scheme, which is a Metropolis-Hastings algorithm, depends on two things: the constraints, which define the set of possible networks that could be proposed in a particular Markov chain step, and the weights placed on these possible steps by the proposal distribution. The former may be controlled using the constraints argument described above. The latter may be controlled using the prop.weights argument to the control.ergm() function.
The package is designed so that the user could conceivably add additional proposal types.
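To see how constraints restrict the networks that can be proposed, one option is to simulate under a constraint and inspect the sampled statistics. This sketch assumes the ergm package and its florentine data are installed; the coefficient values are arbitrary, and with `constraints = ~edges` the edge count of every draw is expected to stay at its observed value.

```r
library(ergm)
data(florentine)

# Simulate statistics from a model in which only networks with the
# observed number of edges may be proposed.
sims <- simulate(flomarriage ~ edges + triangle,
                 coef = c(-1, 0.2),
                 constraints = ~edges,
                 nsim = 5,
                 output = "stats",
                 seed = 1)

# Compare the simulated edge counts with the observed count.
observed_edges <- network.edgecount(flomarriage)
all(sims[, "edges"] == observed_edges)
```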
Admiraal R, Handcock MS (2007). networksis: Simulate bipartite graphs with fixed marginals through sequential importance sampling. Statnet Project, Seattle, WA. Version 1. https://statnet.org.
Bender-deMoll S, Morris M, Moody J (2008). Prototype Packages for Managing and Animating Longitudinal Network Data: dynamicnetwork and rSoNIA. Journal of Statistical Software, 24(7). doi:10.18637/jss.v024.i07
Butts CT (2007). sna: Tools for Social Network Analysis. R package version 2.3-2. https://cran.r-project.org/package=sna.
Butts CT (2008). network: A Package for Managing Relational Data in R. Journal of Statistical Software, 24(2). doi:10.18637/jss.v024.i02
Butts C (2015). network: The Statnet Project (https://statnet.org). R package version 1.12.0, https://cran.r-project.org/package=network.
Goodreau SM, Handcock MS, Hunter DR, Butts CT, Morris M (2008a). A statnet Tutorial. Journal of Statistical Software, 24(8). doi:10.18637/jss.v024.i08
Goodreau SM, Kitts J, Morris M (2008b). Birds of a Feather, or Friend of a Friend? Using Exponential Random Graph Models to Investigate Adolescent Social Networks. Demography, 45, in press.
Handcock MS (2003). Assessing Degeneracy in Statistical Models of Social Networks. Working Paper #39, Center for Statistics and the Social Sciences, University of Washington. https://csss.uw.edu/research/working-papers/assessing-degeneracy-statistical-models-social-networks
Handcock MS (2003b). degreenet: Models for Skewed Count Distributions Relevant to Networks. Statnet Project, Seattle, WA. Version 1.0, https://statnet.org.
Handcock MS, Gile KJ (2010). Modeling Social Networks from Sampled Data. Annals of Applied Statistics, 4(1), 5-25. doi:10.1214/08-AOAS221
Handcock MS, Hunter DR, Butts CT, Goodreau SM, Morris M (2003a). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Statnet Project, Seattle, WA. Version 2, https://statnet.org.
Handcock MS, Hunter DR, Butts CT, Goodreau SM, Morris M (2003b). statnet: Software Tools for the Statistical Modeling of Network Data. Statnet Project, Seattle, WA. Version 2, https://statnet.org.
Hunter DR, Handcock MS (2006). Inference in Curved Exponential Family Models for Networks. Journal of Computational and Graphical Statistics.
Hunter DR, Handcock MS, Butts CT, Goodreau SM, Morris M (2008b). ergm: A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks. Journal of Statistical Software, 24(3). doi:10.18637/jss.v024.i03
Karwa V, Krivitsky PN, Slavković AB (2017). Sharing Social Network Data: Differentially Private Estimation of Exponential-Family Random Graph Models. Journal of the Royal Statistical Society, Series C, 66(3), 481-500. doi:10.1111/rssc.12185
Krivitsky PN (2012). Exponential-Family Random Graph Models for Valued Networks. Electronic Journal of Statistics, 6, 1100-1128. doi:10.1214/12-EJS696
Morris M, Handcock MS, Hunter DR (2008). Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects. Journal of Statistical Software, 24(4). doi:10.18637/jss.v024.i04
Snijders TAB (2002). Markov Chain Monte Carlo Estimation of Exponential Random Graph Models. Journal of Social Structure. https://www.cmu.edu/joss/content/articles/volume3/Snijders.pdf.
network, %v%, %n%, ergmTerm, ergmMPLE, summary.ergm()
#
# load the Florentine marriage data matrix
#
data(flo)
#
# print the sociomatrix for the Florentine marriage data
# This is not yet a network object.
#
flo
#
# Create a network object out of the adjacency matrix
#
flomarriage <- network(flo,directed=FALSE)
flomarriage
#
# print out the sociomatrix for the Florentine marriage data
#
flomarriage[,]
#
# create a vector indicating the wealth of each family (in thousands of lira)
# and add it as a covariate to the network object
#
flomarriage %v% "wealth" <- c(10,36,27,146,55,44,20,8,42,103,48,49,10,48,32,3)
flomarriage
#
# create a plot of the social network
#
plot(flomarriage)
#
# now make the vertex size proportional to their wealth
#
plot(flomarriage, vertex.cex=flomarriage %v% "wealth" / 20, main="Marriage Ties")
#
# Use 'data(package = "ergm")' to list the data sets in the package
#
data(package="ergm")
#
# Load a network object of the Florentine data
#
data(florentine)
#
# Fit a model where the propensity to form ties between
# families depends on the absolute difference in wealth
#
gest <- ergm(flomarriage ~ edges + absdiff("wealth"))
summary(gest)
#
# add terms for the propensity to form 2-stars and triangles
# of families
#
gest <- ergm(flomarriage ~ kstar(1:2) + absdiff("wealth") + triangle)
summary(gest)
# import synthetic network that looks like a molecule
data(molecule)
# Add an attribute to it to mimic the atomic type
molecule %v% "atomic type" <- c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3)
#
# create a plot of the molecule network
# colored by atomic type
#
plot(molecule, vertex.col="atomic type",vertex.cex=3)
# measure tendency to match within each atomic type
gest <- ergm(molecule ~ edges + kstar(2) + triangle + nodematch("atomic type"))
summary(gest)
# compare it to differential homophily by atomic type
gest <- ergm(molecule ~ edges + kstar(2) + triangle
+ nodematch("atomic type",diff=TRUE))
summary(gest)
# Extract parameter estimates as a numeric vector:
coef(gest)
# Sources of variation in parameter estimates:
vcov(gest, sources="model")
vcov(gest, sources="estimation")
vcov(gest, sources="all") # the default