```{css, echo=FALSE} pre { font-size: 85% }
```r library(knitr) library(rmarkdown) options(rmarkdown.html_vignette.check_title = FALSE) opts_chunk$set(message=FALSE, echo=TRUE, cache=TRUE, autodep=TRUE, concordance=TRUE, error=FALSE, fig.width=7, fig.height=7) options(width=160)
library(ergm.multi) library(dplyr) library(purrr) library(tibble) library(ggplot2)
The list of networks studied by @GoSa18h is included in this package:
data(Goeyvaerts) length(Goeyvaerts)
An explanation of the networks, including a list of their
network (%n%
) and vertex (%v%
) attributes, can be obtained via ?Goeyvaerts
.
A total of r length(Goeyvaerts)
complete networks were collected, then two were excluded due to "nonstandard" family composition:
Goeyvaerts %>% discard(`%n%`, "included") %>% map(as_tibble, unit="vertices")
To reproduce the analysis, exclude them as well:
G <- Goeyvaerts %>% keep(`%n%`, "included")
Obtain weekday indicator, network size, and density for each network, and summarize them as in @GoSa18h Table 1:
G %>% map(~list(weekday = . %n% "weekday", n = network.size(.), d = network.density(.))) %>% bind_rows() %>% group_by(weekday, n = cut(n, c(1,2,3,4,5,9))) %>% summarize(nnets = n(), p1 = mean(d==1), m = mean(d)) %>% kable()
We now reproduce the ERGM fits. First, we extract the weekday networks:
G.wd <- G %>% keep(`%n%`, "weekday") length(G.wd)
Next, we specify the multi-network model using the N(formula, lm)
operator. This operator will evaluate the ergm
formula formula
on each network, weighted by the predictors passed in the one-sided lm
formula, which is interpreted the same way as that passed to the built-in lm()
function, with its "data
" being the table of network attributes.
Since different networks may have different compositions, to have a consistent model, we specify a consistent list of family roles.
roleset <- sort(unique(unlist(lapply(G.wd, `%v%`, "role"))))
We now construct the formula object, which will be passed directly to ergm()
:
# Networks() function tells ergm() to model these networks jointly. f.wd <- Networks(G.wd) ~ # This N() operator adds three edge counts: N(~edges, ~ # one total for all networks (intercept implicit as in lm), I(n<=3)+ # one total for only small households, and I(n>=5) # one total for only large households. ) + # This N() construct evaluates each of its terms on each network, # then sums each statistic over the networks: N( # First, mixing statistics among household roles, including only # father-mother, father-child, and mother-child counts. # Since tail < head in an undirected network, in the # levels2 specification, it is important that tail levels (rows) # come before head levels (columns). In this case, since # "Child" < "Father" < "Mother" in alphabetical order, the # row= and col= categories must be sorted accordingly. ~mm("role", levels = I(roleset), levels2=~.%in%list(list(row="Father",col="Mother"), list(row="Child",col="Father"), list(row="Child",col="Mother"))) + # Second, the nodal covariate effect of age, but only for # edges between children. F(~nodecov("age"), ~nodematch("role", levels=I("Child"))) + # Third, 2-stars. kstar(2) ) + # This N() adds one triangle count, totalled over all households # with at least 6 members. N(~triangles, ~I(n>=6))
See ergmTerm?mm
for documentation on the mm
term used above.
Now, we can fit the model:
# (Set seed for predictable run time.) fit.wd <- ergm(f.wd, control=snctrl(seed=123))
summary(fit.wd)
Similarly, we can extract the weekend network, and fit it to a smaller model. We only need one N()
operator, since all statistics are applied to the same set
of networks, namely, all of them.
G.we <- G %>% discard(`%n%`, "weekday") fit.we <- ergm(Networks(G.we) ~ N(~edges + mm("role", levels=I(roleset), levels2=~.%in%list(list(row="Father",col="Mother"), list(row="Child",col="Father"), list(row="Child",col="Mother"))) + F(~nodecov("age"), ~nodematch("role", levels=I("Child"))) + kstar(2) + triangles), control=snctrl(seed=123))
summary(fit.we)
Perform diagnostic simulation [@KrCo22t], summarize the residuals, and make residuals vs. fitted and scale-location plots:
gof.wd <- gofN(fit.wd, GOF = ~ edges + kstar(2) + triangles) summary(gof.wd)
Variances of Pearson residuals substantially greater than 1 suggest unaccounted-for heterogeneity.
autoplot(gof.wd)
The plots don't look unreasonable.
Also make plots of residuals vs. square root of fitted and vs. network size:
autoplot(gof.wd, against=sqrt(.fitted)) autoplot(gof.wd, against=ordered(n))
It looks like network-size effects are probably accounted for.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.