Description Usage Arguments Details Value Author(s) References See Also Examples
The function sienaGOF
assesses goodness of fit
for a model specification as represented by an estimated
sienaFit
object created by siena07
.
This is done by simulations of auxiliary
statistics, that differ from the statistics used for estimating
the parameters. The auxiliary statistics must be given explicitly.
The fit is good if the average values
of the auxiliary statistics over many simulation runs are close
to the values observed in the data.
A Monte Carlo test based on the Mahalanobis
distance is used to calculate frequentist pvalues.
Plotting functions can be used to diagnose bad fit. There are basic
functions for calculating auxiliary statistics available out of the
box, and the user is also permitted to create custom functions.
1 2 3 4 5 6 7 8 9  sienaGOF(sienaFitObject, auxiliaryFunction,
period=NULL, verbose=FALSE, join=TRUE, twoTailed=FALSE,
cluster=NULL, robust=FALSE, groupName="Data1",
varName, ...)
## S3 method for class 'sienaGOF'
plot(x, center=FALSE, scale=FALSE, violin=TRUE, key=NULL,
perc=.05, period=1, ...)
descriptives.sienaGOF(x, center=FALSE, scale=FALSE, perc=.05, key=NULL,
period=1, showAll=FALSE)

sienaFitObject 
Results of class 
auxiliaryFunction 
Function to be used to calculate
the auxiliary statistics;
this can be a userdefined function, e.g. depending on the See 
period 
Vector of period(s) to be used (may run from 1 to
number of waves  1). Has an effect only if 
verbose 
Whether to print intermediate results. This may give some peace of mind to the user because calculations can take some time. 
join 
Boolean: should sienaGOF do tests on all of the periods
individually ( 
twoTailed 
Whether to use two tails for calculating pvalues on the Monte Carlo test. Recommended for advanced users only, as it is probably only applicable in rare cases. 
cluster 
Optionally, a 
robust 
Whether to use robust estimation of the covariance matrix. 
groupName 
Name of group; relevant for multigroup data sets. 
varName 
Name of dependent variable. 
x 
Result from a call to sienaGOF. 
center 
Whether to center the statistics by median during plotting. 
scale 
Whether to scale the statistics by range during plotting. scale=TRUE makes little sense without also center=TRUE. 
violin 
Use violin plots (vs. box plots only)? 
key 
Keys in the plot for the levels of the auxiliary statistic
(as given by parameter 
perc 
1 minus confidence level for the confidence bands (two sided). 
... 
Other arguments; e.g. (1), 
showAll 
If FALSE, drops statistics with variance 0, like in the plot. 
This function is used to assess the goodness of fit of an estimated stochastic actororiented model for an arbitrarily defined multidimensional auxiliary statistic. It operates basically by comparing the observed values, at the ends of the periods, with the simulated values for the ends of the periods. The differences are assessed by combining the components of the auxiliary statistic using the Mahalanobis distance.
The function does not work properly for data sets that include a
sienaCompositionChange
object. If you wish to test the
fit for such a data set, you need (for the purpose of fit assessment only)
to replace the data set by a data set where absent actors are represented
by structural zeros, and estimate the same model for this data set with the
corresponding effects object, and use sienaGOF
for this
sienaFit
object.
To achieve comparability between simulated and observed dependent variables,
variables that are missing in the data at the start or end of a period
are replaced by 0 (for tie variables) or NA (for behavior variables).
If there are any differences between structural values at the beginning and
at the end of a period, these are dealt with as follows.
For tie variables that have a structural value at the start of the period,
this value is used to replace the observed value at the end of the period
(for the goodness of fit assessment only).
For tie variables that have a structural value at the end of the period but a
free value value at the start of the period, the reference value for the
simulated values is lacking; therefore, the simulated values at the end of
the period then are replaced by the structural value at the end of the period
(again, for the goodness of fit assessment only).
The auxiliary statistics documented in sienaGOFauxiliary
are calculated for the simulated dependent
variables in Phase 3 of the estimation algorithm,
returned in sienaFitObject
because of having used
returnDeps = TRUE
in the call to siena07
.
These statistics should be chosen to represent features of the network
that are not explicitly fit by the estimation procedure but can be
considered important properties that the model at hand should represent well.
Some examples are:
Outdegree distribution
Indegree distribution
Distribution of the dependent behavior variable (if any).
Distribution of geodesic distances
Triad census
Edgewise homophily counts
Edgewise shared partner counts
Statistics depending on the combination of network and behavioral variables.
The function is written so that the user can easily define other functions to
capture some other relevant aspects of the network, behaviors, etc.
This is further illustrated in the help page
sienaGOFauxiliary
.
We recommend the following heuristic approach to model checking:
Check convergence of the estimation.
Assess time heterogeneity by sienaTimeTest
and if there is evidence for time heterogeneity
either modify the base effects or include time dummy terms.
Assess goodness of fit (primarily using join=TRUE
) on
auxiliary statistics, and if necessary refine the model.
The print
function will display some useful information
to help with model selection if some effects are set to FIX and
TEST on the effects object. A rough estimator for the Mahalanobis
distance that would be obtained at each proposed specification
is given in the output. This can help guide model selection.
This estimator is called the modified Mahalanobis distance (MMD).
See Lospinoso (2012), the manual, or the references for more information.
The following functions are prefabricated for ease of use, and can
be passed in as the
auxiliaryFunction
with no extra effort;
see sienaGOFauxiliary
and the examples below.
IndegreeDistribution
OutdegreeDistribution
BehaviorDistribution
sienaGOF
returns a result of class sienaGOF
;
this is a list of elements of class sienaGofTest
;
if join=TRUE
, the list has length 1; if join=FALSE
,
each list element corresponds to a period analyzed;
the list elements are themselves lists again, including the
following elements:
 sienaFitName 
The name of sienaFitObject. 
 auxiliaryStatisticName 
The name of auxiliaryFunction. 
 Observations 
The observed values for the auxiliary statistics. 
 Simulations 
The simulated auxiliary statistics. 
 ObservedTestStat 
The observed Mahalanobis distance in the data. 
 SimulatedTestStat 
The Mahalanobis distance for the simulations. 
 TwoTailed 
Whether the pvalue corresponds to a one or twotailed Monte Carlo test. 
 p 
The pvalue for the observed Mahalanobis distance in the permutation distribution of the simulated Mahalanobis distances. 
 Rank 
Rank of the covariance matrix of the simulated auxiliary statistics. 
In addition there are several attributes which give, for model
specifications with fixedandtested effects, approximations
to the expected Mahalanobis distance for model specifications where
each of these effects would be added. This is reported
in the summary
method.
The plot
method makes violin plots or box plots, with superimposed
confidence bands, for the simulated distributions of all elements
of the auxiliaryFunction
,
with the observed values indicated by red dots;
but statistics with variance 0 are dropped.
descriptives.sienaGOF
returns a matrix giving numerical
information about what is plotted in the plot
method:
maximum, upper percentile, mean, median, lower percentile, minimum,
of the simulated distributions of the auxiliary statistics, and the
observed values. If center=TRUE
the median is subtracted
from mean, median, and percentiles; if scale=TRUE
these numbers are divided by (maximum  minimum).
If showAll=FALSE
, statistics with variance 0 will be dropped.
Josh Lospinoso, modifications by Ruth Ripley and Tom Snijders
See http://www.stats.ox.ac.uk/~snijders/siena/ for general information on RSiena.
Lospinoso, J.A. and Snijders, T.A.B., “Goodness of fit for Stochastic Actor Oriented Models.” Presentation given at Sunbelt XXXI, St. Petes Beach, Fl. 2011.
Lospinoso, J.A. (2012). “Statistical Models for Social Network Dynamics.” Ph.D. Thesis. University of Oxford: U.K.
siena07
, sienaGOFauxiliary
,
sienaTimeTest
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74  mynet < sienaDependent(array(c(s501, s502), dim=c(50, 50, 2)))
mybeh < sienaDependent(s50a[,1:2], type="behavior")
mydata < sienaDataCreate(mynet, mybeh)
myeff < getEffects(mydata)
myeff < includeEffects(myeff, transTrip)
myeff < setEffect(myeff, cycle3, fix=TRUE, test=TRUE)
myeff < setEffect(myeff, transTies, fix=TRUE, test=TRUE)
myalgorithm < sienaAlgorithmCreate(nsub=1, n3=50)
# Shorter phases 2 and 3, just for example.
ans < siena07(myalgorithm, data=mydata, effects=myeff, batch=TRUE, returnDeps=TRUE)
gofi < sienaGOF(ans, IndegreeDistribution, verbose=TRUE, join=TRUE,
varName="mynet")
summary(gofi)
plot(gofi)
## Not run:
# Illustration just for showing a case with two dependent networks;
# running time backwards is not meaningful!
mynet1 < sienaDependent(array(c(s501, s502), dim=c(50, 50, 2)))
mynet2 < sienaDependent(array(c(s503, s501), dim=c(50, 50, 2)))
mybeh < sienaDependent(s50a[,1:2], type="behavior")
mydata < sienaDataCreate(mynet1, mynet2, mybeh)
myeff < getEffects(mydata)
myeff < includeEffects(myeff, transTrip)
myeff < includeEffects(myeff, recip, name="mynet2")
myeff < setEffect(myeff, cycle3, fix=TRUE, test=TRUE)
myeff < setEffect(myeff, transTies, fix=TRUE, test=TRUE)
myalgorithm < sienaAlgorithmCreate(nsub=1, n3=50)
# Shorter phases 2 and 3, just for example.
ans < siena07(myalgorithm, data=mydata, effects=myeff, batch=TRUE, returnDeps=TRUE)
gofi < sienaGOF(ans, IndegreeDistribution, verbose=TRUE, join=TRUE,
varName="mynet1")
summary(gofi)
plot(gofi)
## End(Not run)
## Not run:
(gofi.nc < sienaGOF(ans, IndegreeDistribution, cumulative=FALSE,
varName="mynet1"))
# cumulative is an example of "...".
plot(gofi.nc)
descriptives.sienaGOF(gofi.nc)
(gofi2 < sienaGOF(ans, IndegreeDistribution, varName="mynet2"))
plot(gofi2)
(gofb < sienaGOF(ans, BehaviorDistribution, varName = "mybeh"))
plot(gofb)
(gofo < sienaGOF(ans, OutdegreeDistribution, varName="mynet1",
levls=0:6, cumulative=FALSE))
# levls is another example of "...".
plot(gofo)
## End(Not run)
## A demonstration of using multiple processes
## Not run:
library(parallel)
(n.clus < detectCores()  1) # subtract 1 to keep time for other processes
myalgorithm.c < sienaAlgorithmCreate(nsub=4, n3=1000, seed=12345)
(ans.c < siena07(myalgorithm.c, data=mydata, effects=myeff, batch=TRUE,
returnDeps=TRUE, useCluster=TRUE, nbrNodes=n.clus))
gofi.1 < sienaGOF(ans.c, IndegreeDistribution, verbose=TRUE,
varName="mynet1")
cl < makeCluster(n.clus)
gofi.cl < sienaGOF(ans.c, IndegreeDistribution, varName="mynet1",
cluster=cl)
# compare simulation times
attr(gofi.1,"simTime")
attr(gofi.cl,"simTime")
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.