sienaBayes: A function for fitting Bayesian models

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/sienaBayes.r

Description

A function to fit hierarchical Bayesian models random effects to sienaGroup data objects. Uses the function maxlikec for the SAOM part, the Bayesian part is performed in R.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
sienaBayes(data, effects, algo,  saveFreq=100,
    initgainGlobal=0.1, initgainGroupwise = 0.02, initfgain=0.2, gamma=0.05,
    initML=FALSE, priorSigEta=NULL,
    priorMu=NULL, priorSigma=NULL, priorDf=NULL, priorKappa=NULL,
    priorRatesFromData=2,
    frequentist=FALSE, incidentalBasicRates=FALSE,
    reductionFactor=0.5, delta=1e-04,
    nprewarm=50, nwarm=50, nmain=250, nrunMHBatches=20,
    nSampVarying=1, nSampConst=1, nSampRates=0,
    nImproveMH=100, targetMHProb=0.25,
    lengthPhase1=round(nmain/5), lengthPhase3=round(nmain/5),
    storeAll=FALSE, prevAns=NULL, usePrevOnly=TRUE,
    prevBayes=NULL, newProposalFromPrev=(prevBayes$nwarm >= 1),
    silentstart = TRUE,
    nbrNodes=1, clusterType=c("PSOCK", "FORK"),
    getDocumentation=FALSE)

glueBayes(z1, z2, nwarm2=0)

Arguments

data

A sienaGroup object as returned by sienaGroupCreate. It is planned to also allow a siena data object as returned by sienaDataCreate.

effects

sienaEffects object as returned by getEffects(data). The effects indicated by its column randomEffects get effects that are varying across the groups.

algo

Algorithm object, as created by sienaAlgorithmCreate. Should contain all options required for the MCMC scheme, and a random seed if required.

saveFreq

Integer. If this is larger than 1, the provisional results are saved after each multiple of saveFreq iterations in the main phase, in a file with name PartialBayesResult.RData (if a file with this name exists, it will be overwritten). This file contains an object z of class sienaBayesFit, with the provisional results. This is to guard against crashes or power failure. It can be used as value for prevBayes as indicated below.

initgainGlobal

Step sizes in initial searches for good parameter values across the groups.

initgainGroupwise

Step sizes in initial searches for good parameter values by group; can be up to 0.1 for larger networks, 0 for very small networks.

initfgain

Positive number, used only for frequentist estimation and incidentalBasicRates: the gain factor in the Robbins Monro algorithm is initfgain during warming and Phase 1, and initfgain * ((iteration number after Phase 1)^(-gamma)) after that.

gamma

Positive number, used only for frequentist estimation: see initfgain.

initML

Boolean, whether to use maximum likelihood estimation for the groupwise initial estimation.

priorSigEta

Vector of length equal to the number of fixed parameters, or NULL. If not NULL, values that are not NA are prior variances of eta (fixed effects); these must all be positive.

priorMu

Vector of length equal to the number of randomly varying parameters, or NULL. Prior mean of mu (global population mean for varying parameters); default: 0. For the basic rate parameters the defaults are data-dependent, but if ((priorRatesFromData = 0) and priorMu=NULL), for these parameters the prior mean will be 2.

priorSigma

Square matrix of dimension equal to length of priorMu, or NULL. Prior global population covariance matrix Sigma for the varying parameters; default: identity matrix.

priorDf

Prior degrees of freedom for Sigma (global population covariance matrix); default: number of randomly varying parameters + 2.

priorKappa

Proportionality constant between prior covariance matrix and covariance matrix of prior distribution for mu; default: 1.

priorRatesFromData

-1, 0, 1, or 2. Determines the prior distribution for the rate parameters.
-1: all basic rate parameters are fixed; this must correspond to effects;
0: prior is taken from priorMu and priorSigma;
1: prior is defined by mean and covariance matrix of estimated rate parameters from initialization phase;
2: prior is defined by robust estimates for location and multivariate scale of estimated rate parameters from initialization phase.

frequentist

Currently only frequentist=FALSE works. Boolean: chooses between frequentist or Bayesian estimation of the global parameters. Frequentist estimation is possible only for at least 2 groups.

incidentalBasicRates

Boolean. If this is TRUE, the basic rate parameters are defined specifically for each group, and estimated using a Robbins Monro algorithm; if FALSE, they have a common prior for all groups.

reductionFactor

Positive number. If priorRatesFromData = 1 or 2, the prior distribution for the rate parameters is the estimated covariance matrix of the estimated rate parameters in the initialization phase multiplied by reductionFactor, plus a contribution of 0.01 for the variances. For many data sets the default will be OK. If the rate parameters diverge, as evidenced by the traceplots of the posterior values, then a smaller value is advisable, such as 0.01. Has no effect if priorRatesFromData <= 0.

delta

When the global population covariance matrix becomes non-positive definite (i.e., has one or more negative correlations) during iterations, it is changed so that all correlations are at least delta.

nprewarm

Number of iterations in the pre-warm-up phase, before improveMH.

nwarm

Number of iterations in the warm-up phase. Used only if prevBayes is NULL. Then it should be at least 5.

nmain

Number of iterations in the main phase. Should be at least 10.

nrunMHBatches

Integer: thinning ratio in MCMC process; but thinning at the lowest level is further determined by parameter mult in algo, see sienaAlgorithmCreate. Should be even.

nSampVarying

Number of samples of varying parameters for each chain sample.

nSampConst

Number of samples of constant parameters ("eta") for each chain sample.

nSampRates

Number of extra samples of basic rate parameters for each chain sample.

nImproveMH

Number of iterations per improveMH step. If nImproveMH=0, no improveMH steps at all.

targetMHProb

Desired proportion of acceptances in MH steps. This can be one number, or a vector of two numbers. In the latter case, the first number applies to the MH steps per group (for each group), the second to the MH steps for the constant parameters ("eta").

lengthPhase1

Only used for frequentist estimation or incidentalBasicRates: length of the first phase of the Robbins Monro algorithm. lengthPhase1 + lengthPhase3 should be strictly less than nmain.

lengthPhase3

Only used for frequentist estimation or incidentalBasicRates: length of the third phase of the Robbins Monro algorithm. lengthPhase1 + lengthPhase3 should be strictly less than nmain.

storeAll

Boolean: whether to store parameters for all MCMC iterations, i.e., before thinning. storeAll=TRUE may lead to producing very large objects and is not recommended for usual operation.

prevAns

An object of class "sienaFit" as returned by siena07 for the same data set and effects object. This then is the result of a multi-group estimation for these data, from which scaling information (derivative matrix and standard deviation of the deviations) can be extracted along with the parameters estimates which will be used as the initial values, unless algo requests the use of standard initial values.
If prevAns=NULL, then a multi-group estimation using siena07 will be performed as part of the initialization. If prevAns is an object as described, estimated with unconditional estimation, with 500 or more runs in Phase 3, then if usePrevOnly=TRUE this multi-group estimation will be skipped; if usePrevOnly=FALSE a multi-group estimation still will be performed, but starting at the results provided by prevAns; if usePrevOnly=TRUE but with less than 500 runs in Phase 3, then the multi-group estimation will be done only for Phase 3 (skipping Phases 1 and 2).

usePrevOnly

Boolean: see immediately above.

prevBayes

An object of class sienaBayes as returned by function sienaBayes, on which the current function will continue. For example, the object z contained in the intermediate saved result PartialBayesResult.RData as mentioned above. Initialization and warming phases are skipped.
If these are given, the values of nrunMHBatches, nSampVarying, nSampConst, and nSampRates supersede those in prevBayes. If these are given, the values of prevAns, nwarm, incidentalBasicRates, priorRatesFromData, and the prior distributions are disregarded.
A new set of nmain iterations in the main phase are done. All further relevant parameters of the function call should be identical. This is not completely checked, so errors may occur if this is not the case.

newProposalFromPrev

Boolean, has consequences only when prevBayes is given: whether to use a robust estimate of the covariance matrix of parameters in prevBayes to define a new proposal distribution. Else the proposal distribution is the same as in the prevBayes object.

silentstart

Boolean: whether to suppress most information to the console during the calculation of initial values.

nbrNodes

Number of processes to be used. Cannot be more than the number of periods summed over number of groups.

clusterType

If using multiple processes, whether to use forking processes or not. (Only "PSOCK" can be used on Windows.)

getDocumentation

Flag to allow documentation of internal functions, not for use by users.

z1

sienaBayes object.

z2

sienaBayes object with the same data and model specification as z1.

nwarm2

Number of warming iterations in z2; the first nwarm2 iterations of z2 will be left out of the combined object.

Details

The function sienaBayes is for Bayesian estimation of one group or of multiple groups all having the same number of waves and the same model specification. It wraps Bayesian sampling of parameters around calls to maxlikec. The RSiena manual has a lot of information.
Effects can be randomly varying between groups, or global (i.e., constant across groups). This is indicated by the keyword random in the call of setEffect, and reported when using the keyword includeRandoms in print.sienaEffects.
For the groupwise parameters normal distributions are assumed with conjugate priors. The conjugate prior is an inverse Wishart distribution for the covariance matrix Sigma, with parameters Lambda^{-1} and priorDf, where Lambda = priorDf*priorSigma; and, conditional on Sigma, for the expected value a multivariate normal with mean priorMu and covariance matrix Sigma/priorKappa.
For the fixed parameters, if priorSigEta=NULL, the improper constant prior is used. Else, for the components of eta (fixed effects parameter) for which priorEta is NA, the improper constant prior is used, and for other components, independent normal priors with mean 0 and variances given by priorEta.
Recommendation: use the normal prior only for effects of group-level variables. If a non-zero prior mean is desired, transform the group-level variable.
The required dimensions of the prior parameters priorSigEta, priorMu and priorSigma depend on the number of groupwise varying parameters and are given in the output for print.sienaEffects when using the keyword includeRandoms .
If priorRatesFromData is 1 or 2, the prior distribution for the basic rate parameters is determined in a data-dependent way. For the non-varying parameters, a flat prior is assumed.
The frequentist option currently is not supported (probably broken).
The procedure consists of three parts: initialization, warming, main phase.
In the initialization phase, initial parameter values and the proposal covariance matrix for Metropolis-Hastings steps for groupwise parameters are obtained from, first, Method of Moments estimation of a parameter vector assumed to be the same across the groups (in a multi-group estimation); replacing this estimate by a precision-weighted mean of this estimate and the prior mean; followed by one subphase of the Robbins-Monro algorithm for Method of Moments estimates for the groups separately, starting from the overall estimate, with step size initgain. This is skipped if initgain=0.
If the prevBayes object is supplied, this initialization phase is skipped, and if newProposalFromPrev the proposal covariance matrices are calculated from the simulated chains in this object.
From this basis, nprewarm Metropolis Hastings steps are taken; however, if no acceptances occur in two consecutive steps, the prewarming phase is ended. The proposal covariance matrices then are scaled, in the function 'improveMH', to achieve a proportion of accepted MH proposals approximately equal to target.
After initialization and scaling of the proposal covariance matrices, a warming phase is done of nwarm Bayesian proposals each with a number of MH steps, followed again by the function 'improveMH'. If the prevBayes object is supplied, the warming phase is skipped.
Finally nmain repeats of nrunMHBatches are performed of a number of MH steps sampling chains, plus nSampVarying MH steps sampling the varying parameters ('theta_j') plus nSampConst MH steps sampling the non-varying parameters ('eta') plus one Gibbs step sampling the global mean and covariance matrix of the varying parameters ('mu' and 'Sigma'). In the warming as well as the final phase, the number of MH steps within each run is determined by parameter mult ("multiplication factor") in the algorithm object algo.

The function is time-consuming. When starting to use it, it is advisable to start with low values of nmain to explore computing time.
Initialisation is an important part, and will be shorter if prevAns is given. If an earlier sienaBayes object is available for this data specification (which may differ in which effects are specified as being random), then if the name of this object is zz it will be possible to use prevAns=zz$initialResults.

When the procedure seems to diverge, and for very small groups, it is advisable to use smaller values of the parameters initgainGlobal and initgainGroupwise.

glueBayes(z1,z2) combines two existing sienaBayesFit objects with the same data and model specifications (this is not entirely checked) into one by putting z2 after z1. This is intended to be used when z2 was produced by calling sienaBayes with prevBayes=z1.

For the function print.sienaEffects, the options includeRandoms=TRUE and dropRates=TRUE are meant to be useful especially for effects objects used for sienaBayes.

Value

sienaBayes as well as glueBayes return an object of class sienaBayesFit. This is a list containing, among other things:

priorMu

prior global population mean (not quite the same as corresponding input parameter)

priorSigma

prior global population covariance matrix (not quite the same as corresponding input parameter)

priorKappa

proportionality constant between prior covariance matrix and covariance matrix of prior distribution for the mean

priorDf

prior degrees of freedom for covariance matrix

nmain

Number of iterations used in the main phase, as given in the call of sienaBayes

nwarm

Number of iterations used in the warm-up phase, as given in the call of sienaBayes; 0 if prevBayes was used. The value of nwarm is used in other functions such as print.sienaBayesFit, and when the object is used for sienaBayes as a prevBayes parameter with newProposalFromPrev=TRUE. If it was deemed that converge begins later (or earlier) than the value of nwarm, this value can be changed for further use of the object produced by sienaBayes.

effectName

array of names of effects included in the model

f$groupNames

array of names of groups included in the model

initialResults

sienaFit object: result of initial MoM estimation under the assumption of same parameters across groups

ThinParameters

array of dimensions (nwarm+nmain iterations by parameters by groups): sampled groupwise parameters

ThinPosteriorMu

array of dimensions (nwarm+nmain iterations by parameters): sampled global means of varying parameters

ThinPosteriorEta

array of dimensions (nwarm+nmain iterations by fixed parameters): sampled fixed parameters

ThinPosteriorSigma

array of dimensions (nwarm+nmain iterations by parameters by parameters): sampled global covariance matrix of varying parameters

nrunMH

vector (by groups * waves) of number of MH steps in the innermost loop of the likelihood simulations of the SAOM; this is a data-dependent multiple of the value of mult set in sienaAlgorithmCreate

ThinBayesAcceptances

matrix of number of acceptances in the nrunMHBatches MH steps in each iteration, and summed over dependent variables, for each group and (duplicated) for the vector of fixed effects

acceptances

if storeAll=TRUE: matrix of booleans: whether the corresponding change to the parameters was accepted in each step, by group

MHacceptances

if storeAll=TRUE: array of acceptances of the MH steps, by step type and group but summed over dependent variables

MHrejections

if storeAll=TRUE: array of rejections of the MH steps

MHproportions

if storeAll=TRUE: array of proportions of the MH steps accepted

Author(s)

Ruth Ripley, Johan Koskinen, Tom Snijders

References

See http://www.stats.ox.ac.uk/~snijders/siena/

Koskinen, J.H. and T.A.B. Snijders (2007). Bayesian inference for dynamic social network data. Journal of Statistical Planning and Inference, 13, 3930-3938.

See Also

siena07, sienaGroupCreate, bayesTest, extract.sienaBayes, plotPostMeansMDS
There are print and summary methods for sienaBayesFit objects, see print.sienaBayesFit.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
  Group1 <- sienaDependent(array(c(N3401, HN3401), dim=c(45, 45, 2)))
  Group3 <- sienaDependent(array(c(N3403, HN3403), dim=c(37, 37, 2)))
  Group4 <- sienaDependent(array(c(N3404, HN3404), dim=c(33, 33, 2)))
  Group6 <- sienaDependent(array(c(N3406, HN3406), dim=c(36, 36, 2)))
  dataset.1 <- sienaDataCreate(Friends = Group1)
  dataset.3 <- sienaDataCreate(Friends = Group3)
  dataset.4 <- sienaDataCreate(Friends = Group4)
  dataset.6 <- sienaDataCreate(Friends = Group6)
  FourGroups <- sienaGroupCreate(
        list(dataset.1, dataset.3, dataset.4, dataset.6))
  FourEffects <- getEffects(FourGroups)
  FourEffects <- includeEffects(FourEffects, transTrip)
  FourEffects <- setEffect(FourEffects, density, random=TRUE)
  FourEffects <- setEffect(FourEffects, recip, random=TRUE)
  print(FourEffects, includeRandoms=TRUE)
# Note this also shows the "randomEffects" column
# and the dimensions of the objects for specifying the priors.
  print(FourEffects, includeRandoms=TRUE, dropRates=TRUE)
# Note this does not show the rate effects.
  FourAlgo <- sienaAlgorithmCreate(projname = "FourGroups", maxlike=TRUE,
                                   seed=123)
## Not run: 
  bayes.model <- sienaBayes(FourAlgo, data = FourGroups,
       effects = FourEffects, nprewarm=10, nwarm=10, nmain=25, nrunMHBatches=10)
  bayes.model
  print(bayes.model, nfirst=20)
  summary(bayes.model)
  bayes.nextModel <- sienaBayes(FourAlgo, data = FourGroups,
       effects = FourEffects, nmain=15, nrunMHBatches=10,
       prevBayes = bayes.model)
  bayes.combinedModel <- glueBayes(bayes.model, bayes.nextModel)
  summary(bayes.combinedModel)

## End(Not run)

RSienaTest documentation built on July 14, 2021, 3 a.m.