predict.vb: Posterior Predictive Checks
In LaplacesDemon: Complete Environment for Bayesian Inference


This method may be used to predict either new, unobserved instances of y (called y[new]) or replicates of y (called y[rep]), and then to perform posterior predictive checks. Either y[new] or y[rep] is predicted given an object of class vb, the model specification, and data. This function requires that posterior samples were produced with VariationalBayes.

## S3 method for class 'vb'
predict(object, Model, Data, CPUs=1, Type="PSOCK", ...)

`object`	An object of class `vb` is required.
`Model`	The model specification function is required.
`Data`	A data set in a list is required. The dependent variable is required to be named either `y` or `Y`.
`CPUs`	This argument accepts an integer that specifies the number of central processing units (CPUs) of the multicore computer or computer cluster. This argument defaults to `CPUs=1`, in which parallel processing does not occur.
`Type`	This argument specifies the type of parallel processing to perform, accepting either `Type="PSOCK"` or `Type="MPI"`.
`...`	Additional arguments are unused.

Since Variational Bayes characterizes marginal posterior distributions with modes and variances, and posterior predictive checks involve samples, the predict.vb function requires the use of independent samples of the marginal posterior distributions, provided by VariationalBayes when sir=TRUE.

The samples of the marginal posterior distributions of the parameters are passed, along with the data, to the Model specification and used to draw samples of the deviance and of the monitored variables. At the same time, the fourth component of the list returned by the Model function, which is labeled yhat, is a vector of expectations of y, given the samples, model specification, and data. To predict y[rep], simply supply the data set used to estimate the model. To predict y[new], supply a new data set instead (though for some model specifications, this cannot be done, and y[new] must be specified in the Model function). If the new data set does not have y, then create y in the list and set it equal to something sensible, such as mean(y) from the original data set.
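The flow described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the linear-regression `Model`, the data list `MyData`, and the matrix `post` of simulated stand-in posterior samples are all hypothetical, and no LaplacesDemon function is called.

```r
# Hypothetical Model specification; its fourth returned component is yhat.
set.seed(1)
N <- 20
X <- cbind(1, rnorm(N))
MyData <- list(X = X, y = as.vector(X %*% c(1, 2) + rnorm(N, sd = 0.5)))

Model <- function(parm, Data) {
  beta  <- parm[1:2]
  sigma <- exp(parm[3])                          # third parameter is log(sigma)
  mu    <- as.vector(Data$X %*% beta)            # expectation of y
  LL    <- sum(dnorm(Data$y, mu, sigma, log = TRUE))
  LP    <- LL + sum(dnorm(beta, 0, 10, log = TRUE))  # unnormalized log-posterior
  list(LP = LP, Dev = -2 * LL, Monitor = c(sigma = sigma),
       yhat = mu, parm = parm)
}

# Simulated stand-in for S independent posterior samples (one row per sample).
S <- 100
post <- cbind(rnorm(S, 1, 0.1), rnorm(S, 2, 0.1), rnorm(S, log(0.5), 0.1))

# Pass each sample, with the data, through the Model specification.
out <- lapply(seq_len(S), function(s) Model(post[s, ], MyData))

yhat     <- sapply(out, function(o) o$yhat)                       # N x S matrix
Deviance <- sapply(out, function(o) o$Dev)                        # length S
monitor  <- matrix(sapply(out, function(o) o$Monitor), nrow = 1)  # 1 x S here
```

Supplying the estimation data, as here, corresponds to predicting y[rep]; passing a different data list to the same loop would correspond to y[new].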

The variable y must be a vector. If it is instead a matrix Y, then it will be converted to a vector y. The vectorized length of y or Y must be equal to the vectorized length of yhat, the fourth component of the list returned by the Model function.
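A minimal sketch of the length requirement, and of the placeholder y suggested for new data sets, follows. The objects `Y`, `yhat_example`, `y_orig`, `X_new`, and `NewData` are hypothetical, and the column-major conversion via `as.vector` is an assumption about how a matrix would be vectorized.

```r
# Hypothetical 3 x 2 dependent variable stored as a matrix Y.
Y <- matrix(c(1.2, 0.7, 3.1, 2.2, 0.4, 1.9), nrow = 3, ncol = 2)
y <- as.vector(Y)                 # column-major vectorization (assumed)

# Stand-in for the yhat component returned by a Model function.
yhat_example <- rep(mean(y), length(Y))

# The vectorized lengths must agree.
same_length <- length(y) == length(as.vector(yhat_example))

# Placeholder y for a new data set that has predictors but no outcomes.
y_orig  <- c(2.0, 3.5, 1.5, 4.0)  # outcomes from the original data set
X_new   <- cbind(1, c(0.1, 0.2, 0.3))
NewData <- list(X = X_new, y = rep(mean(y_orig), nrow(X_new)))
```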

Parallel processing may be performed when the user specifies CPUs to be greater than one, implying that the specified number of CPUs exists and is available. Parallelization may be performed on a multicore computer or a computer cluster. Either a Simple Network of Workstations (SNOW) or the Message Passing Interface (MPI) is used. With small data sets and few samples, parallel processing may be slower, due to computer network communication. With larger data sets and more samples, the user should experience a faster run-time.
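The general pattern of a socket-based cluster can be sketched with R's base `parallel` package. This is an illustration only and makes no claim about how predict.vb distributes its work: the sample matrix `draws` and the per-sample function `f` are hypothetical stand-ins, and two local workers are assumed to be available.

```r
library(parallel)

set.seed(2)
draws <- matrix(rnorm(200), nrow = 100, ncol = 2)  # stand-in: 100 samples
f <- function(p) sum(p^2)                          # stand-in per-sample work

# Start two socket (PSOCK) workers, apply f to each row, then shut down.
cl <- makeCluster(2, type = "PSOCK")
res_par <- parApply(cl, draws, 1, f)
stopCluster(cl)

# The sequential equivalent gives the same values.
res_seq <- apply(draws, 1, f)
```

With work this small, starting the workers and shipping the data usually costs more than the computation itself, which is the trade-off the paragraph above describes.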

For more information on posterior predictive checks, see https://web.archive.org/web/20150215050702/http://www.bayesian-inference.com/posteriorpredictivechecks.

This function returns an object of class vb.ppc (where “ppc” stands for posterior predictive checks). The returned object is a list with the following components:

`y`	This stores y, the dependent variable.
`yhat`	This is an N x S matrix, where N is the number of records of y and S is the number of posterior samples.
`Deviance`	This is a vector of length S, where S is the number of independent posterior samples. Samples are obtained with the sampling importance resampling algorithm, `SIR`.
`monitor`	This is an N x S matrix, where N is here the number of monitored variables and S is the number of independent posterior samples. Samples are obtained with the sampling importance resampling algorithm, `SIR`.
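Given components shaped like `y` and `yhat`, simple posterior predictive checks can be computed directly. The sketch below uses simulated stand-ins for both (not output from predict.vb), and the two summaries shown, interval coverage and a per-record predictive tail proportion, are illustrative choices rather than the package's own discrepancy measures.

```r
set.seed(42)
N <- 50
S <- 200
y    <- rnorm(N, 10, 2)                                   # simulated y
yhat <- matrix(rnorm(N * S, 10, 2), nrow = N, ncol = S)   # simulated N x S

# 95% predictive interval for each record, taken across the S samples.
lower <- apply(yhat, 1, quantile, probs = 0.025)
upper <- apply(yhat, 1, quantile, probs = 0.975)

# Proportion of records whose observed y falls inside its interval.
coverage <- mean(y >= lower & y <= upper)

# Per-record proportion of predicted values at or above the observed y;
# values near 0 or 1 flag records the model predicts poorly.
p_rec <- rowMeans(yhat >= y)
```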