predict.mcdraws: Generate draws from the predictive distribution

View source: R/prediction.R

predict.mcdraws    R Documentation

Description

Generate draws from the predictive distribution

Usage

## S3 method for class 'mcdraws'
predict(
  object,
  newdata = NULL,
  X. = if (is.null(newdata)) "in-sample" else NULL,
  type = c("data", "link", "response", "data_cat"),
  var = NULL,
  ny = NULL,
  ry = NULL,
  fun. = identity,
  labels = NULL,
  ppcheck = FALSE,
  iters = NULL,
  to.file = FALSE,
  filename,
  write.single.prec = FALSE,
  show.progress = TRUE,
  verbose = TRUE,
  n.cores = 1L,
  cl = NULL,
  seed = NULL,
  export = NULL,
  ...
)

Arguments

object

an object of class mcdraws, as output by MCMCsim.

newdata

data frame with auxiliary information to be used for prediction.

X.

a list of design matrices, or alternatively one of the strings 'in-sample' or 'linpred'. If 'in-sample' (the default if newdata is not supplied), the design matrices for in-sample prediction are used. If 'linpred', the 'linpred_' component of object is used.

type

the type of predictions. The default is "data", meaning that new data are generated according to the predictive distribution. If type="link", only the linear predictor for the mean is generated, and if type="response", the linear predictor is transformed to the response scale. For Gaussian models type="link" and type="response" are equivalent. For binomial and negative binomial models type="response" returns simulations of the latent probabilities. For multinomial models type="link" generates the linear predictor for all categories except the last, type="response" transforms this vector to the probability scale, and type="data" generates the multinomial data. All three are returned in long vector format, in which the output for all categories (except the last) is stacked. For multinomial models with single trials, a further option is type="data_cat", which generates the data as a categorical vector with integer-coded levels.
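
As a rough illustration of how the "link", "response" and "data" scales relate for a binomial model, the sketch below uses base R only. It assumes a logistic link, does not call mcmcsae, and the names eta, p and y_new are hypothetical.

```r
set.seed(1)

# Hypothetical draws of the linear predictor: 4 draws (rows) by 3 units (columns)
eta <- matrix(rnorm(12), nrow = 4)

# Scale of type = "link": the linear predictor itself (eta)
# Scale of type = "response": latent probabilities, here via the inverse logit
# (the logistic link is an assumption of this sketch)
p <- plogis(eta)

# Scale of type = "data": new observations generated given the probabilities,
# with ny trials per unit
ny <- 10
y_new <- matrix(rbinom(length(p), size = ny, prob = p), nrow = nrow(p))
```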

var

variance(s) used for out-of-sample prediction. By default 1.

ny

number of trials used for out-of-sample prediction in case of a binomial model. By default 1.

ry

fixed part of the (reciprocal) dispersion parameter in case of a negative binomial model.

fun.

function applied to the vector of posterior predictions to compute one or multiple summaries or test statistics. The function can have one or two arguments. The first argument is always the vector of posterior predictions. The optional second argument represents a list of model parameters, needed only when a test statistic depends on them. The function must return an integer or numeric vector.
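
A minimal sketch of a one-argument function that could be supplied as fun.; the function name summ and the stand-in vector y_draw are hypothetical, and the commented predict call assumes an mcdraws object sim such as the one created in the Examples.

```r
# Hypothetical summary function: receives the vector of predictions for one
# draw and returns a numeric vector of fixed length
summ <- function(y) c(mean = mean(y), sd = sd(y), max = max(y))

# Stand-in for a single draw of predictions
y_draw <- c(1.2, 0.7, 2.5, 1.9)
summ(y_draw)

# With a fitted object it might be passed as, for example:
# pred <- predict(sim, fun. = summ, labels = c("mean", "sd", "max"))
```

Because the function returns three values, a labels vector supplied alongside it would need length three as well.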

labels

optional names for the output object. Must be a vector of the same length as the result of fun..

ppcheck

if TRUE, function fun. is also applied to the observed data and an MCMC approximation is computed of the posterior predictive probability that the test statistic for predicted data is greater than the test statistic for the observed data.
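
The quantity described here can be sketched in base R as the share of predictive draws whose test statistic exceeds the observed one. This is only an illustration of the definition with simulated stand-in data, not the package's internal code; all object names are hypothetical.

```r
set.seed(1)

# Stand-in for the observed data
y_obs <- rnorm(50)

# Stand-in for predictive draws: 200 draws (rows) of 50 replicated observations
y_rep <- matrix(rnorm(200 * 50), nrow = 200)

# Test statistic, playing the role of fun.
stat <- function(y) max(y)

t_rep <- apply(y_rep, 1, stat)  # statistic for each predicted data set
t_obs <- stat(y_obs)            # statistic for the observed data

# MCMC approximation of P(T(predicted) > T(observed))
ppp <- mean(t_rep > t_obs)
```

Values of ppp close to 0 or 1 would suggest that the model does not reproduce this aspect of the observed data well.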

iters

iterations in object to use for prediction. Default NULL means that all draws from object are used.

to.file

if TRUE the predictions are streamed to file.

filename

name of the file to write predictions to in case to.file=TRUE.

write.single.prec

whether to write to file in single precision. Default is FALSE.

show.progress

whether to show a progress bar.

verbose

whether to show informative messages.

n.cores

the number of CPU cores to use. Default is one, i.e. no parallel computation. If an existing cluster cl is provided, n.cores will be set to the number of workers in that cluster.

cl

an existing cluster can be passed for parallel computation. If NULL and n.cores > 1, a new cluster is created.

seed

a random seed (integer). For parallel computation it is used to independently seed RNG streams for all workers.

export

a character vector with names of objects to export to the workers. This may be needed for parallel execution if expressions in fun. depend on global variables.
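
To see why exporting may be needed, the following sketch uses the base parallel package directly. It only illustrates the general mechanism of clusters, exported globals and seeded RNG streams, and is not a description of how predict sets up its workers; the names threshold and exceed are hypothetical.

```r
library(parallel)

# Global variable used inside the function below
threshold <- 1.5
exceed <- function(y) mean(y > threshold)

cl <- makeCluster(2)  # two local workers

# Copy 'threshold' to the workers; without this the workers cannot find it
clusterExport(cl, "threshold", envir = environment())

# Give each worker its own independent RNG stream, derived from one seed
clusterSetRNGStream(cl, iseed = 123)

# Each task simulates data on a worker and applies the function to it
res <- parLapply(cl, 1:4, function(i, f) f(rnorm(100)), f = exceed)

stopCluster(cl)
```

In the same spirit, a function passed as fun. that refers to a global object such as threshold would presumably need export = "threshold" when n.cores > 1.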

...

currently not used.

Value

An object of class dc, containing draws from the posterior (or prior) predictive distribution. If ppcheck=TRUE posterior predictive p-values are returned as an additional attribute. In case to.file=TRUE the file name used is returned.

Examples


n <- 250
dat <- data.frame(x=runif(n))
dat$y <- 1 + dat$x + rnorm(n)
sampler <- create_sampler(y ~ x, data=dat)
sim <- MCMCsim(sampler)
summary(sim)
# in-sample prediction
pred <- predict(sim, ppcheck=TRUE)
hist(attr(pred, "ppp"))
# out-of-sample prediction
pred <- predict(sim, newdata=data.frame(x=seq(0, 1, by=0.1)))
summary(pred)



mcmcsae documentation built on Oct. 11, 2023, 1:06 a.m.