Description Usage Arguments Details Value Author(s) References See Also Examples
Inputs functions to generate and analyze data. Compares output from these functions to test that the model-fitting software works correctly.
1 2 3 |
generate.param |
Function for generating parameters from prior distribution Should output a vector of parameters. Function should look like generate.param <- function() {...} or generate.param <- function(generate.param.inputs) {...} |
generate.param.inputs |
Inputs to the function generate.param |
generate.data |
Function for generating data given parameters. Should take as input the output from generate.param. Should output data matrix to be analyzed. Function should look like generate.data <- function(theta.true) {...} or generate.data <- function(theta.true, generate.data.inputs) {...} |
generate.data.inputs |
Inputs to the function generate.data (in addition to theta.true) |
analyze.data |
Function for generating sample from posterior distribution. Should take as input the output from generate.data and generate.param. Should output a matrix of parameters, each row is one parameter draw. Function should look like analyze.data <- function(data.rep,theta.true) {...} or analyze.data <- function(data.rep,theta.true, analyze.data.inputs) {...}. ! This is the software being tested !! |
analyze.data.inputs |
Inputs to the function analyze.data (in addition to data.rep and theta.true) |
n.rep |
Number of replications to be performed, default is 20. |
n.batch |
Lengths of parameter batches. A parameter batch might consist of, for example, all person-level means in a hierarchical model or any group of parameters that is sampled in a loop. Must sum to n.param (length of parameter vector) and correspond to the order of parameters as they are output from analyze.data. For example, if there are 5 total parameters with the first two in one batch and the next three in another batch, use n.batch=c(2,3) |
params.batch |
Names of parameter batches, used as the y axis in the output plot. Must have length equal to the number of batches. Can consist of text (e.g., params.batch=c("alpha","beta")) or an expression (e.g., params.batch=expression(alpha,beta)). Not used if n.batch not provided. |
print.reps |
Indcator of whether or not to print the replication number, default is FALSE |
Validate
tests whether
software developed to fit a specific Bayesian model works properly,
capitalizing on properties of Bayesian posterior distributions. The
validation method involves repeatedly generating parameters and data
from the model to be fit and then fitting the same model to these
simulated data (i.e., generating a sample from the posterior
distribution). For all scalar parameters, the quantile of the "true"
parameter value with respect to its posterior distribution should
follow a uniform distribution if the software is written correctly.
Testing that the software works amounts to testing that these
quantiles are uniformly distributed. For each scalar parameter, the
function gives a p-value for a test that its quantiles are uniformly distributed.
p.vals |
p-values for "unbatched" parameters. Null if all batches consist of only one parameter. |
p.batch |
p-values for the means of batched parameters. Null if n.batch not provided |
adj.min.p |
The smallest of p.batch (or, if p.batch=NULL, the smallest of p.vals), with the Bonferroni correction for multiple comparisons applied. |
Samantha Cook cook@stat.columbia.edu
http://www.stat.columbia.edu/~cook/Cook_Software_Validation.pdf
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 | set.seed(314)
##functions for generating parameters mu, sigma^2 from their prior distribution
rinvchisq <- function(n,v,s){
alpha <- v/2
beta <- alpha*s;
draws <- 1/rgamma(n,alpha,beta)
return(draws)}
generate.param <- function(hyper){
mu.0 <- hyper[1]
kappa.0 <- hyper[2]
nu.0 <- hyper[3]
sigsq.0 <- hyper[4]
sigsq <- rinvchisq(1, nu.0, sigsq.0)
mu <- rnorm(1, mu.0, sqrt(sigsq/kappa.0))
return(c(sigsq,mu))}
##generate normal data with mean mu, variance sigma^2, sample size n
generate.data <- function(params,n){
y <- rnorm(n,params[2],sqrt(params[1]))
return(y)}
##generate from the posterior distribution of mu, sigma^2
analyze.data <- function(y,params.true,inputs){
n <- length(y)
mu.0 <- inputs[1]
kappa.0 <- inputs[2]
nu.0 <- inputs[3]
sigsq.0 <- inputs[4]
n.draws <- inputs[5]
kappa.n <- kappa.0 + n
mu.n <- (mu.0*kappa.0 + sum(y))/kappa.n
nu.n <- nu.0 + n
sigsq.n <- ( nu.0*sigsq.0 + (n-1)*var(y) +
kappa.0*n*(mean(y) - mu.0)^2/kappa.n ) / nu.n
sigsq.post <- rinvchisq(n.draws, nu.n, sigsq.n)
var.mu.post <- sigsq.post/(kappa.n)
mu.post <- rnorm(n.draws, mu.n, sqrt(var.mu.post))
return(cbind(sigsq.post,mu.post))}
##generate from the posterior distribution of mu, sigma^2
##error sampling sigma^2
analyze.data.error1 <- function(y,params.true,inputs){
n <- length(y)
mu.0 <- inputs[1]
kappa.0 <- inputs[2]
nu.0 <- inputs[3]
sigsq.0 <- inputs[4]
n.draws <- inputs[5]
kappa.n <- kappa.0 + n
mu.n <- (mu.0*kappa.0 + sum(y))/kappa.n
nu.n <- nu.0 + n
sigsq.n <- ( nu.0*sigsq.0 + (n-1)*var(y) +
kappa.0*n*(mean(y) - mu.0)^2/kappa.n ) / nu.n
sigsq.post <- rinvchisq(n.draws, nu.n, sigsq.0)
var.mu.post <- sigsq.post/(kappa.n)
mu.post <- rnorm(n.draws, mu.n, sqrt(var.mu.post))
return(cbind(sigsq.post,mu.post))}
##generate from the posterior distribution of mu, sigma^2
##error sampling mu
analyze.data.error2 <- function(y,params.true,inputs){
n <- length(y)
mu.0 <- inputs[1]
kappa.0 <- inputs[2]
nu.0 <- inputs[3]
sigsq.0 <- inputs[4]
n.draws <- inputs[5]
kappa.n <- kappa.0 + n
mu.n <- (mu.0*kappa.0 + sum(y))/kappa.n
nu.n <- nu.0 + n
sigsq.n <- ( nu.0*sigsq.0 + (n-1)*var(y) +
kappa.0*n*(mean(y) - mu.0)^2/kappa.n ) / nu.n
sigsq.post <- rinvchisq(n.draws, nu.n, sigsq.n)
var.mu.post <- sigsq.post/(kappa.n)
mu.post <- rnorm(n.draws, mu.n, var.mu.post)
return(cbind(sigsq.post,mu.post))}
##function inputs
hyper<-c(6,5,5,7)
n<-20
n.draws<-5000
generate.param.inputs<-hyper
generate.data.inputs<-n
analyze.data.inputs<-c(hyper,n.draws)
##run validation function for the three model-fitting functions
tst.0 <- validate(generate.param = generate.param, generate.param.inputs =
generate.param.inputs, generate.data = generate.data,
generate.data.inputs = generate.data.inputs, analyze.data =
analyze.data, analyze.data.inputs = analyze.data.inputs,
n.rep = 20, params.batch = expression(sigma^2,mu), n.batch = c(1,1))
tst.1 <- validate(generate.param = generate.param, generate.param.inputs =
generate.param.inputs, generate.data = generate.data,
generate.data.inputs = generate.data.inputs, analyze.data =
analyze.data.error1, analyze.data.inputs = analyze.data.inputs,
n.rep = 20, params.batch = expression(sigma^2,mu), n.batch = c(1,1))
tst.2 <- validate(generate.param = generate.param, generate.param.inputs =
generate.param.inputs, generate.data = generate.data,
generate.data.inputs = generate.data.inputs, analyze.data =
analyze.data.error2, analyze.data.inputs = analyze.data.inputs,
n.rep = 20, params.batch = expression(sigma^2,mu), n.batch = c(1,1))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.