Cross validation for Approximate Bayesian Computation (ABC)

Description

This function performs a leave-one-out cross validation for ABC via subsequent calls to the function abc. A potential use of this function is to evaluate the effect of the choice of the tolerance rate on the quality of the estimation with ABC.

Usage

1
2
3
4
cv4abc(param, sumstat, abc.out = NULL, nval, tols, statistic = "median",
prior.range = NULL, method, hcorr = TRUE, transf = "none", logit.bounds
= c(0,0), subset = NULL, kernel = "epanechnikov", numnet = 10, sizenet =
5, lambda = c(0.0001,0.001,0.01), trace = FALSE, maxit = 500, ...)

Arguments

param

a vector, matrix or data frame of the simulated parameter values.

sumstat

a vector, matrix or data frame of the simulated summary statistics.

abc.out

an object of class "abc", optional. If supplied, all arguments passed to abc are extracted from this object, except for sumstat, param, and tol, which always have to be supplied as arguments.

nval

size of the cross-validation sample.

tols

a single tolerance rate or a vector of tolerance rates.

statistic

a character string specifying the statistic to calculate a point estimate from the posterior distribution of the parameter(s). Possible values are "median" (default), "mean", or "mode".

prior.range

a range to truncate the prior range.

method

a character string indicating the type of ABC algorithm to be applied. Possible values are "rejection", "loclinear", and "neuralnet". See also abc.

hcorr

logical, if TRUE (default) the conditional heteroscedastic model is applied.

transf

a vector of character strings indicating the kind of transformation to be applied to the parameter values. The possible values are "log", "logit", and "none" (default), when no is transformation applied. See also abc.

logit.bounds

a vector of bounds if transf is "logit". These bounds are applied to all parameters that are to be logit transformed.

subset

a logical expression indicating elements or rows to keep. Missing values in param and/or sumstat are taken as FALSE.

kernel

a character string specifying the kernel to be used when method is "loclinear" or "neuralnet". Defaults to "epanechnikov". See density for details.

numnet

the number of neural networks when method is "neuralnet". Defaults to 10. It indicates the number of times the function nnet is called.

sizenet

the number of units in the hidden layer. Defaults to 5. Can be zero if there are no skip-layer units. See nnet for more details.

lambda

a numeric vector or a single value indicating the weight decay when method is "neuralnet". See nnet for more details. By default, 0.0001, 0.001, or 0.01 is randomly chosen for each of the networks.

trace

logical, TRUE switches on tracing the optimization of nnet. Applies only when method is "neuralnet".

maxit

numeric, the maximum number of iterations. Defaults to 500. Applies only when method is "neuralnet". See also nnet.

...

other arguments passed to nnet.

Details

A simulation is selected repeatedly to be a validation simulation, while the other simulations are used as training simulations. Each time the function abc is called to estimate the parameter(s). A total of nval validation simulations are selected.

The arguments of the function abc can be supplied in two ways. First, simply give them as arguments when calling this function, in which case abc.out can be NULL. Second, via an existing object of class "abc", here abc.out. WARNING: when abc.out is supplied, the same sumstat and param objects have to be used as in the original call to abc. Column names of sumstat and param are checked for match.

See summary.cv4abc for calculating the prediction error from an object of class "cv4abc".

Value

An object of class "cv4abc", which is a list with the following elements

call

The original calls to abc for each tolerance rates.

cvsamples

Numeric vector of length nval, indicating which rows of the param and sumstat matrices were used as validation values.

tols

The tolerance rates.

true

The parameter values that served as validation values.

estim

The estimated parameter values.

names

A list with two elements: parameter.names and statistics.names. Both contain a vector of character strings with the parameter and statistics names, respectively.

seed

The value of .Random.seed when cv4abc is called.

See Also

abc, plot.cv4abc, summary.cv4abc

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
require(abc.data)
data(musigma2)
## this data set contains five R objects, see ?musigma2 for
## details 

## cv4abc() calls abc(). Here we show two ways for the supplying
## arguments of abc(). 1st way: passing arguments directly. In this
## example only 'param', 'sumstat', 'tol', and 'method', while default
## values are used for the other arguments.
## Number of eval. should be much more greater in realistic settings
cv.rej <- cv4abc(param=par.sim, sumstat=stat.sim, nval=5,
tols=c(.1,.2,.3), method="rejection")

## 2nd way: first creating an object of class 'abc', and then using it
## to pass its arguments to abc().
##
lin <- abc(target=stat.obs, param=par.sim, sumstat=stat.sim, tol=.2,
method="loclinear", transf=c("none","log"))
cv.lin <- cv4abc(param=par.sim, sumstat=stat.sim, abc.out=lin, nval=5,
tols=c(.1,.2,.3))

## using the plot method. Different tolerance levels are plotted with
## different heat.colors. Smaller the tolerance levels correspond to
## "more red" points.
## !!! consider using the argument 'exclude' (plot.cv4abc) to supress
## the plotting of any outliers that mask readibility !!!
plot(cv.lin, log=c("xy", "xy"), caption=c(expression(mu),
expression(sigma^2)))

## comparing with the rejection sampling
plot(cv.rej, log=c("", "xy"), caption=c(expression(mu), expression(sigma^2)))

## or printing results directly to a postscript file...
plot(cv.lin, log=c("xy", "xy"), caption=c(expression(mu),
expression(sigma^2)), file="CVrej", postscript=TRUE)

## using the summary method to calculate the prediction error
summary(cv.lin)
## compare with rejection sampling
summary(cv.rej)