Estimating posterior model probabilities
Description
Model selection with Approximate Bayesian Computation (ABC).
Usage
1 2 3 
Arguments
target 
a vector of the observed summary statistics. 
index 
a vector of model indices. It can be character or numeric and
will be coerced to factor. It must have the same length as

sumstat 
a vector, matrix or data frame of the simulated summary statistics. 
tol 
numeric, the required proportion of points nearest the target values (tolerance), or a vector of the desired tolerance values. If a vector is given 
subset 
a logical expression indicating elements or rows to keep. Missing
values in 
method 
a character string indicating the type of simulation required.
Possible values are 
corr 
logical, if 
kernel 
a character string specifying the kernel to be used when method is

numnet 
the number of neural networks when method is 
sizenet 
the number of units in the hidden layer. Can be zero if there are no skiplayer units. 
lambda 
a numeric vector or a single value indicating the weight decay when
method is 
trace 
logical, 
maxit 
numeric, the maximum number of iterations. Defaults to 500. See also

... 
other arguments passed on from 
Details
The function computes the posterior model probabilities. Simulations
have to be performed with at least two distinct models. When method is
"rejection"
, the posterior probability of a given model is
approximated by the proportion of accepted simulations given this
model. This approximation holds when the different models are a priori
equally likely, and the same number of simulations is performed for
each model. When method is "mnlogistic"
the posterior model
probabilities are estimated using a multinomial logistic regression as
implemented in the function multinom
from the package
nnet
. When method is "neuralnet"
, neural networks
are used to predict the probabilities of models based on the observed
statistics using nnet
. This method can be useful if many
summary statistics are used.
Names for the summary statistics are strongly recommended. Names can
be supplied as colnames to sumstat
(and target). If no names are
supplied S1, S2, ... to summary statistics will be assigned to
parameters and the user will be warned.
Value
An object of class
"postpr"
, containing the following
components:
pred 
a vector of model probabilities when method is

values 
the vector of model indices in the accepted region using the rejection method. 
weights 
vector of regression weights when method is

ss 
summary statistics in the accepted region. 
call 
the original call. 
na.action 
a logical vector indicating the elements or rows that
were excluded, including both 
method 
a character string indicating the method used, i.e.

corr 
logical, if 
nmodels 
the number of simulations performed for each model a priori. 
models 
a character vector of model names (a priori). 
numstat 
the number of summary statistics used. 
names 
a list of two elements: 
Author(s)
Katalin Csillery, Olivier Francois and Michael Blum with some initial code from Mark Beaumont.
References
Beaumont, M.A. (2008) Joint determination of topology, divergence time, and immigration in population trees. In Simulation, Genetics, and Human Prehistory (Matsumura, S., Forster, P. and Renfrew, C., eds) McDonald Institute for Archaeological Research
See Also
summary.postpr
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31  ## an artifical example
ss < cbind(runif(1000),rt(1000,df=20))
postpr(target=c(3), index=c(rep("norm",500),rep("t",500)),
sumstat=ss[,1], tol=.1, method="rejection")
## human demographic history
require(abc.data)
data(human)
## five R objects are loaded. See ?human and vignette("abc") for details.
## illustrate the informativeness of two summary statistics: mean and
## variance of Tajima's D
par(mfcol = c(1,3))
boxplot(stat.3pops.sim[,"pi"]~models, main="Mean nucleotide diversity")
boxplot(stat.3pops.sim[,"TajD.m"]~models, main="Mean Tajima's D")
boxplot(stat.3pops.sim[,"TajD.v"]~models, main="Var in Tajima's D")
## model selection with ABC for the European population
modsel.it < postpr(stat.voight["italian",], models, stat.3pops.sim, tol=.05, method="mnlogistic")
summary(modsel.it)
## In Europe, the most supported model
## is a population bottleneck
## Check that in Africa, population expansion is the most supported model, while in
## Asia, it is a population bottleneck
##modsel.ha < postpr(stat.voight["hausa",], models, stat.3pops.sim,
##tol=.05, method="mnlogistic")
##modsel.ch < postpr(stat.voight["chinese",], models, stat.3pops.sim,
## tol=.05, method="mnlogistic")
