Summary statistics selection by minimizing a posterior sample measure.

Description

The function cycles through all possible subsets of summary statistics and computes a criterion from the posterior sample. The subset which achieves the minimum is chosen as the most informative subset.

Usage

mincrit(obs, param, sumstats, obspar = NULL, abcmethod = abc,
crit = nn.ent, sumsubs = 1:ncol(sumstats), limit=length(sumsubs), 
do.only = NULL, verbose = TRUE, do.crit = TRUE, do.err = FALSE, 
final.dens = FALSE, errfn = rsse, ...)

Arguments

obs

(matrix of) observed summary statistics.

param

matrix of simulated model parameter values.

sumstats

matrix of simulated summary statistics.

obspar

optional observed parameters (for use to assess simulation performance).

abcmethod

a function to perform ABC inference, e.g. the abc function from package abc.

crit

a function to minimize to measure information from a posterior sample, e.g. nn.ent.

sumsubs

an optional index into the summary statistics to limit summary selection to a specific subset of summaries.

limit

an optional integer giving the maximum size of the summary statistics subsets to consider. By default, subsets of every size are considered.

do.only

an optional index into the summary statistics combination table. Can be used to restrict the criterion calculations to certain summary statistics subsets only.

verbose

a boolean value indicating whether informative statements should be printed to screen.

do.crit

a boolean value indicating whether the measure on the posterior sample should be returned.

do.err

a boolean value indicating whether the simulation error should be returned. Note: if do.err=TRUE, obspar must be supplied.

final.dens

a boolean value indicating whether the posterior sample should be returned.

errfn

an error function to assess ABC inference performance.

...

any other optional arguments to the ABC inference procedure (e.g. arguments to the abc function).
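The effect of limit on the size of the search can be illustrated with a short calculation. This is a sketch that assumes the subsets considered are all non-empty subsets of the chosen statistics with at most limit members; n_subsets is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper (not part of abctools): number of non-empty subsets
# of k summary statistics that contain at most `limit` statistics.
n_subsets <- function(k, limit = k) sum(choose(k, seq_len(limit)))

n_subsets(3)              # 7: every non-empty subset of three statistics
n_subsets(10)             # 1023
n_subsets(10, limit = 2)  # 55: single statistics and pairs only
```

Since one ABC analysis is needed per subset, a small limit can reduce the run time considerably when many statistics are available.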

Details

The function uses a criterion (e.g. sample entropy) as a proxy for the information in a posterior sample. The criterion is computed for each possible subset of statistics, and the best subset is the one that attains the minimum of the resulting vector of criterion values.
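The search described above can be sketched in base R. This is an illustration under stated assumptions, not the package implementation: reject_abc is a hypothetical stand-in for abcmethod (plain rejection ABC), post_var is a hypothetical stand-in for crit (the sample variance rather than an entropy estimate), and the data are simulated toy values.

```r
# Illustrative sketch only: not the abctools implementation.
set.seed(1)

# Toy simulations: one parameter and three summaries (the third is pure noise).
n <- 5000
param <- runif(n, 0, 10)
sumstats <- cbind(s1 = param + rnorm(n, sd = 0.5),
                  s2 = param + rnorm(n, sd = 2),
                  s3 = rnorm(n))
obs <- c(s1 = 5, s2 = 5, s3 = 0)

# Hypothetical stand-in for abcmethod: rejection ABC keeping the fraction
# `tol` of simulations whose scaled summaries are closest to the observation.
reject_abc <- function(obs, param, sumstats, tol = 0.01) {
  scaled <- scale(sumstats)
  target <- (obs - attr(scaled, "scaled:center")) / attr(scaled, "scaled:scale")
  d <- sqrt(rowSums(sweep(scaled, 2, target)^2))
  param[order(d)[seq_len(ceiling(tol * length(param)))]]
}

# Hypothetical stand-in for crit: variance of the posterior sample
# (a smaller value means a more concentrated posterior).
post_var <- function(x) var(x)

# Enumerate every non-empty subset of the summary statistics.
k <- ncol(sumstats)
subsets <- unlist(lapply(seq_len(k), function(m) combn(k, m, simplify = FALSE)),
                  recursive = FALSE)

# Compute the criterion for each subset, then pick the minimiser.
critvals <- vapply(subsets, function(idx) {
  post <- reject_abc(obs[idx], param, sumstats[, idx, drop = FALSE])
  post_var(post)
}, numeric(1))

best <- subsets[[which.min(critvals)]]
```

Here best holds the column indices of the subset with the smallest criterion value, and critvals holds one value per subset in the order of subsets.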

Value

A list with the following components:

best

the best subset(s) of statistics.

critvals

the calculated criterion values (if do.crit=TRUE).

err

simulation error (if obspar is supplied and do.err=TRUE).

order

the subsets considered during the algorithm (same as the input do.only).

post.sample

an array of dimension nacc x npar x ndatasets giving the posterior sample for each observed dataset. Not returned if final.dens=FALSE.

sumsubs

an index into the subsets considered during the algorithm.

Warning

These functions are computationally intensive because the ABC inference procedure is repeated for each subset of summary statistics considered.

Author(s)

Matt Nunes

References

Nunes, M. A. and Balding, D. J. (2010) On Optimal Selection of Summary Statistics for Approximate Bayesian Computation. Stat. Appl. Gen. Mol. Biol. 9, Iss. 1, Art. 34.

Nunes, M. A. and Prangle, D. (2016) abctools: an R package for tuning approximate Bayesian computation analyses. The R Journal 7, Issue 2, 189–205.

See Also

nn.ent

Examples

# load example data:

data(coal)
data(coalobs)

param<-coal[,2]
simstats<-coal[,4:6]

# use matrix below just in case to preserve dimensions.

obsstats<-matrix(coalobs[1,4:6],nrow=1)
obsparam<-matrix(coalobs[1,1])

# example of entropy minimization algorithm:

tmp <- mincrit(obsstats, param, simstats, tol = .01, method = "rejection",
               do.crit = TRUE)

tmp$critvals