In blockDemac: parallel DEMC with several parameter blocks

library(knitr)
opts_chunk$set(out.extra='style="display:block; margin: auto"'
    #, fig.align="center"
    , fig.width=4.3, fig.height=3.2, dev.args=list(pointsize=10)
    )
knit_hooks$set(spar = function(before, options, envir) {
    if (before){
        par( las=1 )                   #also y axis labels horizontal
        par(mar=c(2.0,3.3,0,0)+0.3 )  #margins
        par(tck=0.02 )                          #axe-tick length inside plots             
        par(mgp=c(1.1,0.2,0) )  #positioning of axis title, axis labels, axis
     }
})

Simple example of inverting a model based on a single cost function

Preparing the used model and the data

First, the library is loaded.

if(!exists("isDevelopMode")) library(blockDemac)
set.seed(0815)      # for reproducible results

The simple example model is just a linear regression. The model functions takes two arguments, a vector theta with entries "a" and "b", and a vector xval with the values of the predictor. The output is a vector of predictions.

simpleModel <- function(
    ### example model function: y=a+bx
    theta,  ##<< parameter vector with names a and b
    xval    ##<< covariate vector
){ 
    theta["a"] + theta["b"]*xval 
}

We want to infer the posterior distribution of parameters when inverting the model using artificial data. First, the artificial data is generated by using the model with known true parameters and addeing Gaussian noise.

thetaTrue = c(a=10,b=5) # the true parameter vector
xval <- runif(30,min=5,max=10)  # the covariates
sdObs = 0.5*xval^0.8   # heteroscedastic: increase with covariate
obs = simpleModel(thetaTrue, xval) + rnorm(length(xval), sd=sdObs)
plot( obs ~ xval )

Preparing the misfit function (logDensity)

The most important ingredient in the inversion is the logDensity function, which tells how probable a parameter vector is, given the data and the prior information. Here, we assume a Gaussian measurement error of the data, and specify a function returning the log of a normal density: -1/2 times the sum of squared normalized residuals. This is one of the most widely used cost functions. For simplicity, we assume to have no prior information.

    logDenGaussianSimpleModel <- function( theta, intermediates, logDensityComponents
        , xval, obs, sdObs
    ){
        predicted <- simpleModel(theta, xval)
        logDen <- c(obs= -1/2 * sum( ((predicted-obs)/sdObs)^2 ))
    }
    theta0 = c(a=9,b=4) # initial estimate
    (testLogDenResult <- logDenGaussianSimpleModel(theta0, list(), numeric(1), xval, obs, sdObs ))

With additional argument logDensityComponents, the current value of the result components is provided. With argument intermediates intermediate results are handled efficiently, that several blocks depend on. In this example, we do not make use of those.

The logDensity function can return more than one component, i.e. a named numeric vector. This allows weighting their importance within the sum during the sampling. This can be done by specifying a vector different T0 and TEnd, i.e. 1/weight, for all components of the logDensities when sampling by setupAndSample(PopulationSampler).

Setting up the sampler and perform the sampling

In this example, we specify only one parameter block that is updated by a MetropolisBlockUpdater:

blocks <- list(
    m1 = blockSpec(,,   # empty defaults to all parameters 
                new("MetropolisBlockUpdater",
                    fLogDensity=logDenGaussianSimpleModel,
                    argsFLogDensity=list(xval=xval, obs=obs, sdObs=sdObs),
                    logDensityComponentNames = names(testLogDenResult)
                )
    )
)

In order to set up the MetropolisBlockUpdater, we provide the logDensity function. The logDenGaussianSimpleModel model function takes further arguments in addition to the parameter vector and the vector of already calculated logDensityComponents. These must be provided with list argsFLogDensity. In addition, we already need to specify the names of the components of the logDensity function.

In addition to the block specifications, we must provide an initial parameter estimate (theta0 specified already above) together with a conservatively broad spread. That will be used to set up the initial population of parameter vectors.

covarTheta0 <- diag(c(0.8,0.6)^2)

With all these ingredients we are ready to construct the sampler and initiate the sampling.

sampler <- newPopulationSampler( blocks, theta0, covarTheta0 )
sampler <- setupAndSample(sampler, nSample=20L)

When calling setupAndSample, one can specify the number of samples, thinning intervals and also the temperatures used in the simulated annealing. Here the default of temperature 1 is used, i.e. standard Markov Chain without simulated annealing.

Inspecting the sample

The sampleLogs object is obtained from the sampler. There are many functions to ask this object for sampling properties, e.g. getNSamplePopulations. A SampleLogs object can be inspected and plotted using the methods from the coda package. And it can be transformed, e.g. by functions subsetTail or squeeze.

    sampleLogs <- getSampleLogs(sampler)
    getNSamplePopulations(sampleLogs)
    mcmc <- asMcmc(sampleLogs)
    plot( mcmc, smooth=FALSE )
    sampleLogsTail <- subsetTail(sampleLogs, 0.5) # omit the burnin
    gelman.diag(asMcmc(sampleLogsTail))

In the example, the few samples are not sufficient to give a good convergence. Hence we tell the sampler to continue the sampling.

    sampler <- setupAndSample(sampler, nSample=30L)

And inspect it again:

    sampleLogs <- getSampleLogs(sampler)
    getNSamplePopulations(sampleLogs)
    mcmc <- asMcmc(sampleLogs)
    plot( mcmc, smooth=FALSE )
    sampleLogsTail <- subsetTail(sampleLogs, 0.8) # omit the burnin
    gelman.diag(asMcmc(sampleLogsTail))

Based on values of the samples, arbitrary statistics can be calculated. The dealing with multiple chains can be simplified by aggregating samples of all chains within each population into one big chain by stackChainsInPopulation.

   stacked <- stackChainsInPopulation(sampleLogsTail)
   iPop <- 1L
   #iPop <- 2L
   parms <- getParametersForPopulation(stacked, iPop)[,,1L]
   rank <- computeSampleRanksForPopulation(stacked, iPop)   
   (thetaBest <- parms[,rank[1L]])  
   (thetaMedian <- apply( parms,1L, median ))
   (cf95 <- apply(parms, 1L,  quantile, probs = c(0.025, 0.975)))