library(knitr)
opts_chunk$set(out.extra='style="display:block; margin: auto"'
    #, fig.align="center"
    , fig.width=4.3, fig.height=3.2, dev.args=list(pointsize=10)
    )
knit_hooks$set(spar = function(before, options, envir) {
    if (before){
        par(las=1)                    # horizontal y-axis labels
        par(mar=c(2.0,3.3,0,0)+0.3)   # plot margins
        par(tck=0.02)                 # axis-tick length (positive: ticks inside the plot)
        par(mgp=c(1.1,0.2,0))         # positions of axis title, axis labels, and axis line
     }
})

Using parameter blocks

This vignette details how different subsets, i.e. blocks, of the parameter vector are sampled by different log-density functions. The background is described in Wutzler & Carvalhais (2014), doi: 10.1002/2014jg002650.

if(!exists("isDevelopMode")) library(blockDemac)
set.seed(0815)      # for reproducible results

The example model and data

In the model two streams of predictions are generated, each using the same two parameters. The variance in the first stream is dominated by the first parameter a, while the variance in the second stream is dominated by the second parameter b.

modTwTwoDenEx1 <- function(
        theta           ##<< model parameters a and b
        , xSparse       ##<< numeric vector of Sparse input/output relationship
        , xRich         ##<< numeric vector of rich input/output relationship
        , thresholdCovar=0      ##<< model structural deficiency
){
    list(  y1 = as.numeric(theta[1]*xSparse + theta[2]*mean(xRich)/10)
           ,y2 = as.numeric(theta[1]*xSparse[1] + theta[2]*pmax(0,xRich-thresholdCovar) ))
}

The synthetic data is generated by running the model. The second stream, y2, with covariate xRich has 100 times as many records as the first stream, y1, with covariate xSparse.

xSparse <- c(1,runif(9,min=0.5,max=1.5))  # only 10 observations
xRich <- runif(1000,min=.7,max=1)    # 1000 observations    
thetaTrue <- c( a=1, b=2 )
theta0 <- thetaTrue
covarTheta0 <- diag(0.2*thetaTrue)
obsTrue <- modTwTwoDenEx1( thetaTrue, xSparse=xSparse,xRich=xRich
        , thresholdCovar=0.3)
sdObsTrue <- sdObs <- list( 
        y1=mean(obsTrue$y1)*0.06
        ,y2=mean(obsTrue$y2)*0.02 )
obs <- list(
        y1 = obsTrue$y1 + rnorm(length(obsTrue$y1),sd=sdObsTrue$y1) 
        ,y2 = obsTrue$y2 + rnorm(length(obsTrue$y2),sd=sdObsTrue$y2) )
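
A quick look at the generated data confirms the 10 versus 1000 record counts and shows the observation uncertainties used to add the noise (illustration only):

sapply(obs, length)   # y1: 10 records, y2: 1000 records
unlist(sdObsTrue)     # observation standard deviations of the two streams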

LogDensity functions

For simplicity we assume no prior information and a Gaussian distribution of model-data residuals. We specify one log-density function for the likelihood of the sparse data stream and another for the likelihood of the rich data stream.

denSparse <- function( theta, intermediates, logDensityComponents
    ,xSparse, xRich, obs, sdObs, ...
){
    pred <- modTwTwoDenEx1(theta, xSparse=xSparse, xRich=xRich, ...) 
    misfit <- obs$y1 - pred$y1
    structure( -1/2 * sum((misfit/sdObs$y1)^2), names='y1' )
}
logDensityComponentsSparse0 <- denSparse( theta0, list(), numeric(0), xSparse, xRich, obs, sdObsTrue )

denRich <- function( theta, intermediates, logDensityComponents
        ,xSparse, xRich, obs, sdObs, ...
){
    pred <- modTwTwoDenEx1(theta, xSparse=xSparse, xRich=xRich, ...) 
    misfit <- obs$y2 - pred$y2
    structure( -1/2 * sum((misfit/sdObs$y2)^2), names='y2' )
}
logDensityComponentsRich0 <- denRich( theta0, list(), numeric(0), xSparse, xRich, obs, sdObsTrue )
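
The value -1/2 * sum((misfit/sdObs)^2) returned by both functions is the Gaussian log-likelihood of the residuals up to an additive constant, which is fixed here because the observation error is assumed known. A small check against dnorm makes this explicit (illustration only; predCheck is introduced just for this check):

predCheck <- modTwTwoDenEx1(theta0, xSparse=xSparse, xRich=xRich)
llFull  <- sum(dnorm(obs$y1, mean=predCheck$y1, sd=sdObsTrue$y1, log=TRUE))
llConst <- -length(obs$y1) * log(sdObsTrue$y1 * sqrt(2*pi))   # additive constant
c(component = unname(logDensityComponentsSparse0), fullMinusConst = llFull - llConst)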

Both densities rely on the same forward model run pred <- modTwTwoDenEx1(theta, ...). If such a run is expensive, one can make both densities depend on the same intermediate and save some of the runs, as described in the vignette on Sampling observation uncertainty.
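
The intermediate mechanism itself is described in that vignette. Purely as an illustration of the idea of reusing an expensive forward run, the following sketch caches the last model result in a closure; it is not the blockDemac intermediate API, and makeCachedModel is a hypothetical helper introduced only here.

makeCachedModel <- function(fModel, ...){
    # covariates are fixed in the closure; the cache is keyed on theta only
    modelArgs <- list(...)
    lastTheta <- NULL
    lastPred  <- NULL
    function(theta){
        if (is.null(lastTheta) || !identical(theta, lastTheta)) {
            lastPred  <<- do.call(fModel, c(list(theta), modelArgs))
            lastTheta <<- theta
        }
        lastPred
    }
}
cachedModel <- makeCachedModel(modTwTwoDenEx1, xSparse=xSparse, xRich=xRich)
pred1 <- cachedModel(theta0)   # evaluates the model
pred2 <- cachedModel(theta0)   # same theta: reuses the cached result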

Specifying the parameter blocks

Since variation in the sparse data stream is dominated by parameter a, we let this parameter be updated by the likelihood of the sparse data stream. Similarly, the block for parameter b is updated by the likelihood of the rich data stream.

argsFLogDensity = list(xSparse=xSparse, xRich=xRich, obs=obs, sdObs=sdObs)
sparseBlockSpec <- blockSpec("a",names(theta0),
        new("MetropolisBlockUpdater",
                fLogDensity = denSparse,
                argsFLogDensity = argsFLogDensity,      
                logDensityComponentNames = names(logDensityComponentsSparse0)
        )
)
richBlockSpec <- blockSpec("b",names(theta0),
        new("MetropolisBlockUpdater",
                fLogDensity = denRich,
                argsFLogDensity = argsFLogDensity,      
                logDensityComponentNames = names(logDensityComponentsRich0)
        )
)
blockSpecs = list( sparse=sparseBlockSpec, rich=richBlockSpec )

Sampling the posterior

Invoking the sampler with several blocks is the same as with a single block, as described in the vignette on the simple model.

sampler <- newPopulationSampler( blockSpecs, theta0, covarTheta0, nPopulation=1L )
sampler <- setupAndSample(sampler, nSample=60L)
sampleLogs <- getSampleLogs(sampler)
plot( asMcmc(sampleLogs), smooth=FALSE )

stacked <- stackChainsInPopulation(subsetTail(getSampleLogs(sampler),0.8))
sample1 <- t(getParametersForPopulation(stacked, 1L)[,,1L])
summary(sample1)

What are the best parameter estimates?

The two data streams are conflicting because the model is somewhat biased. Therefore, the two densities denSparse and denRich peak at different locations in parameter space. The color code of the following figure ranks the samples by different criteria.

# append the within-block sums across the tempered logDensityComponents
ss <- cbind( sample1,  t(getLogDensitiesForPopulation(stacked,1L)[,,1L]))
oldPar <- par(mfrow=c(1,5), mar=c(2.0,3.3,2.0,0)+0.3)
colh <- heat.colors(nrow(ss))
plot( b ~ a, data=as.data.frame(ss[ rev(order(ss[,"sparse"])),])
    , col=colh, main="rank sparse" )
plot( b ~ a, data=as.data.frame(ss[ rev(order(ss[,"rich"])),])
    , col=colh, main="rank rich" )
plot( b ~ a, data=as.data.frame(ss[ rev(order(rowSums(ss[,c("sparse","rich")]))),])
    , col=colh, main="rank sum" )
plot( b ~ a, data=as.data.frame(ss[ rev(order(rowSums(apply(ss[,c("sparse","rich")],2,rank)) )),])
    , col=colh, main="sum ranks" )
ranks <- computeSampleRanksForPopulation(stacked, 1L)[,1L]    
plot( b ~ a, data=as.data.frame(ss[ranks,])
    , col=colh, main="min ranks" )

The first plot shows that the sparse data stream effectively constrains parameter a. The second plot shows that the rich data stream constrains a combination of both parameters, with the best parameters having very low values of a.
The third plot shows that the influence of the sparse data stream is negligible in the sum of the two densities. Only in the last two plots does the color code roughly match the density of the sample.

Instead of ranking by the sum of the densities, the sum or the minimum of the per-density ranks is a better criterion for determining the best parameters for a given set of densities. The latter criterion is implemented in function computeSampleRanksForPopulation.
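
As an illustration only (not the package implementation), the minimum-rank criterion can be computed by hand from the columns of ss assembled above:

rankMat <- apply(ss[, c("sparse", "rich")], 2, rank)  # higher rank = higher log-density
minRank <- apply(rankMat, 1, min)                     # the worse of the two ranks
thetaBestMinRank <- ss[which.max(minRank), c("a", "b")]
thetaBestMinRank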


