Exploring the behavior and ifluence of the prior in the tgp package. It appears there may be a bug (or rather incorrect documentation) in how the mixed Gamma priors are defined for the nugget nug and range / length-scale d.

opts_knit$set(upload.fun = socialR::flickr.url)
opts_chunk$set(dev.args=list(bg="transparent"), comment=NA, tidy=FALSE, message=FALSE)
library(MCMCpack)
library(tgp)
library(reshape2)
library(ggplot2)

Consider some very simple data:

X = seq(-5, 5, length=50)
x = c(-4, -3, -1,  0,  2)
y = c(-2,  0,  1,  2, -1)

Priors

We choose prior parameters that give us very strong priors around particular values.

s2.p <- c(50,50) # InvGamma, approaches Gaussian approaching delta-fn around 1
tau2.p <- c(20,1) # InvGamma, approaches exponential approaching delta-fn around 0
d.p = c(10, 0.01, 10, 0.01) # Mixed Gammas, approaches Gaussian around 0.1, with var .001
nug.p = c(10, 0.1, 10, 0.1) # Mixed Gammas, approaches Gaussian around 1, with var .1

Visualize the priors on lengthscale and nugget:

d_prior <- function(x) dgamma(x, d.p[1], scale = d.p[2]) + dgamma(x, d.p[3], scale = d.p[4])
nug_prior <- function(x) dgamma(x, nug.p[1], scale = nug.p[2]) + dgamma(x, nug.p[3], scale = nug.p[4])
xx <- seq(.0001, 2, length.out=100)
priors <- data.frame(x = xx, nug = nug_prior(xx), d = d_prior(xx))
priors <- melt(priors, id="x")
ggplot(priors) + geom_line(aes(x, value)) + facet_wrap(~variable, scale="free")

Solve the gp with these priors:

gp <- bgp(X=x, XX=X, Z=y, verb=0,
          meanfn="constant", bprior="b0", BTE=c(1,2000,1), m0r1=FALSE, 
          corr="exp", trace=TRUE, beta = 0,
          s2.p = s2.p, d.p = d.p, nug.p = nug.p,
          s2.lam = "fixed", d.lam = "fixed", nug.lam = "fixed", 
          tau2.lam = "fixed", tau2.p = tau2.p)

Extract the posterior Gaussian process mean and the $\pm 2$ standard deviations over the predicted grid from the fit:

V <- gp$ZZ.ks2
tgp_dat <- data.frame(x   = gp$XX[[1]], 
                  y   = gp$ZZ.km, 
                 ymin = gp$ZZ.km - 1.96 * sqrt(gp$ZZ.ks2), 
                 ymax = gp$ZZ.km + 1.96 * sqrt(gp$ZZ.ks2))

For comparison, let us manually estimate the GP with hyperparameters fixed at the mean value of the priors.

d = .1; epsilon = .1; sigma = 1; #fixed hyperparamaters
SE <- function(Xi,Xj, d) sigma * exp(- (Xi - Xj) ^ 2 / d)
cov <- function(X, Y) outer(X, Y, SE, d) 
cov_xx_inv <- solve(cov(x, x) + epsilon * diag(1, length(x)))
Ef <- cov(X, x) %*% cov_xx_inv %*% y
Cf <- cov(X, X) - cov(X, x)  %*% cov_xx_inv %*% cov(x, X)
manual_dat <- data.frame(x = X, 
                         y = Ef, 
                         ymin = (Ef - 1.96 * sqrt(diag(Cf))), 
                         ymax = (Ef + 1.96 * sqrt(diag(Cf))))

Compare the GP posteriors in a plot:

ggplot(tgp_dat) +
    geom_ribbon(aes(x, y, ymin = ymin, ymax = ymax), fill="red", alpha = .1) + # Var
    geom_line(aes(x, y), col="red") + # mean
    geom_point(data = data.frame(x = x, y = y), aes(x, y)) + # raw data
    geom_ribbon(dat = manual_dat, aes(x, y, ymin = ymin, ymax = ymax), fill = "blue", alpha = .1) + # Var
    geom_line(dat = manual_dat, aes(x, y), col = "blue")  + #MEAN    
    theme_bw() + theme(plot.background = element_rect(fill = "transparent",colour = NA))

Wait a minute. The Bayesian estimate of the lengthscale seems way too long, given our strongly informative prior. Likewise the measurement uncertainty appears to have induced a rather broad distribution. Let's look at the posteriors:

require(reshape2)
hyperparameters <- c("index", "s2", "tau2", "beta0", "nug", "d", "ldetK")
posteriors <- melt(gp$trace$XX[[1]][,hyperparameters], id="index")
ggplot(posteriors) + geom_histogram(aes(value)) + facet_wrap(~variable, scales="free")

Hmm... posteriors indeed show a huge value for the length scale d, and for the nugget nug. Looks supiciously like the mixed Gamma prior when the scale parameter is used as the rate parameter (it's inverse):

Compare to priors with scale misplaced as rate in d, nug

d_prior <- function(x) dgamma(x, d.p[1], d.p[2]) + dgamma(x, d.p[3], d.p[4])
nug_prior <- function(x) dgamma(x, nug.p[1], nug.p[2]) + dgamma(x, nug.p[3], nug.p[4])
xx <- seq(.0001, 2500, length.out=1000)
priors <- data.frame(x = xx, nug = nug_prior(xx), d = d_prior(xx))
priors <- melt(priors, id="x")
ggplot(priors) + geom_line(aes(x, value)) + facet_wrap(~variable, scale="free")


cboettig/nonparametric-bayes documentation built on May 13, 2019, 2:09 p.m.