In mathieucarmassi/calibrationcode: Code Calibration in a Bayesian Framework

knitr::opts_chunk$set(
  echo=TRUE, cache=FALSE)
if (Sys.info()[['sysname']]=="Windows")
{
    windowsFonts(Times=windowsFont("TT Times New Roman"))
}

The package CaliCo allows to rule a Bayesian calibration on every kind of code ("black box" or analytic codes). The aim of the Bayesian calibration is to better asses the uncertainty on the parameters of the code. In CaliCo, the code to calibrate has to be defined with two kinds of inputs. The first concerns the forced variables and the second the parameters. $$ f:\begin{align} \mathbb{R}^2 & \mapsto \mathbb{R} \ (x,\theta) &\rightarrow y \end{align} $$ If the code takes several forced variables and parameters, vectors can be used. $$ f:\begin{align} \mathbb{R}^d\times\mathbb{R}^p & \mapsto \mathbb{R} \ (\boldsymbol{x},\boldsymbol{\theta}) &\rightarrow y \end{align} $$

When design matrices are used (especially for the forced variables), the matrix can be implemented straightforwardly

$$ f:\begin{align} \mathcal{M}_{n,d}\times\mathbb{R}^p & \mapsto \mathbb{R}^n \ (\boldsymbol{X},\boldsymbol{\theta}) &\rightarrow \boldsymbol{y} \end{align} $$

Let us consider, for this introduction, a dynamic code with one parameter defined as $$ f_t:\begin{align} \mathbb{R}^2 & \mapsto \mathbb{R} \ (t,\theta) &\rightarrow (6t-2)^2sin(\theta t-4) \end{align} $$

The function $f_t$ is chosen with only one forced variable and one parameter $\theta$. Let us say that experiments are available and can be expressed by $Y_exp$. The belief on the value of $\theta$ shows that a non-negligible gap is present.

library(ggplot2)
n=10
X <- seq(0,1,length.out=n)
code <- function(X,theta)
{
  return((6*X-2)^2*sin(theta*X-4))
}
Yexp <- code(X,10.5)+rnorm(n,0,sqrt(0.01))
Yt <- code(X,11)
gdata <- data.frame(y=Yexp,x=X,type="Experiments")
gdata2 <- data.frame(y=Yt,x=X,type="Theoretical results")
gdata <- rbind(gdata,gdata2)

p1 <- ggplot(gdata,aes(x = x,y=y, col=type))+geom_line() + theme_light() + theme(legend.title = element_blank(),
                                  legend.position = c(0.5,0.8),
                                  legend.background = element_rect(linetype="solid", colour ="grey"),
                                  axis.title=element_text(size=20,family = "Helvetica",face = "italic"),
                                  axis.text = element_text(size=20),
                                  legend.text = element_text(size=10))
p1

The aim of calibration is to better asses the uncertainty on $\theta$ and access the variance of the output from the updated distribution. Several statistical model are available to realize such a study. The need to introduce four statistical models lies in the fact that one could encounter two major difficulities in code calibration. The first is when the code is time consuming and the second is when the code creates a discrepancy. That is why, the choice of the statistical model depends on each particular case [see @Carmassi2018]. The four statistical models implemented are detailed below.

The first model available is: $$\mathcal{M}1:\forall i \in [1,\dots,n] \ Y{exp_i}=f(\boldsymbol{x_i},\boldsymbol{\theta})+\epsilon_i$$ where $Y_{exp_i}$ stands for the $i^{th}$ from $n$ observations, $\boldsymbol{x_i}$ for the vector of controlled variables corresponding, and $\epsilon_i$ for the measurement error. In CaliCo, $\epsilon$ will always be defined as a white Gaussian noise with $\epsilon \overset{iid}{\sim}\mathcal{N}(0,\sigma_{err}^2)$. $\sigma_{err}^2$ stands for the variance of the measurement error and has to be found as much as $\boldsymbol{\theta}$.

The second model intervienes when the code is too long to run. In that case: $$\mathcal{M}2:\forall i \in [1,\dots,n] \ Y{exp_i}=\boldsymbol{F}(\boldsymbol{x_i},\boldsymbol{\theta})+\epsilon_i$$ where $\boldsymbol{F}({\bullet,\bullet})\sim\mathcal{PG}(m({\bullet,\bullet}),c({\bullet,\bullet},{\bullet,\bullet}))$ is a Gaussian process defined for an expectancy $m$ and covariance $c$ functions.

The third model lies on $\mathcal{M}1$ in the way that we consider another error term called discrepancy. $$\mathcal{M}_3:\forall i \in [1,\dots,n] \ Y{exp_i}=f(\boldsymbol{x_i},\boldsymbol{\theta})+\delta(\boldsymbol{x_i})+\epsilon_i$$ where $\delta(\boldsymbol{x_i})\sim\mathcal{PG}(m(\bullet),c(\bullet,\bullet))$ is a Gaussian process quantifying the code error (or discrepancy).

In CaliCo, $m(\bullet)$ in discrepancy is set to zero for now.

Similarly, the fourth model is defined from $\mathcal{M}2$ by adding the discrepancy: $$\mathcal{M}_4:\forall i \in [1,\dots,n] \ Y{exp_i}=F(\boldsymbol{x_i},\boldsymbol{\theta})+\delta(\boldsymbol{x_i})+\epsilon_i$$

To run a Bayesian calibration in CaliCo, the statistical model must be chosen first. Then the prior densities of the parameters has to be set up as well. Then the calibration can be executed by the function calibrate.

Define the statistical model

As detailed before, there is four models. Two of them use a Gaussian process to emulate the code. The users is free to control the parameters of the estimation which is realized by the DiceKrigging package (see [@Roustant2012]).

Let us consider, the set of experiments generated for $\theta=10.9$ and a measurement error of $\sigma_{err}^2=0.01$. Let us also consider that the prior belief on $\theta$ is that $\theta \sim \mathcal{N}(11,0.01)$ and $\sigma_{err}^2\sim\Gamma(1,0.01)$.

theta.prior <- rnorm(100,11,sqrt(.5))
sigma.prior <- 0.01

Y_prior <- matrix(nr=100,nc=n)
for (i in 1:100){
  Y_prior[i,] <- code(X,theta.prior[i])+rnorm(n,0,sqrt(sigma.prior))
}

qq <- apply(Y_prior,2,quantile,probs=c(0.05,0.95))
gdata <- data.frame(y=Yexp,x=X,upper=qq[2,],lower=qq[1,],type="Experiments",fill="credibility interval a priori at 90%")
gdata2 <- data.frame(y=Yt,x=X,upper=qq[2,],lower=qq[1,],type="Prior belief",fill="credibility interval a priori at 90%")
gdata <- rbind(gdata,gdata2)

p2 <- ggplot(gdata)+geom_line(aes(x=x,y=y,col=type))+geom_ribbon(aes(x=x,ymax=upper,ymin=lower,fill=fill),alpha=0.3)+theme_light()+theme(legend.title = element_blank(),
                                  legend.position = c(0.3,0.8),
                                  legend.background = element_rect(linetype="solid", colour ="grey"),
                                  axis.title=element_text(size=20,family = "Helvetica",face = "italic"),
                                  axis.text = element_text(size=20),
                                  legend.text = element_text(size=10))
p2

After $100$ realizations of the prior densities, the credibility interval a priori at 90\% is wide and it can be overly imprecise for one purpose.

Model 1

To implement $\mathcal{M}_1$ in CaliCo, it is only necessary to:

library(CaliCo)
# Number of experiments
n=10
# Time interval
t <- seq(0,1,length.out=n)
# Code definition
code <- function(t,theta)
{
  return((6*t-2)^2*sin(theta*t-4))
}
# Generate the experiment
Yexp <- code(t,10.5)+rnorm(n,0,sqrt(0.01))
# Generate the first model
model1 <- model(code=code,X=t,Yexp=Yexp,model="model1")

The function model creates a model.class object which is a R6 object. All the fields are accessible from the object created by the operator $ but two main methods are implemented, are print and plot, are usable as functions.

print(model1)

To access graphical visualization of the model, parameter values need to be added. The pipe \%<\% defined is CaliCo allows to load parameter values into a statistical model. The argument of the pipe is a list which has for name theta, thetaD and var. These values stand for, respectively, the input parameter vector, the vector composed of the parameter of the discrepancy (only for $\mathcal{M}_3$ and $\mathcal{M}_4$, not necessary for $\mathcal{M}_1$ or $\mathcal{M}_2$) wich are the variance of the covariance function and the correlation lenght, and the variance of the measumrement error. The procedure for $\mathcal{M}_1$ is done as the following code line.

model1 %<% list(theta=11,var=0.01)

A Warning message appears everytime to sensitize the user to be careful with the size of the paramerter vector. Then to display the results of the model, the function plot can be used straightforwardly. The first argument of the function is the model, one wants to plot and the second argument is the x-axis. These two arguments are required to get the graph output. An additional argument, CI, allows to control the credibility interval one wants to see. By default CI="all", but if CI="err" only the $95\%$ CI of the measurment error with or not the discrepancy is given. Similarly, for $\mathcal{M}_2$ and $\mathcal{M}_4$ if CI="GP" only the $95\%$ CI of the Gaussian process is shown.

The plot requires some inputs (\emph{i.e.} plot(model,theta,var)) where theta is the parameter value or vector and var the variance of the white Gaussian noise which represents the measurmement error.

plot(model1,t)

Note that the method plot generates a ggplot. It is possible to store the plot into a variable and make some changes. For example, if one is interested in adding a title, the labels, decrease the size of the legend and move it a little:

library(ggplot2)
p <- plot(model1,t)
p+ggtitle("Model1 and experiments")+ylab("y")+xlab("t")+
  theme(legend.position=c(0.3,0.75),legend.text=element_text(size = '7'))

Model 2

The second model is usefull when the code is time consuming. The package CaliCo offers a flexibility from this point with three possibilities. The first, one possesses the code explicitely in R with the same features than before (note that if the code has no input variables, the function needs to be defined indentically as before but in the function model only set X=0) but without any design of experiments (DOE). In that cas, the function model creates a new maximin Latin Hypercube Space (LHS) and evaluates the code at the locations selected by the DOE. The options to establish such a DOE are controlled by the user with the option opt.emul. If one possesses the code and a specific DOE, it is also possible to give both of them at the function model. Then, if no code is available but a DOE with the corresponding outputs of the code are in the user's possession, another option opt.sim allows to build the model on these specifications. Let us detail these three possibilities.

Numerical code without DOE

In this case, one possesses the code $f_t$ and has no DOE. To create a Gaussian process that emulates the code, an option needs to be filled in the function model which is opt.gp. This option is a list of two components:

type the type of the kernel in agreement with km function from DiceKrigging package
DOE specific DOE given by the user (default value NULL).

If this case DOE is let to the NULL value. Then, the option opt.emul is useful to control the building of the DOE and is composed of:

p the number of parameter in the code
n.emul the number of experiments in the DOE (default value 100)
binf the lower bound of the parameter value or vector (default value 0)
bsup the upper bound of the parameter value or vector (default value 1)

The creation of the model is completed by the following code.

model2code <- model(code,X,Yexp,"model2",
                    opt.gp=list(type="matern5_2", DOE=NULL),
                    opt.emul=list(p=1,n.emul=40,binf=8,bsup=13))

The output is the results of the optimization done by DiceKriging for estimating the parameters of the Gaussian process.

Numerical code with DOE

In that case, one is free to generate a DOE appart from the function model. Let us consider, for the following, that the DOE is generated by the package DiceDesign, is a maximin LHS of $30$ points and is called myDOE.

myDOE <- DiceDesign::lhsDesign(30,2)$design
myDOE[,2] <- unscale(myDOE[,2],8,13)

Then, to built the model, the option opt.emul is not needed anymore.

model2doe <- model(code,X,Yexp,"model2",
                   opt.gp=list(type="matern5_2", DOE=myDOE))

DOE with corresponding outputs

In this case, an option is added to opt.gp and allows define a DOE and the corresponding outputs of the code. This option opt.sim is a list of two components:

Ysim the outputs of the code
DOEsim the DOE corresponding

Let us consider that for myDOE all the points are evaluated by $f_t$ and the outputs are collected into a vector called Ys. Then, the model definition is completed with the following code:

Ys <- matrix(nr=30,nc=1)
for (i in 1:30)
{
  Ys[i] <- code(myDOE[i,1],myDOE[i,2])
}

model2 <- model(code=NULL,X,Yexp,"model2",
                opt.gp=list(type="matern5_2", DOE=NULL),
                opt.sim=list(Ysim=Ys,DOEsim=myDOE))

Graphical representation

Similarly as for $\mathcal{M}_1$, parameters has to be added with the pipe \%<\% to access visual representation.

paramList <- list(theta=11,var=0.1)
model2code %<% paramList
model2doe %<% paramList
model2 %<% paramList

plot(model2code,t)
plot(model2doe,t)
plot(model2,t)

Model 3 and Model 4

Both $\mathcal{M}_3$ and $\mathcal{M}_4$ add a discrepancy to respectively $\mathcal{M}_1$ and $\mathcal{M}_2$. Then discrepancy options has to be filled in the model definition. This option is called opt.disc and is a list of only one element kernel.type (further development will be brought in the future to better control the discrepancy by adding an expectancy form). The kernel.type can be:

"gauss" Gaussian
"exp" Exponential
"matern3_2" Matèrn 3/2
"matern5_2" Matèrn 5/2

The user is free to select the form of the covariance structure in the discrepancy. Howerver, the expectancy is set automatically to zero. As an example, for $\mathcal{M}_3$ no surrogate of the code is used then:

model3 <- model(code,X,Yexp,"model3",
                opt.disc=list(kernel.type="matern5_2"))

For $\mathcal{M}_4$, in the third case (where only the DOE and code output are available) the model is generated by:

model4 <- model(code,X,Yexp,"model4",
                opt.gp=list(type="matern5_2", DOE=NULL),
                opt.sim=list(Ysim=Ys,DOEsim=myDOE),
                opt.disc=list(kernel.type="gauss"))

Identically as before, parameters are added. However, the parameters relative to the discrepancy needs to be added. As a matter of fact, these parameters are estimated in Bayesian calibration at the same time than theta and var. That is why a parameter vector thetaD is added and contains $\sigma_{\delta}^2$ and $\psi$ the variance and the correlation length of the Gaussian process of the discrepancy.

model3 %<% list(theta=11,thetaD=c(0.1,0.5),var=0.1)
model4 %<% list(theta=11,thetaD=c(0.1,0.5),var=0.1)

# plot(model3,t)
# plot(model4,t)

Define the prior distributions

In Bayesian calibration a prior distribution is updated in a posterior distribution thanks to the likelihood. The prior distribution, in CaliCo, is defined as the following chunk.

gaussian <- prior(type.prior="gaussian",opt.prior=list(c(0.5,0.001)))
plot(gaussian)

For calibration with $\mathcal{M}_1$ and $\mathcal{M}_2$, only theta and var are calibrated. That means in code calibration of $f_t$ only two prior densities are defined. To define multiple prior densities in CaliCo, one can create a vector of characters for the option type.pior and a list containing the corresponding vectors of the hyperparameters of the distribution. Then in the example:

pr1 <- prior(type.prior=c("gaussian","gamma"),
             opt.prior=list(c(11,0.01),c(1,0.1)))
plot(pr1$Prior1)
plot(pr1$Prior2)

The function prior in this particular case creates a list containing both prior densities. If one is interrested in visualizing the densities, the function plot, used as in the previous code chunck, allow to access the ggplot2 object directly. The list contains prior density in the order set in the prior function. That means for the example given above, Prior1 corresponds to the Gaussian density and Prior2 to the Gamma density (respectively to theta density and var density).

For $\mathcal{M}_3$ and $\mathcal{M}_4$, two additionnal parameters are needed before running calibration. One can define them identically as before.

pr2 <- prior(type.prior=c("gaussian","gamma","unif","gamma"),
             opt.prior=list(c(11,0.01),c(1,2),c(0,1),c(1,0.1)))
plot(pr2$Prior2)
plot(pr2$Prior3)

In CaliCo the order is always defined by theta first, then if there is a discrepancy thetaD and then the variance of the measurment errors var.

Run Bayesian calibration

Let us recall that if $\mathcal{M}1$ or $\mathcal{M}_2$ are selected only $\theta$ and $\sigma{err}^2$ need to be calibrated. If $\mathcal{M}3$ or $\mathcal{M}_4$ are chosen $\theta$, $\theta{\delta}$ and $\sigma_{err}^2$ need to be calibrated.

In CaliCo, estimation options have to be defined to run the calibration (opt.estim). It encompasses a list of the options to run the Monte Carlo Markov Chain (MCMC):

Ngibbs the number of Gibbs in the Metropolis within Gibbs algorithm
Nmh the number of Metropolis Hastings in the Metropolis within Gibbs algorithm
thetaInit the starting point
r the tuning vector of k which regulate the covariance matrix in the proposition distribution($k\Sigma$). Each $100$ iteration in the MCMC, $k = k(1-r)$ is realized if the acceptation rate is lower than $20\%$ and $k = k(1+r)$ if the acceptation rate is higher than $50\%$.
sig the covariance matrix in the symetric proposition distribution
Nchains the number of MCMC chains (if Nchains>1 an output called mcmc is a coda object and can be manipulated to run a Gelman Rubin diagnostic for example)
burnIn the burn-in to take of the final sample set

Between $\mathcal{M}_1$, $\mathcal{M}_2$ and $\mathcal{M}_3$ and $\mathcal{M}_4$, opt.estim is not the same because of the size of thetaInit and sig. Indeed, with the discrepancy, the number of parameter to calibrate had increased of two. The parameter r can remain the same between the models.

The function calibrate is the function that run the Bayesian calibration. It takes three inputs:

md the model.class object generated by the function model
pr the prior.class object generated by the function prior
opt.estim the estimation options

mdfit1 <- calibrate(model1,pr1,
                    opt.estim = list(Ngibbs=2000,Nmh=5000,thetaInit=c(11,0.01),
                                     r=c(0.05,0.05),sig=diag(4),Nchains=1,burnIn=2000))

One can easily access to a summary of the results by running a print(mdfit1).

print(mdfit1)

The summary recalls the function used, the selected model but indicates the rates of the algorithms. Then, the Maximum A Posteriori (MAP) and the mean a posteriori are also plotted.

One can also have access to a visual representation of the results proposed by CaliCo. With the plot function, one can display a serie of graphs. Similarly, as the plot function used to display the model, the second argument is required and is the x-axis.

plot(mdfit1,t)

With only one parameter to calibrate in theta, the Warning indicates that no correlation graph is available. As a matter of fact, a second group of graphs in given with the previous ones which is correlation plot between each parameters theta with the corresponding densities a priori and a posteriori. One can control this output with an additionnal option graph:

if graph="chains": only the table of the autocorrelation, chains points and densities a priori and a posteriori is produced.
if graph="corr": only the table of the correlation graph between each parameter is displayed.
if graph="result": only the result on the quantity of interest is given..
if graph=NULL: no graphs are produced automatically.

The serie of graphs is just a display proposition by CaliCo. One can elect the specific graph he want to plot. All the graph are ggplot2 objects and are generated in a list. For example, if one runs p <- plot(mdfit1,t), p is a list containing several elements:

ACF a list of all autocorrelation graphs in the chains for each variable
MCMC a list of all the MCMC chains for each variable
corrplot a list of all correlation graphs between each variable
dens a list of all density a priori and a posteriori graphs for each variable
out the result on the quantity of interest

These list of graphs are also only propositions. One is free to load the generated MCMC chain (without the burn-in sample) and produces the graphical representation he wants. This operation is done by the function chain:

mcmc <- chain(mdfit1)
head(mcmc)

To also quickly get the estimators of the chain, the function estimators return the mean a posteriori and the MAP:

Estimators <- estimators(mdfit1)
print(Estimators)

The calibration procedure is identical for $\mathcal{M}_2$. However, for $\mathcal{M}_3$ and $\mathcal{M}_4$ more parameters are to calibrate. That means the option opt.estim needs to be adapted. For example, in $\mathcal{M}_4$:

mdfit4 <- calibrate(model4,pr2,
                    opt.estim = list(Ngibbs=1000,Nmh=3000,thetaInit=c(11,2,0.5,0.01),
                                      r=c(0.3,0.3),sig=diag(4),Nchains=1,burnIn=1000))

The computing time for calibrating $\mathcal{M}_3$ and $\mathcal{M}_4$ is much longer than calibrating $\mathcal{M}_1$ and $\mathcal{M}_2$. The matrix inversion in the model's likelihood is reponsible of the increase of time. Indeed, from $\mathcal{M}_1$ to $\mathcal{M}_4$ the complexity of the covariance matrix in the likelihood increases.

From this point, when calibration is done, all the functions used above: estimators, chain, plotand print are accessible.

Run Bayesian forecasting

CaliCo also allows the user to access a prediction on a new data set regarding the already run calibration. In the example, let us consider that one is interested in visualizing what happen on $[1,1.5]$ interval. The function forecast simply gets two inputs which are the calibrated object mdfit and the new data set. In the example if x.new = seq(1,1.5,length.out=10) then the argument in the function forecast is x=x.new.

x.new <- seq(1,1.5,length.out=10)
fr <- forecast(mdfit1,x.new)

The function plot gives a visual representation on the predicted values for the chosen statistical model. The x-axis given has to be the original x-axis used for calibration extended with the x-axis corresponding to the new data set.

plot(fr,c(t,x.new))

The blue line corresponds to the predicted values and the credibility intervals at $95\%$ are also given depending on the chosen model.

Conclusion

This vignette illustrates the use of CaliCo in a simple example. Although, for a multidimensional test, the reader can try out the examples given in the help ?model of ?calibrate. The aim of the package is to be adaptive to each industrial situation and makes Bayesian calibration accessible.

References

mathieucarmassi/calibrationcode documentation built on Aug. 14, 2019, 12:35 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

mathieucarmassi/calibrationcode
Code Calibration in a Bayesian Framework

In mathieucarmassi/calibrationcode: Code Calibration in a Bayesian Framework

Define the statistical model

Model 1

Model 2

Numerical code without DOE

Numerical code with DOE

DOE with corresponding outputs

Graphical representation

Model 3 and Model 4

Define the prior distributions

Run Bayesian calibration

Run Bayesian forecasting

Conclusion

References

R Package Documentation

Browse R Packages

We want your feedback!

mathieucarmassi/calibrationcode Code Calibration in a Bayesian Framework

In mathieucarmassi/calibrationcode: Code Calibration in a Bayesian Framework

Define the statistical model

Model 1

Model 2

Numerical code without DOE

Numerical code with DOE

DOE with corresponding outputs

Graphical representation

Model 3 and Model 4

Define the prior distributions

Run Bayesian calibration

Run Bayesian forecasting

Conclusion

References

R Package Documentation

Browse R Packages

We want your feedback!

mathieucarmassi/calibrationcode
Code Calibration in a Bayesian Framework