rcalibration_MS: Setting up the robust Calibration model for multiple sources...

View source: R/rcalibration_MS.R

rcalibration_MSR Documentation

Setting up the robust Calibration model for multiple sources data

Description

Setting up the Calibration model for estimating the parameters via MCMC for multiple sources.

Usage

  rcalibration_MS(design, observations, p_theta=NULL, index_theta=NULL,
                  X=as.list(rep(0,length(design))), 
                  have_trend=rep(FALSE,length(design)),
                  simul_type=rep(1, length(design)),
                  input_simul=NULL, output_simul=NULL,
                  simul_nug=rep(FALSE,length(design)),loc_index_emulator=NULL,
                  math_model=NULL,
                  theta_range=NULL, sd_proposal_theta=rep(0.05,p_theta),
                  sd_proposal_cov_par=NULL,
                  S=10000,S_0=2000, thinning=1,measurement_bias=FALSE, 
                  shared_design=NULL,have_measurement_bias_recorded=F,
                            shared_X=0,have_shared_trend=FALSE,
                  discrepancy_type=rep('S-GaSP',length(design)+measurement_bias),
                  kernel_type=rep('matern_5_2',length(design)+measurement_bias),
                  lambda_z=as.list(rep(NA,length(design)+measurement_bias)),
                  a=NULL,b=NULL,alpha=NULL,
                  output_weights=NULL,...)
 
      

Arguments

design

a list of observed inputs from multiple sources. Each element of the list is a matrix of observable inputs, where each row is a row vector of observable inputs corresponding to one observation and the number of field or experimental data is the total number of rows.

observations

a list of experimental data from multiple sources. Each element is a vector of observations.

index_theta

a list of vectors for the index of calibration parameter contained in each source.

p_theta

an integer about the number of parameters, which should be specified by the user.

X

a list of matrices of the mean/trend discrepancy between the reality and math model for multiple sources.

have_trend

a vector of bool value meaning whether we assume a mean/trend discrepancy function.

simul_type

a vector of integer about the math model/simulator for multiple sources. If the simul_type is 0, it means we use RobustGaSP R package to build an emulator for emulation. If the simul_type is 1, it means the function of the math model is given by the user. When simul_type is 2 or 3, the mathematical model is the geophyiscal model for Kilauea Volcano. If the simul_type is 2, it means it is for the ascending mode InSAR data; if the simul_type is 3, it means it is for the descending mode InSAR data.

input_simul

a list of matices, each having dimension D x (p_x+p_theta) being the design for emulating the math model. It is only useful if the ith value of simul_type is 0 for the ith source, meaning that we emulate the output of the math model.

output_simul

a list of vectors, each having dimension D x 1 being the math model outputs on the design (input_simul). It is only useful if the ith value of simul_type is 0 for the ith source, meaning that we emulate the output of the math model.

simul_nug

a vectors of bool values meaning whether we have a nugget for emulating the math model/simulator for this source. If the math model is stochastic, we often need a nugget. If simul_Nug is TRUE, it means we have a nugget for the emulator. If simul_Nug is FALSE, it means we do not have a nugget for the emulator.

loc_index_emulator

a list for location index to output in the ppgasp emulator for computer models with vectorized output.

math_model

a list of functions of the math models provided by the user for multiple sources. It is only useful if simul_type is 1, meaning that we know the math model and it can be computed fast. If the math model is computationally slow, one should set simul_type to be 0 to emulate the math model. If defined, each element of the list is a function of math models, where the first input of the function is a vector of observable inputs and the second input is a vector of calibration parameters. The output of each function is a scalar. Each function corresponds to one source of data.

theta_range

a p_theta x 2 matrix of the range of the calibration parameters. The first column is the lower bound and the second column is the upper bound. It should be specified by the user if the simul_type is 0.

sd_proposal_theta

a vector of the standard deviation of the proposal distribution for the calibration parameters in MCMC.

sd_proposal_cov_par

a list of vectors of the standard deviation of the proposal distribution for range and nugget parameters in MCMC for each source.

S

an integer about about how many posterior samples to run.

S_0

an integer about about the number of burn-in samples.

thinning

the ratio between the number of posterior samples and the number of recorded samples.

measurement_bias

containing measurement bias or not.

shared_design

A matrix for shared design across different sources of data used when measurement bias exists.

have_measurement_bias_recorded

A bool value whether we record measurement bias or not.

shared_X

A matrix of shared trend when measurement bias exists.

have_shared_trend

A bool value whether we have shared trend when measurement bias exist.

discrepancy_type

a vector of characters about the type of the discrepancy for each source. If it is 'no-discrepancy', it means no discrepancy function. If it is 'GaSP', it means the GaSP model for the discrepancy function. If it is 'S-GaSP', it means the S-GaSP model for the discrepancy function.

kernel_type

a vector of characters about the type of kernel for each data source. matern_3_2 and matern_5_2 are Matern kernel with roughness parameter 3/2 and 5/2 respectively. pow_exp is power exponential kernel with roughness parameter alpha. If pow_exp is to be used, one needs to specify its roughness parameter alpha.

lambda_z

a vector numeric values about how close the math model to the reality in squared distance when the S-GaSP model is used for modeling the discrepancy for each source.

a

a vector of the prior parameter for multiple sources.

b

a vector of the prior parameter for multiple sources.

alpha

a list of vectors of roughness parameters in the kernel for multiple sources.

output_weights

a list of vectors of the weights of the outputs for multiple sources.

...

Extra arguments to be passed to the function (not implemented yet).

Value

rcalibration_MS returns an S4 object of class rcalibration_MS (see rcalibration_MS-class).

Author(s)

Mengyang Gu [aut, cre]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

A. O'Hagan and M. C. Kennedy (2001), Bayesian calibration of computer models, Journal of the Royal Statistical Society: Series B (Statistical Methodology, 63, 425-464.

K. R. Anderson and M. P. Poland (2016), Bayesian estimation of magma supply, storage, and eroption rates using a multiphysical volcano model: Kilauea volcano, 2000-2012.. Eath and Planetary Science Letters, 447, 161-171.

K. R. Anderson and M. P. Poland (2017), Abundant carbon in the mantle beneath Hawaii. Nature Geoscience, 10, 704-708.

M. Gu (2016), Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output, Ph.D. thesis., Duke University.

M. Gu and L. Wang (2017) Scaled Gaussian Stochastic Process for Computer Model Calibration and Prediction. arXiv preprint arXiv:1707.08215.

M. Gu (2018) Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection . arXiv preprint arXiv:1804.09329.

Examples

#------------------------------------------------------------------------------
# An example for calibrating mathematical models for data from multiple sources
#------------------------------------------------------------------------------

library(RobustCalibration)


##reality
test_funct<-function(x){
  sin(pi*x/2)+2*cos(pi*x/2)
}


##math model from two sources
math_model_source_1<-function(x,theta){
  sin(theta*x) 
}

math_model_source_2<-function(x,theta){
  cos(theta*x) 
}

input1=seq(0,2,2/(10-1))
input2=seq(0,3,3/(15-1))
##
output1=test_funct(input1)+rnorm(length(input1), sd=0.01)
output2=test_funct(input2)+rnorm(length(input2), sd=0.02)

plot(input1, output1)
plot(input2, output2)



design=list()
design[[1]]=as.matrix(input1)
design[[2]]=as.matrix(input2)

observations=list()
observations[[1]]=output1
observations[[2]]=output2


p_theta=1


theta_range=matrix(0,p_theta,2)
theta_range[1,]=c(0, 8)  
simul_type=c(1,1)

math_model=list()

math_model[[1]]=math_model_source_1
math_model[[2]]=math_model_source_2


## calibrating two mathematical models for these two sources
model_sgasp=rcalibration_MS(design=design, observations=observations, p_theta=1,
                            simul_type=simul_type,math_model=math_model,
                            theta_range=theta_range, 
                            S=10000,S_0=2000,
                            discrepancy_type=rep('S-GaSP',length(design)))

plot(model_sgasp@post_theta[,1],type='l')
mean(model_sgasp@post_theta[,1])



RobustCalibration documentation built on Sept. 8, 2023, 5:23 p.m.