Bayesian Nonparametric \& Nonstationary Regression Models
Description
The seven functions described below implement Bayesian regression models of varying complexity: linear model, linear CART, Gaussian process (GP), GP with jumps to the limiting linear model (LLM), treed GP, and treed GP LLM.
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35  blm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
BTE = c(1000, 4000, 3), R = 1, m0r1 = TRUE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv = FALSE, sens.p = NULL, trace = FALSE, verb = 1, ...)
btlm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
tree = c(0.5, 2), BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, trace = FALSE,
verb = 1, ...)
bcart(X, Z, XX = NULL, bprior = "bflat", tree = c(0.5, 2),
BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv=FALSE, sens.p = NULL, trace = FALSE, verb = 1, ...)
bgp(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", BTE = c(1000, 4000, 2), R = 1, m0r1 = TRUE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, nu = 1.5,
trace = FALSE, verb = 1, ...)
bgpllm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", gamma=c(10,0.2,0.7), BTE = c(1000, 4000, 2),
R = 1, m0r1 = TRUE, itemps = NULL, pred.n = TRUE,
krige = TRUE, zcov = FALSE, Ds2x = FALSE, improv = FALSE,
sens.p = NULL, nu = 1.5, trace = FALSE, verb = 1, ...)
btgp(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", tree = c(0.5, 2), BTE = c(2000, 7000, 2),
R = 1, m0r1 = TRUE, linburn = FALSE, itemps = NULL,
pred.n = TRUE, krige = TRUE, zcov = FALSE, Ds2x = FALSE,
improv = FALSE, sens.p = NULL, nu = 1.5, trace = FALSE,
verb = 1, ...)
btgpllm(X, Z, XX = NULL, meanfn = "linear", bprior = "bflat",
corr = "expsep", tree = c(0.5, 2), gamma=c(10,0.2,0.7),
BTE = c(2000, 7000, 2), R = 1, m0r1 = TRUE, linburn = FALSE,
itemps = NULL, pred.n = TRUE, krige = TRUE, zcov = FALSE,
Ds2x = FALSE, improv = FALSE, sens.p = NULL, nu = 1.5,
trace = FALSE, verb = 1, ...)

Arguments
Each of the above functions takes some subset of the following arguments...
X 

Z 
Vector of output responses 
XX 
Optional 
meanfn 
A choice of mean function for the process. When
Z = cbind(rep(1,nrow(X), X)) %*% beta + W(X), where W(X) represents the Gaussian process
part of the model (if present). Otherwise, when
Z = beta0 + W(X). 
bprior 
Linear (beta) prior, default is 
tree 
a 2vector containing the tree process prior parameterization
p(split leaf eta) = alpha*(1+depth(eta))^(beta) automatically giving zero probability to trees
with partitions containing less than 
gamma 
Limiting linear model parameters p(bd)= t1 + exp(g*(t2t1)/(d0.5)) 
corr 
Gaussian process correlation model. Choose between the isotropic
power exponential family ( 
BTE 
3vector of Montecarlo parameters (B)urn in, (T)otal, and (E)very. Predictive samples are saved every E MCMC rounds starting at round B, stopping at T. 
R 
Number of repeats or restarts of 
m0r1 
If 
linburn 
If 
itemps 
Importance tempering (IT) inverse temperature ladder,
or powers to improve mixing. See 
pred.n 

krige 

zcov 
If 
Ds2x 

improv 

sens.p 
Either 
nu 
“beta” functionality: fixed smoothness parameter for
the Matern correlation function; 
trace 

verb 
Level of verbosity of Rconsole print statements: from 0 (none); 1 (default) which shows the “progress meter”; 2 includes an echo of initialization parameters; up to 3 and 4 (max) with more info about successful tree operations 
... 
These ellipses arguments are interpreted as augmentations to the prior specification generated by
You may use these to specify a custom setting of any of default
parameters in the output list 
Details
The functions and their arguments can be categorized by whether or not they use treed partitioning (T), GP models, and jumps to the LLM (or LM)
blm  LM  Linear Model 
btlm  T, LM  Treed Linear Model 
bcart  T  Treed Constant Model 
bgp  GP  GP Regression 
bgpllm  GP, LLM  GP with jumps to the LLM 
btgp  T, GP  treed GP Regression 
btgpllm  T, GP, LLM  treed GP with jumps to the LLM 
Each function implements a special case of the generic function
tgp
which is an interface to C/C++ code for treed Gaussian process
modeling of varying parameterization. Documentation for tgp
has been declared redundant, and has subsequently been removed. To see
how the b*
functions use tgp
simply examine the
function. In the latest version, with the addition of the ellipses
“...” argument, there is nothing that can be done
with the direct tgp
function that cannot also be done with a
b*
function
Only functions in the T (tree) category take the tree
argument;
GP category functions take the corr
argument; and LLM category
functions take the gamma
argument. Nontree class functions omit
the parts
output, see below
bcart
is the same as btlm
except that only the
intercept term in the LM is estimated; the others are zero, thereby
implementing a Bayesian version of the original CART model
The sens.p
argument contains a vector of parameters for
sensitivity analysis. It should be NULL
unless created by the
sens
function. Refer to help(sens)
for details.
If itemps =! NULL
then importance tempering (IT) is performed
to get better mixing. After each restart (when R > 1
) the
observation counts are used to update the pseudoprior. Stochastic
approximation is performed in the first burnin rounds (for BT
rounds, not B
) when c0
and n0
are positive.
Every subsequent burnin after the first restart is for B
rounds in order to settlein after using the observation counts. See
default.itemps
for more details and an example
Please see vignette("tgp")
for a detailed illustration
Value
bgp
returns an object of class "tgp"
.
The function plot.tgp
can be used to help visualize results.
An object of class "tgp"
is a list containing at least the
following components... The parts
output is unique to the T
(tree) category functions. Tree viewing is supported by
tgp.trees
X 
Input argument: 
n 
Number of rows in 
d 
Number of cols in 
Z 
Vector of output responses 
XX 
Input argument: 
nn 
Number of rows in 
BTE 
Input argument: Montecarlo parameters 
R 
Input argument: restarts 
linburn 
Input argument: initialize MCMC with linear CART 
params 

dparams 
Doublerepresentation of model input parameters used by the Ccode 
itemps 

Zp.mean 
Vector of mean predictive estimates at 
Zp.q1 
Vector of 5% predictive quantiles at 
Zp.q2 
Vector of 95% predictive quantiles at 
Zp.q 
Vector of quantile norms 
Zp.s2 
If input 
Zp.km 
Vector of (expected) kriging means at 
Zp.vark 
Vector of posterior variance for kriging surface (no additive noise) at 
Zp.ks2 
Vector of (expected) predictive kriging variances at 
ZZ.mean 
Vector of mean predictive estimates at 
ZZ.q1 
Vector of 5% predictive quantiles at 
ZZ.q2 
Vector of 95% predictive quantiles at 
ZZ.q 
Vector of quantile norms 
ZZ.s2 
If input 
ZpZZ.s2 
If input 
ZZ.km 
Vector of (expected) kriging means at 
ZZ.vark 
Vector of posterior variance for kriging surface (no additive noise) at 
ZZ.ks2 
Vector of (expected) predictive kriging variances at 
Ds2x 
If argument 
improv 
If argument 
response 
Name of response 
parts 
Internal representation of the regions depicted by partitions of the maximum a' posteriori (MAP) tree 
trees 

trace 
If 
ess 
Importance tempering effective sample size (ESS). If
Otherwise the ESS will be lower due to a nonzero coefficient of variation of the calculated importance tempering weights 
sens 
See 
Note
Inputs X, XX, Z
containing NaN, NA
, or Inf
are
discarded with nonfatal warnings
Upon execution, MCMC reports are made every 1,000 rounds to indicate progress
Stationary (nontreed) processes on larger inputs (e.g., X,Z
)
of size greater than 500, *might* be slow in execution, especially on
older machines. Once the C code starts executing, it can be interrupted
in the usual way: either via CtrlC (Unixalikes) or pressing the Stop
button in the RGUI. When this happens, interrupt messages will
indicate which required cleanup measures completed before returning
control to R.
Whereas most of the tgp models will work reasonably well with
little or no change to the default prior specification, GP's with the
"mrexpsep"
correlation imply a very specific relationship between
fine and coarse data, and a careful prior specification is usually
required.
The ranks provided in the second column of the improv
field
of a tgp
object are based on the expectation of a multivariate
improvement that may or may not be raised to a positive integer power.
They can thus differ significantly from a simple ranking of the first
column of expected univariate improvement values.
Regarding trace=TRUE
: Samples from the posterior will be
collected for all parameters in the model. GP parameters are collected
with reference to the locations in XX
, resulting
nn=nrow{XX}
traces of d,g,s2,tau2
, etc. Therefore, it
is recommended that nn
is chosen to be a small, representative,
set of input locations. Besides GP parameters, traces are saved for
the tree partitions, areas under the LLM, log posterior (as a function
of tree height), and samples from the posterior predictive
distributions. Note that since some traces are stored in
files, multiple tgp
/R sessions in the same working
directory can clobber the trace files of other sessions
Author(s)
Robert B. Gramacy, rbgramacy@chicagobooth.edu, and Matt Taddy, taddy@chicagobooth.edu
References
Gramacy, R. B. (2007). tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models. Journal of Statistical Software, 19(9). http://www.jstatsoft.org/v19/i09
Robert B. Gramacy, Matthew Taddy (2010). Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models. Journal of Statistical Software, 33(6), 1–48. http://www.jstatsoft.org/v33/i06/.
Gramacy, R. B., Lee, H. K. H. (2007). Bayesian treed Gaussian process models with an application to computer modeling Journal of the American Statistical Association, to appear. Also available as ArXiv article 0710.4536 http://arxiv.org/abs/0710.4536
Gramacy, R. B. and Lee, K.H. (2008). Gaussian Processes and Limiting Linear Models. Computational Statistics and Data Analysis, 53, pp. 123136. Also available as ArXiv article 0804.4685 http://arxiv.org/abs/0804.4685
Gramacy, R. B., Lee, H. K. H. (2009). Adaptive design and analysis of supercomputer experiments. Technometrics, to appear. Also avaliable on ArXiv article 0805.4359 http://arxiv.org/abs/0805.4359
Robert B. Gramacy, Heng Lian (2011). Gaussian process singleindex models as emulators for computer experiments. Available as ArXiv article 1009.4241 http://arxiv.org/abs/1009.4241
Chipman, H., George, E., \& McCulloch, R. (1998). Bayesian CART model search (with discussion). Journal of the American Statistical Association, 93, 935–960.
Chipman, H., George, E., \& McCulloch, R. (2002). Bayesian treed models. Machine Learning, 48, 303–324.
M. Schonlau and Jones, D.R. and Welch, W.J. (1998). Global versus local search in constrained optimization of computer models. In "New Developments and applications in experimental design", IMS Lecture Notes  Monograph Series 34. 11–25.
http://bobby.gramacy.com/r_packages/tgp
See Also
plot.tgp
, tgp.trees
,
predict.tgp
, sens
, default.itemps
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77  ##
## Many of the examples below illustrate the above
## function(s) on random data. Thus it can be fun
## (and informative) to run them several times.
##
#
# simple linear response
#
# input and predictive data
X < seq(0,1,length=50)
XX < seq(0,1,length=99)
Z < 1 + 2*X + rnorm(length(X),sd=0.25)
out < blm(X=X, Z=Z, XX=XX) # try Linear Model
plot(out) # plot the surface
#
# 1d Example
#
# construct some 1d nonstationary data
X < seq(0,20,length=100)
XX < seq(0,20,length=99)
Z < (sin(pi*X/5) + 0.2*cos(4*pi*X/5)) * (X <= 9.6)
lin < X>9.6;
Z[lin] < 1 + X[lin]/10
Z < Z + rnorm(length(Z), sd=0.1)
out < btlm(X=X, Z=Z, XX=XX) # try Linear CART
plot(out) # plot the surface
tgp.trees(out) # plot the MAP trees
out < btgp(X=X, Z=Z, XX=XX) # use a treed GP
plot(out) # plot the surface
tgp.trees(out) # plot the MAP trees
#
# 2d example
# (using the isotropic correlation function)
#
# construct some 2d nonstationary data
exp2d.data < exp2d.rand()
X < exp2d.data$X; Z < exp2d.data$Z
XX < exp2d.data$XX
# try a GP
out < bgp(X=X, Z=Z, XX=XX, corr="exp")
plot(out) # plot the surface
# try a treed GP LLM
out < btgpllm(X=X, Z=Z, XX=XX, corr="exp")
plot(out) # plot the surface
tgp.trees(out) # plot the MAP trees
#
# Motorcycle Accident Data
#
# get the data
require(MASS)
# try a GP
out < bgp(X=mcycle[,1], Z=mcycle[,2])
plot(out) # plot the surface
# try a treed GP LLM
# best to use the "b0" beta linear prior to capture common
# common linear process throughout all regions (using the
# ellipses "...")
out < btgpllm(X=mcycle[,1], Z=mcycle[,2], bprior="b0")
plot(out) # plot the surface
tgp.trees(out) # plot the MAP trees
