# exp2d.rand: Random 2-d Exponential Data In tgp: Bayesian Treed Gaussian Process Models

## Description

A random subsample of `data(exp2d)`, or Latin Hypercube sampled data evaluated with `exp2d.Z`.

## Usage

```
exp2d.rand(n1 = 50, n2 = 30, lh = NULL, dopt = 1)
```

## Arguments

- `n1`: Number of samples from the first, interesting, quadrant.
- `n2`: Number of samples from the other three, uninteresting, quadrants.
- `lh`: If `!is.null(lh)` then Latin Hypercube (LH) sampling (`lhs`) is used instead of subsampling from `data(exp2d)`; `lh` should be a single nonnegative integer specifying the desired number of predictive locations, `XX`; or, it should be a vector of length 4, specifying the number of predictive locations desired from each of the four quadrants (interesting quadrant first, then counter-clockwise).
- `dopt`: If `dopt >= 2` then D-optimal subsampling from LH candidates of the multiple indicated by the value of `dopt` will be used. This argument only makes sense when `!is.null(lh)`.

## Details

When `is.null(lh)`, data is subsampled without replacement from `data(exp2d)`. Of the `n1 + n2 <= 441` input/response pairs `X,Z`, `n1` are taken from the first quadrant, i.e., where the response is interesting, and the remaining `n2` are taken from the other three quadrants. The remaining `441 - (n1 + n2)` pairs are treated as predictive locations.
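The split described above can be sketched in base R without loading the package. This is only an illustration, not the package's implementation: the 21 x 21 grid on [-2, 6] x [-2, 6] is assumed here as a stand-in for the 441 rows of `data(exp2d)`, and the "first quadrant" is assumed to be the region where both coordinates are at most 2 (the lines drawn in the Examples section suggest this boundary).

```r
set.seed(1)

# Assumption: a 21 x 21 grid stands in for the 441 rows of data(exp2d)
g <- round(seq(-2, 6, length.out = 21), 1)
grid <- expand.grid(X1 = g, X2 = g)

# Assumption: the interesting (first) quadrant has both coordinates <= 2
interesting <- grid$X1 <= 2 & grid$X2 <= 2

n1 <- 50
n2 <- 30

# sample() draws without replacement by default
idx1 <- sample(which(interesting), n1)
idx2 <- sample(which(!interesting), n2)
train <- c(idx1, idx2)

X  <- grid[train, ]   # n1 + n2 input locations
XX <- grid[-train, ]  # remaining 441 - (n1 + n2) predictive locations
```

With the defaults `n1 = 50` and `n2 = 30`, this leaves `441 - 80 = 361` rows in `XX`.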

When `!is.null(lh)`, Latin Hypercube Sampling (`lhs`) is used instead.

If `dopt >= 2` then `n1*dopt` LH candidates are used to obtain a D-optimal subsample of size `n1` from the first (interesting) quadrant. Similarly, `n2*dopt` candidates are used for the subsample of size `n2` from the remaining, uninteresting region. A total of `lh*dopt` candidates will be used for sequential D-optimal subsampling for predictive locations `XX` in all four quadrants, assuming the already-sampled `X` locations will be in the design.
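As a quick check of the arithmetic, the candidate pool sizes implied by the paragraph above can be tabulated in base R. The values below are arbitrary illustrations, and a scalar `lh` is assumed; how a length-4 `lh` is split across quadrants is not covered by this sketch.

```r
# Illustrative values only; lh is assumed to be a single integer here
n1   <- 50
n2   <- 30
lh   <- 100
dopt <- 10

# Number of LH candidates generated for each D-optimal subsample
candidates <- c(first_quadrant  = n1 * dopt,
                other_quadrants = n2 * dopt,
                predictive      = lh * dopt)

# Number of locations retained from each candidate pool
retained <- c(first_quadrant  = n1,
              other_quadrants = n2,
              predictive      = lh)

print(rbind(candidates, retained))
```

Each pool is `dopt` times larger than the subsample drawn from it, so larger values of `dopt` give the D-optimal search more candidates at a higher computational cost.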

In all three cases, the response is evaluated as

Z(X) = X1 * exp(-X1^2-X2^2),

thus creating the outputs `Ztrue` and `ZZtrue`. Zero-mean normal noise with `sd=0.001` is added to obtain the responses `Z` and `ZZ`.
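The response formula and noise model can be reproduced in a few lines of base R. The helper `exp2d_response` below is a hypothetical stand-in written for illustration; it is not the package's `exp2d.Z`, whose interface may differ.

```r
# Hypothetical helper (not part of tgp): evaluates the formula above
# and adds zero-mean normal noise with the stated standard deviation
exp2d_response <- function(X, sd = 0.001) {
  X <- as.matrix(X)
  Ztrue <- X[, 1] * exp(-X[, 1]^2 - X[, 2]^2)
  Z <- Ztrue + rnorm(length(Ztrue), mean = 0, sd = sd)
  list(Z = Z, Ztrue = Ztrue)
}

set.seed(42)
X <- rbind(c(0, 0), c(1, 0), c(-1, 1))
out <- exp2d_response(X)

# Noise-free values: 0, exp(-1), -exp(-2)
print(out$Ztrue)
```

Because the exponential term decays quickly, the response is essentially zero away from the origin, which is why only one quadrant of the input space is described as interesting.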

## Value

Output is a `list` with entries:

- `X`: 2-d `data.frame` with `n1 + n2` input locations.
- `Z`: Numeric vector describing the responses (with noise) at the `X` input locations.
- `Ztrue`: Numeric vector describing the true responses (without noise) at the `X` input locations.
- `XX`: 2-d `data.frame` containing the remaining `441 - (n1 + n2)` input locations.
- `ZZ`: Numeric vector describing the responses (with noise) at the `XX` predictive locations.
- `ZZtrue`: Numeric vector describing the responses (without noise) at the `XX` predictive locations.

## Author(s)

Robert B. Gramacy, rbg@vt.edu, and Matt Taddy, mataddy@amazon.com

## References

Gramacy, R. B. (2007). tgp: An R Package for Bayesian Nonstationary, Semiparametric Nonlinear Regression and Design by Treed Gaussian Process Models. Journal of Statistical Software, 19(9). https://www.jstatsoft.org/v19/i09

Gramacy, R. B., Lee, H. K. H. (2008). Bayesian treed Gaussian process models with an application to computer modeling. Journal of the American Statistical Association, 103(483), pp. 1119-1130. Also available as ArXiv article 0710.4536 https://arxiv.org/abs/0710.4536

## See Also

`lhs`, `exp2d`, `exp2d.Z`, `btgp`, and other `b*` functions

## Examples

```
library(tgp)

## randomly subsampled data
## ------------------------

eds <- exp2d.rand()

# higher span = 0.5 required because the data is sparse
# and was generated randomly
eds.g <- interp.loess(eds$X[,1], eds$X[,2], eds$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(eds.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(eds$X, main="Randomly Subsampled Inputs")
points(eds$XX, pch=19, cex=0.5)

## Latin Hypercube sampled data
## ----------------------------

edlh <- exp2d.rand(lh=c(20, 15, 10, 5))

# higher span = 0.5 required because the data is sparse
# and was generated randomly
edlh.g <- interp.loess(edlh$X[,1], edlh$X[,2], edlh$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(edlh.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(edlh$X, main="Latin Hypercube Sampled Inputs")
points(edlh$XX, pch=19, cex=0.5)

# show the quadrants
abline(h=2, col=2, lty=2, lwd=2)
abline(v=2, col=2, lty=2, lwd=2)

## Not run: 
## D-optimal subsample with a factor of 10 (more) candidates
## ---------------------------------------------------------

edlhd <- exp2d.rand(lh=c(20, 15, 10, 5), dopt=10)

# higher span = 0.5 required because the data is sparse
# and was generated randomly
edlhd.g <- interp.loess(edlhd$X[,1], edlhd$X[,2], edlhd$Z, span=0.5)

# perspective plot, and plot of the input (X & XX) locations
par(mfrow=c(1,2), bty="n")
persp(edlhd.g, main="loess surface", theta=-30, phi=20,
      xlab="X[,1]", ylab="X[,2]", zlab="Z")
plot(edlhd$X, main="D-optimally Sampled Inputs")
points(edlhd$XX, pch=19, cex=0.5)

# show the quadrants
abline(h=2, col=2, lty=2, lwd=2)
abline(v=2, col=2, lty=2, lwd=2)

## End(Not run)
```

tgp documentation built on Jan. 13, 2021, 3:49 p.m.