optIP.wIMSE: Sequential Selection of an Inducing Point Design by...

Description Usage Arguments Details Value Author(s) References Examples

View source: R/wimse.R

Description

Optimizes the weighted integrated mean-square error (wIMSE) surface to sequentially select inducing points for a given predictive location and local neighborhood.

Usage

1
2
3
4
optIP.wIMSE(Xn, M, theta = NULL, g = 1e-4, w_mean, w_var = NULL,
            ip_bounds = NULL, integral_bounds = NULL, num_multistart = 15,
            fix_xm1 = TRUE, epsK = sqrt(.Machine$double.eps), epsQ = 1e-5,
            mult = NULL, verbose = TRUE)

Arguments

Xn

a matrix of the local neighborhood; nrow(Xn)=N

M

the positive integer number of inducing points placed for each local neighborhood; M should be less than N

theta

the lengthscale parameter (positive number) in a Gaussian correlation function; a (default) NULL value sets the lengthscale at the square of the 10th percentile of pairwise distances between neighborhood points (see darg in laGP package)

g

the nugget parameter (positive number) in the covariance

w_mean

a vector of the mean (center) of the Gaussian weight; length(w_mean) should equal ncol(Xn)

w_var

a positive number or vector of positive numbers (length equal to ncol(Xn)) denoting the variance(s) in the Gaussian weight. If NULL (default), theta is used.

ip_bounds

a 2 by d matrix containing the range of possible values for inducing points; first row contains minimum values for each dimension, second row contains maximum values; if ip_bounds is NULL, defaults to range of the local neighborhood Xn

integral_bounds

a 2 by d matrix containing the domain bounds for the data; first row contains minimum values for each dimension, second row contains maximum values; if integral_bounds=NULL, defaults to range of the local neighborhood Xn

num_multistart

a positive integer indicating the number of starting locations used to optimize wIMSE for each inducing point

fix_xm1

an indicator of whether or not the first inducing point should be fixed at w_mean (TRUE, default) or optimized (FALSE)

epsK

a small positive number added to the diagonal of the correlation matrix, of inducing points, K, for numerically stability for inversion

epsQ

a small positive number added to the diagonal of the Q matrix (see Cole (2021)) for numerically stability for inversion

mult

a vector of length nrow(X) that contains the number of replicates for each design location in X

verbose

if TRUE, outputs the progress of the number of inducing points optimally placed

Details

The function sequentially places M inducing points around the local neighborhood (Xn). Inducing points are placed based on the minimum in the wIMSE surface integrating over integral_bounds. Hyperparameters theta,g are fixed.

Value

The output is a list with the following components:

Xm

a matrix of the locally optimal inducing point locations

wimse

a vector of the wIMSE progress at each inducing point selection step

time

a scalar giving the passage of wall-clock time elapsed for (substantive parts of) the calculation

Author(s)

D. Austin Cole austin.cole8@vt.edu

References

D.A. Cole, R.B. Christianson, and R.B. Gramacy (2021). Locally Induced Gaussian Processes for Large-Scale Simulation Experiments Statistics and Computing, 31(3), 1-21; preprint on arXiv:2008.12857; https://arxiv.org/abs/2008.12857

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
## a "computer experiment"

## Simple 2-d Herbie's Tooth function used in Cole et al (2020);
## thanks to Lee, Gramacy, Taddy, and others who have used it before

## Build up a design with N=~40K locations
x <- seq(-2, 2, by=0.02)
X <- as.matrix(expand.grid(x, x))
Y <- herbtooth(X)


library(laGP)

## Build a local neighborhood
Xstar <- matrix(c(.4, -1.1), nrow=1)
n <- 100; m <- 10
Xstar_to_X_dists <- distance(Xstar, X)
quant_dists <- quantile(Xstar_to_X_dists, n/nrow(X))
Xn <- X[Xstar_to_X_dists < quant_dists,]
Yn <- Y[Xstar_to_X_dists < quant_dists]

theta <- darg(NULL, Xn)$start
integral_bounds <- rbind(c(-2,-2), c(2,2))

## Optimize inducing point locations
Xm.wimse1 <- optIP.wIMSE(Xn, M=m, Xn=, theta=theta, w_mean=Xstar,
                        integral_bounds=integral_bounds)

## Use a different variance for weight
Xm.wimse2 <- optIP.wIMSE(Xn, M=m, Xn=, theta=theta, w_mean=Xstar,
                        w_var = c(theta/2, 3*theta),
                        integral_bounds=integral_bounds)

## Plot inducing point design and neighborhood
ranges <- apply(Xn, 2, range)
plot(Xn, pch = 16, cex=.5, col='grey',
     xlim=ranges[,1]+c(-.02, .02), ylim=ranges[,2]+c(-.02, .15),
     xlab='x1', ylab = 'x2',
     main='ALC-optimal Inducing Point Design')
points(Xstar[1], Xstar[2], pch=16)
points(Xm.wimse1$Xm, pch=2, col=3, lwd=2)
points(Xm.wimse2$Xm, pch=3, col=4, lwd=2)
legend('topleft', legend=c('Xstar','Neighborhood','Xm with w_var=theta',
                           'Xm with nonisotropic weight'),
       col=c(1,'grey',3,4), pch=c(16,16,2,3), lty=NA, lwd=2, ncol=2)

## View wIMSE progress
plot(1:m, log(Xm.wimse1$wimse), type='l', xlab='inducing point number',
     ylab='log wIMSE',main='wIMSE optimization progress')

liGP documentation built on July 17, 2021, 9:08 a.m.

Related to optIP.wIMSE in liGP...