randLine: Generate two-dimensional random paths
In laGP: Local Approximate Gaussian Process Regression

randLine

R Documentation

Generate two-dimensional random paths

Description

Generate two-dimensional random paths (one-dimensional manifolds in 2d) comprising of different randomly chosen line types: linear, quadratic, cubic, exponential, and natural logarithm. If the input dimensionality is higher than 2, then a line in two randomly chosen input coordinates is generated

Usage

  randLine(a, D, N, smin, res)

Arguments

`a`	a fixed two-element vector denoting the range of the bounding box (lower bound and upper bound) of all input coordinates
`D`	a scalar denoting the dimensionality of input space
`N`	a scalar denoting the desired total number of random lines
`smin`	a scalar denoting the minimum absolute scaling constant, i.e., the length of the shortest line that could be generated
`res`	a scalar denoting the number of data points, i.e., the resolution on the random path

Details

This two-dimensional random line generating function produces different types of 2d random paths, including linear, quadratic, cubic, exponential, and natural logarithm.

First, one of these line types is chosen uniformly at random. The line is then drawn, via a collection of discrete points, from the origin according to the arguments, e.g., resolution and length, provided by the user. The discrete set of coordinates are then shifted and scaled, uniformly at random, into the specified 2d rectangle, e.g., [-2,2]^2, with the restriction that at least half of the points comprising the line lie within the rectangle.

For a quick visualization, see Figure 15 in Sun, et al. (2017). Figure 7 in the same manuscript illustrates the application of this function in out-of-sample prediction using laGPsep, in 2d and 4d, respectively.

randLine returns different types of random paths and the indices of the randomly selected pair, i.e., subset, of input coordinates (when D > 2).

Value

randLine returns a list of lists. The outer list is of length six, representing each of the five possible line types (linear, quadratic, cubic, exponential, and natural logarithm), with the sixth entry providing the randomly chosen input dimensions.

The inner lists are comprised of res \times 2 data.frames, the number of which span N samples across all inner lists.

Note

Users should scale each coordinate of global input space to the same coded range, e.g., [-2,2]^D, in order to avoid computational burden caused by passing global input space argument. Users may convert back to the natural units when necessary.

Author(s)

Furong Sun furongs@vt.edu and Robert B. Gramacy rbg@vt.edu

References

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; https://arxiv.org/abs/1712.00182

Examples

## 1. visualization of the randomly generated paths

## generate the paths
D <- 4
a <- c(-2, 2)
N <- 30
smin <- 0.1
res <- 100
line.set <- randLine(a=a, D=D, N=N, smin=smin, res=res)
  
## the indices of the randomly selected pair of input coordinates
d <- line.set$d

## visualization

## first create an empty plot
par(mar=c(5, 4, 6, 2) + 0.1)
plot(0, xlim=a, ylim=a, type="l", xlab=paste("factor ", d[1], sep=""), 
     ylab=paste("factor ", d[2], sep=""), main="2d random paths", 
     cex.lab=1.5, cex.main=2)
abline(h=(a[1]+a[2])/2, v=(a[1]+a[2])/2, lty=2)

## merge each path type together
W <- unlist(list(line.set$lin, line.set$qua, line.set$cub, line.set$ep, line.set$ln), 
  recursive=FALSE)

## calculate colors to retain 
n <- unlist(lapply(line.set, length)[-6])
cols <- rep(c("orange", "blue", "forestgreen", "magenta", "cornflowerblue"), n)

## plot randomly generated paths with a centering dot in red at the midway point
for(i in 1:N){
  lines(W[[i]][,1], W[[i]][,2], col=cols[i])
  points(W[[i]][res/2,1], W[[i]][res/2,2], col=2, pch=20)
}

## add legend
legend("top", legend=c("lin", "qua", "cub", "exp", "log"), cex=1.5, bty="n",
       xpd=TRUE, horiz=TRUE, inset=c(0, -0.085), lty=rep(1, 5), lwd=rep(1, 5),
       col=c("orange", "blue", "forestgreen", "magenta", "cornflowerblue"))

## 2. use the random paths for out-of-sample prediction via laGPsep

## test function (same 2d function as in other examples package)
## (ignoring 4d nature of path generation above)
f2d <- function(x, y=NULL){
  if(is.null(y)){
     if(!is.matrix(x) && !is.data.frame(x)) x <- matrix(x, ncol=2)
     y <- x[,2]; x <- x[,1]
  }
  g <- function(z)
  return(exp(-(z-1)^2) + exp(-0.8*(z+1)^2) - 0.05*sin(8*(z+0.1)))
  z <- -g(x)*g(y)
}
    
## generate training data using 2d input space
x <- seq(a[1], a[2], by=0.02)
X <- as.matrix(expand.grid(x, x))
Y <- f2d(X)

## example of joint path calculation folowed by RMSE calculation
## on the first random path
WW <- W[[sample(1:N, 1)]]
WY <- f2d(WW)

## exhaustive search via ``joint" ALC
j.exh <- laGPsep(WW, 6, 100, X, Y, method="alcopt", close=10000, lite=FALSE)
sqrt(mean((WY - j.exh$mean)^2)) ## RMSE

## repeat for all thirty path elements (way too slow for checking) and other
## local design choices and visualize RMSE distribution(s) side-by-side
## Not run:    
  ## pre-allocate to save RMSE
  rmse.exh <- rmse.opt <- rmse.nn <- rmse.pw <- rmse.pwnn <- rep(NA, N)
  for(t in 1:N){
     
    WW <- W[[t]]
    WY <- f2d(WW)
       
    ## joint local design exhaustive search via ALC
    j.exh <- laGPsep(WW, 6, 100, X, Y, method="alc", close=10000, lite=FALSE)
    rmse.exh[t] <- sqrt(mean((WY - j.exh$mean)^2))
     
    ## joint local design gradient-based search via ALC
    j.opt <- laGPsep(WW, 6, 100, X, Y, method="alcopt", close=10000, lite=FALSE)
    rmse.opt[t] <- sqrt(mean((WY - j.opt$mean)^2))
    
    ## joint local design exhaustive search via NN
    j.nn <- laGPsep(WW, 6, 100, X, Y, method="nn", close=10000, lite=FALSE)
    rmse.nn[t] <- sqrt(mean((WY - j.nn$mean)^2))
     
    ## pointwise local design via ALC
    pw <- aGPsep(X, Y, WW, start=6, end=50, d=list(max=20), method="alc", verb=0)
    rmse.pw[t] <- sqrt(mean((WY - pw$mean)^2))
     
    ## pointwise local design via NN
    pw.nn <- aGPsep(X, Y, WW, start=6, end=50, d=list(max=20), method="nn", verb=0)   
    rmse.pwnn[t] <- sqrt(mean((WY - pw.nn$mean)^2))
     
    ## progress meter
    print(t)
  }
  
  ## justify the y range
  ylim_RMSE <- log(range(rmse.exh, rmse.opt, rmse.nn, rmse.pw, rmse.pwnn))
     
  ## plot the distribution of RMSE output
  boxplot(log(rmse.exh), log(rmse.opt), log(rmse.nn), log(rmse.pw), log(rmse.pwnn),
          xaxt='n', xlab="", ylab="log(RMSE)", ylim=ylim_RMSE, main="")
  axis(1, at=1:5, labels=c("ALC-ex", "ALC-opt", "NN", "ALC-pw", "NN-pw"), las=1)

## End(Not run)

laGP documentation built on March 31, 2023, 9:46 p.m.