randLine: Generate two-dimensional random paths (laGP: Local Approximate Gaussian Process Regression)

Description

Generate two-dimensional random paths (one-dimensional manifolds in 2d) comprising different randomly chosen line types: linear, quadratic, cubic, exponential, and natural logarithm. If the input dimensionality is higher than 2, then a line is generated in two randomly chosen input coordinates.

Usage

randLine(a, D, N, smin, res)

Arguments

a       a fixed two-element vector denoting the range of the bounding box (lower bound and upper bound) of all input coordinates

D       a scalar denoting the dimensionality of input space

N       a scalar denoting the desired total number of random lines

smin    a scalar denoting the minimum absolute scaling constant, i.e., the length of the shortest line that could be generated

res     a scalar denoting the number of data points, i.e., the resolution on the random path

Details

This two-dimensional random line generating function produces different types of 2d random paths, including linear, quadratic, cubic, exponential, and natural logarithm.

First, one of these line types is chosen uniformly at random. The line is then drawn, via a collection of discrete points, from the origin according to the arguments, e.g., resolution and length, provided by the user. The discrete set of coordinates is then shifted and scaled, uniformly at random, into the specified 2d rectangle, e.g., [-2,2]^2, with the restriction that at least half of the points comprising the line lie within the rectangle.
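The steps above can be sketched in base R. This is only an illustration under stated assumptions, not the code used inside laGP: the helper name sketch_path, the parameterization of each curve on [0, 1], the way the scale and shift are sampled, and the use of rejection sampling to enforce the "at least half inside" restriction are all hypothetical choices.

```r
# Hypothetical helper (not part of laGP): draw one random 2d path in [a[1], a[2]]^2.
sketch_path <- function(a = c(-2, 2), smin = 0.1, res = 100, max_tries = 1000) {
  # Candidate line types, each starting at the origin (assumed parameterization).
  shapes <- list(
    linear      = function(t) t,
    quadratic   = function(t) t^2,
    cubic       = function(t) t^3,
    exponential = function(t) exp(t) - 1,  # shifted so that f(0) = 0
    logarithm   = function(t) log(t + 1)   # shifted so that f(0) = 0
  )
  type <- sample(names(shapes), 1)         # line type chosen uniformly at random
  t <- seq(0, 1, length.out = res)         # res discrete points along the path
  base <- cbind(t, shapes[[type]](t))      # path drawn from the origin
  width <- diff(a)

  for (i in seq_len(max_tries)) {
    # Random scale of magnitude at least smin, with a random sign per axis.
    s <- runif(1, smin, width) * sample(c(-1, 1), 2, replace = TRUE)
    shift <- runif(2, a[1], a[2])          # random starting location in the box
    xy <- sweep(sweep(base, 2, s, `*`), 2, shift, `+`)
    inside <- xy[, 1] >= a[1] & xy[, 1] <= a[2] &
              xy[, 2] >= a[1] & xy[, 2] <= a[2]
    # Accept only if at least half of the points lie within the rectangle.
    if (mean(inside) >= 0.5) {
      return(list(type = type, path = data.frame(x1 = xy[, 1], x2 = xy[, 2])))
    }
  }
  stop("no acceptable path found within max_tries")
}

set.seed(42)
p <- sketch_path()
str(p$path)  # a data.frame with res rows and 2 columns
```

Rejection sampling is used here only because it is the simplest way to satisfy the restriction; the package may enforce it differently.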

For a quick visualization, see Figure 15 in Sun, et al. (2017). Figure 7 in the same manuscript illustrates the application of this function in out-of-sample prediction using laGPsep, in 2d and 4d.

randLine returns different types of random paths and the indices of the randomly selected pair, i.e., subset, of input coordinates (when D > 2).

Value

randLine returns a list of lists. The outer list is of length six, representing each of the five possible line types (linear, quadratic, cubic, exponential, and natural logarithm), with the sixth entry providing the randomly chosen input dimensions.

Each inner list contains data.frames with res rows and 2 columns, one per generated path. The total number of data.frames across all inner lists is N.
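A short way to inspect the returned structure is sketched below. The helper count_paths is hypothetical and relies only on the layout described above (five line-type entries followed by the chosen dimensions), so entries are accessed by position rather than by name. The call to randLine is guarded so the snippet still runs when laGP is not installed, and the argument values are arbitrary.

```r
# Hypothetical helper (not part of laGP): number of paths per line type,
# assuming the first five entries of the outer list hold the line types.
count_paths <- function(line.set) {
  vapply(line.set[seq_len(5)], length, integer(1))
}

if (requireNamespace("laGP", quietly = TRUE)) {
  set.seed(1)
  line.set <- laGP::randLine(a = c(-2, 2), D = 4, N = 30, smin = 0.1, res = 100)
  counts <- count_paths(line.set)
  print(counts)         # paths generated for each of the five line types
  print(sum(counts))    # per the description above, this should match N
  print(line.set[[6]])  # the randomly chosen pair of input coordinates
}
```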

Note

Users should scale each coordinate of the global input space to the same coded range, e.g., [-2,2]^D, to avoid the computational burden of passing the global input space as an argument. Users may convert back to the natural units when necessary.
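One possible way to move between natural units and a coded range is a per-column linear map, sketched below. The helpers to_coded and to_natural are hypothetical (they are not part of laGP), and the example inputs, column names, and bounds are made up for illustration.

```r
# Hypothetical helpers (not part of laGP): linear maps between natural units
# and a common coded range a = c(lower, upper) applied to every column.
to_coded <- function(X, lower, upper, a = c(-2, 2)) {
  X <- as.matrix(X)
  U <- sweep(sweep(X, 2, lower, `-`), 2, upper - lower, `/`)  # map to [0, 1]
  a[1] + U * diff(a)                                          # map to [a[1], a[2]]
}

to_natural <- function(Z, lower, upper, a = c(-2, 2)) {
  Z <- as.matrix(Z)
  U <- (Z - a[1]) / diff(a)                                   # back to [0, 1]
  sweep(sweep(U, 2, upper - lower, `*`), 2, lower, `+`)       # back to natural units
}

# Made-up inputs with different natural ranges per coordinate.
X <- cbind(altitude = c(100, 300, 500), speed = c(5, 7.5, 10))
lower <- c(100, 5)
upper <- c(500, 10)

Z <- to_coded(X, lower, upper)       # every column now spans [-2, 2]
X_back <- to_natural(Z, lower, upper)
stopifnot(isTRUE(all.equal(X_back, X)))
```

Paths generated in the coded range can then be converted with to_natural before they are reported in the original units.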

Author(s)

Furong Sun <furongs@vt.edu> and Robert B. Gramacy <rbg@vt.edu>

References

F. Sun, R.B. Gramacy, B. Haaland, E. Lawrence, and A. Walker (2019). Emulating satellite drag from large simulation experiments, SIAM/ASA Journal on Uncertainty Quantification, 7(2), pp. 720-759; preprint on arXiv:1712.00182; http://arxiv.org/abs/1712.00182