optimPPL  R Documentation 
Optimize a sample configuration for variogram identification and estimation. A criterion is defined so that the optimized sample configuration has a given number of points or point-pairs contributing to each lag-distance class (PPL).
optimPPL(
points,
candi,
lags = 7,
lags.type = "exponential",
lags.base = 2,
cutoff,
distri,
criterion = "distribution",
pairs = FALSE,
schedule,
plotit = FALSE,
track = FALSE,
boundary,
progress = "txt",
verbose = FALSE
)
objPPL(
points,
candi,
lags = 7,
lags.type = "exponential",
lags.base = 2,
cutoff,
distri,
criterion = "distribution",
pairs = FALSE,
x.max,
x.min,
y.max,
y.min
)
countPPL(
points,
candi,
lags = 7,
lags.type = "exponential",
lags.base = 2,
cutoff,
pairs = FALSE,
x.max,
x.min,
y.max,
y.min
)
points 
Integer value, integer vector, data frame (or matrix), or list. The number of sampling points (sample size) or the starting sample configuration. Four options are available:
Most users will want to set an integer value simply specifying the required sample size. Using an integer vector or data frame (or matrix) will generally be helpful to users willing to evaluate starting sample configurations, test strategies to speed up the optimization, and fine-tune or thin an existing sample configuration. Users interested in augmenting a possibly existing real-world sample configuration or fine-tuning only a subset of the existing sampling points will want to use a list. 
candi 
Data frame (or matrix). The Cartesian x- and y-coordinates (in this order) of the
cell centres of a spatially exhaustive, rectangular grid covering the entire spatial sampling
domain. The spatial sampling domain can be contiguous or composed of disjoint areas and contain
holes and islands. 
lags 
Integer value, the number of lag-distance classes. Alternatively, a vector of numeric
values with the lower and upper bounds of each lag-distance class, the lowest value being larger
than zero (e.g. 0.0001). Defaults to lags = 7.
lags.type 
Character value, the type of lag-distance classes, with options "equidistant" and "exponential". Defaults to lags.type = "exponential".
lags.base 
Numeric value, base of the exponential expression used to create exponentially
spaced lag-distance classes. Used only when lags.type = "exponential". Defaults to lags.base = 2.
cutoff 
Numeric value, the maximum distance up to which lag-distance classes are created.
Used only when lags is an integer value.
distri 
Numeric vector, the distribution of points or point-pairs per lag-distance class
that should be attained at the end of the optimization. Used only when
criterion = "distribution". Defaults to a uniform distribution.
criterion 
Character value, the feature used to describe the energy state of the system
configuration, with options "distribution" and "minimum". Defaults to criterion = "distribution".
pairs 
Logical value. Should the sample configuration be optimized regarding the number of
point-pairs per lag-distance class? Defaults to pairs = FALSE.
schedule 
List with named sub-arguments setting the control parameters of the annealing
schedule. See scheduleSPSANN().
plotit 
(Optional) Logical for plotting the evolution of the optimization. Plot updates
occur every ten (10) spatial jitters. Defaults to
plotit = FALSE.
track 
(Optional) Logical value. Should the evolution of the energy state be recorded and
returned along with the result? Defaults to track = FALSE.
boundary 
(Optional) An object of class SpatialPolygons (see sp::SpatialPolygons()) with
the outer and inner limits of the spatial sampling domain (see candi).
progress 
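(Optional) Type of progress bar that should be used, with options "txt" (a text progress bar in the R console), "tk" (a Tk progress bar widget), and NULL (no progress bar). Defaults to progress = "txt".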
verbose 
(Optional) Logical for printing messages about the progress of the optimization.
Defaults to verbose = FALSE.
x.max, x.min, y.max, y.min 
Numeric values defining the minimum and maximum quantity of random noise to
be added to the projected x- and y-coordinates. The minimum quantity should be at least equal to the
minimum distance between two neighbouring candidate points. The units are the same as those of the projected
x- and y-coordinates. If missing, they are estimated from candi.
There are multiple mechanisms to generate a new sample configuration out of an existing one. The main step consists of randomly perturbing the coordinates of a single sample, a process known as ‘jittering’. These mechanisms can be classified based on how the set of candidate locations for the samples is defined. For example, one could use an infinite set of candidate locations, that is, any location in the spatial domain can be selected as a new sample location after a sample is jittered. All that is needed is a polygon indicating the boundary of the spatial domain. This method is more computationally demanding because every time an existing sample is jittered, it is necessary to check if the new sample location falls in the spatial domain.
Another approach consists of using a finite set of candidate locations for the samples. A finite set of candidate locations is created by discretising the spatial domain, that is, creating a fine (regular) grid of points that serve as candidate locations for the jittered sample. This is a less computationally demanding jittering method because, by definition, the new sample location will always fall in the spatial domain.
Using a finite set of candidate locations has two important inconveniences. First, not all locations in the spatial domain can be selected as the new location for a jittered sample. Second, when a sample is jittered, the new location may already be occupied by another sample. If this happens, another location has to be sought iteratively, say, as many times as the size of the sample configuration. In general, the larger the sample configuration, the more likely it is that the new location is already occupied by another sample. If a solution is not found in a reasonable time, the sample selected to be jittered is kept in its original location. Such a procedure is clearly suboptimal.
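The retry procedure described above can be sketched in a few lines of base R. This is only an illustration of the suboptimal finite-set approach, not code from spsann; the function name jitter_finite and its arguments are hypothetical.

```r
# Hypothetical helper: jitter one sample on a finite set of candidate locations.
# `occupied` holds the row indices of the candidate locations currently in the
# sample, `n_candi` is the number of candidate locations, and `which_pt` is the
# position (in `occupied`) of the sample to be jittered.
jitter_finite <- function(occupied, n_candi, which_pt,
                          max_tries = length(occupied)) {
  for (i in seq_len(max_tries)) {
    new_id <- sample.int(n_candi, size = 1)
    if (!new_id %in% occupied) {
      # A free candidate location was found: move the sample there
      occupied[which_pt] <- new_id
      return(occupied)
    }
  }
  # No free location found within `max_tries`: keep the sample where it is
  occupied
}

set.seed(1)
current <- c(3, 8, 15)
jitter_finite(current, n_candi = 20, which_pt = 2)
```

When every candidate location is already occupied, the function returns the configuration unchanged, which is the situation the paragraph above calls suboptimal.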
spsann uses a more elegant method which is based on using a finite set of candidate locations
coupled with a form of two-stage random sampling as implemented in spcosa::spsample().
Because the candidate locations are placed on a finite regular grid, they can be taken as the
centre nodes of a finite set of grid cells (or pixels of a raster image). In the first stage, one
of the “grid cells” is selected with replacement, i.e. independently of already being
occupied by another sample. The new location for the sample chosen to be jittered is selected
within that “grid cell” by simple random sampling. This method guarantees that virtually
any location in the spatial domain can be selected. It also discards the need to check if the new
location already is occupied by another sample, speeding up the computations when compared to the
first two approaches.
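A minimal sketch of this two-stage idea, in base R, is given below. It is a simplification under stated assumptions: the grid cells are square with side cellsize, every cell is equally likely to be selected, and the x.max, x.min, y.max and y.min limits are ignored. The function name jitter_two_stage is hypothetical and is not part of spsann.

```r
# Hypothetical helper: two-stage jitter on a regular grid of cell centres.
# `candi` holds the x- and y-coordinates of the cell centres and `cellsize`
# is the (assumed square) grid spacing.
jitter_two_stage <- function(candi, cellsize) {
  # Stage 1: select a grid cell, regardless of it already holding a sample
  cell <- sample.int(nrow(candi), size = 1)
  # Stage 2: select a location within that cell by simple random sampling
  half <- cellsize / 2
  c(x = candi[cell, 1] + runif(1, -half, half),
    y = candi[cell, 2] + runif(1, -half, half))
}

# A 5 x 5 grid of cell centres covering the square from 0 to 200
candi <- expand.grid(x = seq(20, 180, by = 40), y = seq(20, 180, by = 40))
set.seed(2001)
new_xy <- jitter_two_stage(candi, cellsize = 40)
new_xy
```

Because the second stage draws from a continuous uniform distribution, two samples landing on exactly the same coordinates is practically impossible, which is why no occupancy check is needed.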
Two types of lag-distance classes can be created by default. The first are evenly spaced lags
(lags.type = "equidistant"). They are created by simply dividing the distance interval from
0.0001 to cutoff by the required number of lags. The minimum value of 0.0001 guarantees that a
point does not form a pair with itself. The second type of lags is defined by exponential
spacings (lags.type = "exponential"). The spacings are defined by the base b of the
exponential expression b^n, where n is the required number of lags. The base is
defined using the argument lags.base. See pedometrics::vgmLags() for other details.
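The two types of lag-distance classes can be illustrated with the base R sketch below. The equidistant case follows the description above directly. The exponential case assumes that the bounds are obtained by successively dividing cutoff by powers of the base; this is an assumption made for illustration, and pedometrics::vgmLags() remains the reference. The function name lag_bounds is hypothetical.

```r
# Hypothetical helper: lower and upper bounds of the lag-distance classes.
lag_bounds <- function(lags = 7, type = "exponential", base = 2, cutoff,
                       zero = 0.0001) {
  if (type == "equidistant") {
    # Divide the interval from `zero` to `cutoff` into `lags` equal classes
    seq(zero, cutoff, length.out = lags + 1)
  } else {
    # Assumed scheme: divide `cutoff` by base^0, base^1, ..., base^(lags - 1)
    c(zero, rev(cutoff / base^(0:(lags - 1))))
  }
}

lag_bounds(lags = 4, type = "equidistant", cutoff = 100)
lag_bounds(lags = 4, type = "exponential", cutoff = 100)
# exponential bounds: 0.0001, 12.5, 25, 50, 100
```

With exponential spacing the classes are narrow at short distances and wide near the cutoff, so more classes cover the short distances.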
Using the default uniform distribution means that the number of point-pairs per lag-distance
class (pairs = TRUE) is equal to n × (n - 1) / (2 × lag), where n is the
total number of points and lag is the number of lags. If pairs = FALSE, then it means
that the number of points per lag is equal to the total number of points. This is the same as
expecting that each point contributes to every lag. Distributions other than the available
options can be easily implemented by changing the arguments lags and distri.
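As a worked example of the uniform distribution, the sketch below computes the target count per lag-distance class for both settings of pairs. The function name uniform_target is hypothetical and is used only for illustration.

```r
# Hypothetical helper: target count per lag-distance class under the
# uniform distribution, for `n` points and `lags` lag-distance classes.
uniform_target <- function(n, lags, pairs = FALSE) {
  if (pairs) {
    # All n * (n - 1) / 2 point-pairs shared equally among the lags
    rep(n * (n - 1) / (2 * lags), lags)
  } else {
    # Every point is expected to contribute to every lag
    rep(n, lags)
  }
}

uniform_target(n = 10, lags = 5, pairs = TRUE)   # 9 point-pairs per lag
uniform_target(n = 10, lags = 5, pairs = FALSE)  # 10 points per lag
```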
There are two optimizing criteria implemented. The first is called using
criterion = "distribution" and is used to minimize the sum of the absolute differences between
a pre-specified distribution and the observed distribution of points or point-pairs per
lag-distance class. The second criterion is called using criterion = "minimum". It corresponds
to maximizing the minimum number of points or point-pairs observed over all lag-distance classes.
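The two criteria can be written as a small base R function. The "distribution" branch follows the definition above. For "minimum", the sketch returns the negative of the smallest count so that minimizing the value maximizes the minimum; this sign convention is an assumption made for illustration and may differ from the energy state that spsann reports. The function name energy_ppl is hypothetical.

```r
# Hypothetical helper: energy state from observed counts per lag-distance class.
energy_ppl <- function(observed, target, criterion = "distribution") {
  if (criterion == "distribution") {
    # Sum of absolute differences between target and observed counts
    sum(abs(target - observed))
  } else {
    # Assumed form: minimizing the negative minimum maximizes the minimum
    -min(observed)
  }
}

observed <- c(3, 5, 2)
target <- c(4, 4, 4)
energy_ppl(observed, target, criterion = "distribution")  # 1 + 1 + 2 = 4
energy_ppl(observed, target, criterion = "minimum")       # -2
```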
optimPPL returns an object of class OptimizedSampleConfiguration: the optimized sample
configuration with details about the optimization.
objPPL returns a numeric value: the energy state of the sample configuration, that is, the objective
function value.
countPPL returns a data.frame with three columns: a) the lower and b) upper limits of each
lag-distance class, and c) the number of points or point-pairs per lag-distance class.
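To make the structure of such a table concrete, the base R sketch below counts points or point-pairs per lag-distance class for a small set of coordinates. It is not the countPPL implementation: the function name count_ppl, the column names, and the choice of which bound is inclusive are assumptions made for illustration.

```r
# Hypothetical helper: points or point-pairs per lag-distance class.
# `xy` is a two-column matrix of Cartesian coordinates and `bounds` holds
# the lower and upper bounds of the lag-distance classes.
count_ppl <- function(xy, bounds, pairs = FALSE) {
  d <- as.matrix(dist(xy))  # Euclidean distances between all points
  n_lags <- length(bounds) - 1
  ppl <- numeric(n_lags)
  for (i in seq_len(n_lags)) {
    # Assumed convention: lower bound exclusive, upper bound inclusive
    in_lag <- d > bounds[i] & d <= bounds[i + 1]
    if (pairs) {
      ppl[i] <- sum(in_lag) / 2  # each pair appears twice in the matrix
    } else {
      ppl[i] <- sum(rowSums(in_lag) > 0)  # points with a partner in this lag
    }
  }
  data.frame(lag.lower = bounds[-length(bounds)],
             lag.upper = bounds[-1],
             ppl = ppl)
}

# Three collinear points: pairwise distances are 5, 5 and 10
xy <- rbind(c(0, 0), c(3, 4), c(6, 8))
count_ppl(xy, bounds = c(0.0001, 6, 12), pairs = TRUE)   # ppl: 2, 1
count_ppl(xy, bounds = c(0.0001, 6, 12), pairs = FALSE)  # ppl: 3, 2
```

The lower bound of 0.0001 excludes the zero distances on the diagonal of the distance matrix, so no point is paired with itself.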
spsann always computes the distance between two locations (points) as the Euclidean distance between them. This computation requires the optimization to operate in the two-dimensional Euclidean space, i.e. the coordinates of the sample, candidate and evaluation locations must be Cartesian coordinates, generally in metres or kilometres. spsann has no mechanism to check if the coordinates are Cartesian: you are solely responsible for making sure that this requirement is met.
Alessandro Samuel-Rosa alessandrosamuelrosa@gmail.com
Bresler, E.; Green, R. E. Soil parameters and sampling scheme for characterizing soil hydraulic properties of a watershed. Honolulu: University of Hawaii at Manoa, p. 42, 1982.
Pettitt, A. N.; McBratney, A. B. Sampling designs for estimating spatial variance components. Applied Statistics. v. 42, p. 185, 1993.
Russo, D. Design of an optimal sampling network for estimating the variogram. Soil Science Society of America Journal. v. 48, p. 708-716, 1984.
Truong, P. N.; Heuvelink, G. B. M.; Gosling, J. P. Web-based tool for expert elicitation of the variogram. Computers and Geosciences. v. 51, p. 390-399, 2013.
Warrick, A. W.; Myers, D. E. Optimization of sampling locations for variogram calculations. Water Resources Research. v. 23, p. 496-500, 1987.
#####################################################################
# NOTE: The settings below are unlikely to meet your needs. #
#####################################################################
if (interactive() && require(sp)) {
  # This example takes more than 5 seconds
  data(meuse.grid, package = "sp")
  # Control parameters of the annealing schedule
  schedule <- scheduleSPSANN(
    chains = 1,
    initial.acceptance = c(0.8, 0.99),
    initial.temperature = 9.5,
    x.max = 1540, y.max = 2060, x.min = 0,
    y.min = 0, cellsize = 40)
  set.seed(2001)
  # Optimize a sample of 10 points using the grid cell centres as candidates
  res <- optimPPL(points = 10, candi = meuse.grid[, 1:2],
                  schedule = schedule)
  # Energy state of the optimized sample configuration
  objSPSANN(res)
}