Select samples from along an environmental gradient by splitting the gradient into discrete chunks and sampling within each chunk. This allows, for example, a test set to be selected which covers the environmental gradient of the training set.
numeric; vector of samples representing the gradient values.
numeric; number of chunks to split the gradient into.
numeric; how many samples to take from the gradient. Cannot be missing.
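numeric; number of samples per chunk. Must be a vector stating exactly how many samples to select from each chunk. If not supplied, the number of samples per chunk is determined as described in Details.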
character; the type of filling of chunks to perform. See Details.
numeric; maximum number of iterations in which to try to select the required number of samples.
The gradient is split into chunk sections and samples are selected from each chunk to result in a sample of length take. If take is divisible by chunk without remainder then there will be an equal number of samples selected from each chunk. Where take is not a multiple of chunk and nchunk is not specified then extra samples have to be allocated to some of the chunks to reach the required number of samples.
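As a small worked example of this arithmetic (the values of take and chunk below are made up for illustration), integer division gives the number of samples every chunk receives and the remainder gives the number of extra samples still to be placed:

```r
# Illustrative values only
take  <- 23   # total number of samples wanted
chunk <- 10   # number of chunks the gradient is split into

base  <- take %/% chunk   # samples allocated to every chunk: 2
extra <- take %% chunk    # samples left to allocate to some chunks: 3

base * chunk + extra == take   # TRUE
```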
An additional complication is that some chunks of the gradient may have fewer samples than their allocation in nchunk, and therefore more samples need to be selected from the remaining chunks until take samples are chosen.
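To see why some chunks can run short, the sketch below counts the samples available in each chunk of a simulated, skewed gradient. It assumes that chunks are equal-width intervals of the gradient, which is how base R's cut() behaves when given a number of breaks; the names grad and available are hypothetical and the data are made up.

```r
set.seed(1)

# Simulated, right-skewed gradient standing in for real measurements
grad <- rexp(100)

# Assumption: chunks are equal-width intervals along the gradient
chunks <- cut(grad, breaks = 10)

# Number of samples available in each chunk; the upper chunks of a
# skewed gradient typically hold very few samples
available <- as.vector(table(chunks))
available
```

With a gradient like this, an equal allocation across chunks cannot always be met, which is the situation described above.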
If nchunk is supplied, it must be a vector stating exactly how many samples to select from each chunk. If nchunk is not supplied, then the number of samples per chunk is determined as follows:
1. An initial allocation of floor(take / chunk) is assigned to each chunk.
2. If any chunks have fewer samples than this initial allocation, these elements of nchunk are reset to the number of samples in those chunks.
3. Sequentially, an extra sample is allocated to each chunk with sufficient available samples until take samples are selected.
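The allocation steps above can be sketched in a few lines of R. This is only an illustration of the heuristic as described, not the package's own code: allocate_chunks() and its arguments are hypothetical, and chunks are visited from first to last on every pass.

```r
# Hypothetical helper: decide how many samples to take from each chunk,
# given the number of samples each chunk actually contains.
allocate_chunks <- function(available, take) {
  chunk <- length(available)
  if (take > sum(available)) {
    stop("'take' exceeds the total number of available samples")
  }
  # Step 1: equal initial allocation, floor(take / chunk)
  nchunk <- rep(take %/% chunk, chunk)
  # Step 2: reset allocations that exceed what a chunk can supply
  nchunk <- pmin(nchunk, available)
  # Step 3: pass along the chunks, giving at most one extra sample per
  # chunk per pass, until 'take' samples have been allocated
  while (sum(nchunk) < take) {
    for (i in seq_len(chunk)) {
      if (sum(nchunk) == take) break
      if (nchunk[i] < available[i]) nchunk[i] <- nchunk[i] + 1
    }
  }
  nchunk
}

# Five chunks, one of them empty and one holding a single sample
alloc <- allocate_chunks(available = c(1, 5, 8, 0, 6), take = 12)
alloc   # 1 4 4 0 3
```

The initial allocation of 2 per chunk is capped to 1 and 0 in the sparse chunks, leaving 5 samples to be handed out one at a time over two passes.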
The fill argument controls the order in which the chunks are filled. fill = "head" fills from the low to the high end of the gradient, whilst fill = "tail" fills in the opposite direction. Chunks are filled in random order if fill = "random". In all cases no chunk is filled by more than one extra sample until all chunks that can supply one extra sample are filled. In the case of fill = "head" or fill = "tail" this entails moving along the gradient from one end to the other, allocating an extra sample to available chunks before starting along the gradient again. For fill = "random", a random order of chunks to fill is determined; if an extra sample has been allocated to each chunk in the random order and take samples are still not selected, filling begins again using the same random ordering. In other words, the random order of chunks to fill is chosen only once.
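The three filling orders might be expressed as follows. This is a sketch under the description above rather than the package's implementation; fill_order() is a hypothetical helper that returns the order in which chunks are offered an extra sample.

```r
# Hypothetical helper: order in which chunks are offered an extra sample
fill_order <- function(chunk, fill = c("head", "tail", "random")) {
  fill <- match.arg(fill)
  switch(fill,
         head   = seq_len(chunk),        # low end to high end
         tail   = rev(seq_len(chunk)),   # high end to low end
         random = sample.int(chunk))     # one random permutation
}

fill_order(5, "head")   # 1 2 3 4 5
fill_order(5, "tail")   # 5 4 3 2 1

set.seed(42)
ord <- fill_order(5, "random")

# The random order is drawn once; a second pass over the chunks
# repeats exactly the same sequence
visits <- rep(ord, times = 2)
```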
A numeric vector of indices of selected samples. This vector has an attribute, lengths, which indicates how many samples were actually chosen from each chunk.
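Assuming lengths is stored as an ordinary attribute of the returned vector, it could be read with attr(). The object below is a mock-up with made-up indices and counts, built only to show the access pattern; it is not real output from the function.

```r
# Mock-up of a returned value (hypothetical indices and per-chunk counts)
sel <- structure(c(4L, 9L, 15L, 21L), lengths = c(1L, 2L, 1L))

attr(sel, "lengths")        # number of samples chosen from each chunk
sum(attr(sel, "lengths"))   # matches length(sel)
```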
Gavin L. Simpson
data(swappH)

## take a test set of 20 samples along the pH gradient
test1 <- splitSample(swappH, chunk = 10, take = 20)
test1
swappH[test1]

## take a larger sample where some chunks don't have many samples
## do random filling
set.seed(3)
test2 <- splitSample(swappH, chunk = 10, take = 70, fill = "random")
test2
swappH[test2]