View source: R/sampleOccurrences.R
sampleOccurrences | R Documentation |
This function samples occurrences/records (presence only or presence-absence) within a species distribution, either randomly or with a sampling bias. The sampling bias can be defined manually or with a set of predefined biases.
sampleOccurrences(
x,
n,
type = "presence only",
extract.probability = FALSE,
sampling.area = NULL,
detection.probability = 1,
correct.by.suitability = FALSE,
error.probability = 0,
bias = "no.bias",
bias.strength = 50,
bias.area = NULL,
weights = NULL,
sample.prevalence = NULL,
replacement = FALSE,
plot = TRUE
)
x |
a |
n |
an integer. The number of occurrence points / records to sample. |
type |
|
extract.probability |
|
sampling.area |
a character string, a |
detection.probability |
a numeric value between 0 and 1, corresponding to the probability of detection of the species. See details. |
correct.by.suitability |
|
error.probability |
|
bias |
|
bias.strength |
a positive numeric value. The strength of the bias to be
applied in |
bias.area |
|
weights |
|
sample.prevalence |
|
replacement |
|
plot |
|
Online tutorial for this function
How the function works:
The function randomly selects n cells in which samples occur. If a bias is chosen, then the selection of these cells will be biased according to the type and strength of bias chosen. If the sampling is of type "presence only", then only cells where the species is present will be chosen. If the sampling is of type "presence-absence", then all non-NA cells can be chosen.
The function then samples the species inside the chosen cells. In cells where the species is present, the species will always be sampled unless the parameter detection.probability is lower than 1. In that case the species will be sampled with the associated probability of detection.
In cells where the species is absent (in the case of "presence-absence" sampling), the function will always assign absence unless error.probability is greater than 0. In that case, the species can be found present with the associated probability of error. Note that this step happens AFTER the detection step. Hence, in cells where the species is present but not detected, it can still be sampled due to a sampling error.
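The two steps above can be sketched in base R. This is a toy illustration only, not the package's internal code: the helper simulate_observation() and its inputs are hypothetical, and it assumes the error step is applied to every cell still recorded as 0 after detection.

```r
# Hypothetical helper: applies a detection step, then an error step,
# to a vector of true statuses (1 = present, 0 = absent).
simulate_observation <- function(true_status,
                                 detection.probability = 1,
                                 error.probability = 0) {
  n <- length(true_status)

  # Step 1: detection. True presences are recorded with probability
  # detection.probability; true absences stay 0.
  observed <- ifelse(true_status == 1,
                     rbinom(n, size = 1, prob = detection.probability),
                     0)

  # Step 2: error, applied AFTER detection. Any cell still recorded as 0
  # (true absences and undetected presences alike) can become a presence
  # with probability error.probability.
  is_zero <- observed == 0
  observed[is_zero] <- rbinom(sum(is_zero), size = 1,
                              prob = error.probability)
  observed
}

set.seed(1)
true_status <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)
data.frame(true = true_status,
           observed = simulate_observation(true_status,
                                           detection.probability = 0.5,
                                           error.probability = 0.1))
```

With detection.probability = 1 and error.probability = 0, the observed vector equals the true one, which matches the defaults described above.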
How to restrict the sampling area:
Use the argument sampling.area:
Provide the name(s) (or a combination of names) of country(ies), region(s) or continent(s). Examples:
sampling.area = "Africa"
sampling.area = c("Africa", "North America", "France")
Provide a polygon (SpatialPolygons or SpatialPolygonsDataFrame of package sp)
Provide an extent object
How the sampling bias works:
The argument bias.strength indicates the strength of the bias. For example, a value of 50 will result in 50 times more samples within the bias.area than outside. Conversely, a value of 0.5 will result in half as many samples within the bias.area as outside.
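One way to picture this is as a relative sampling weight per cell. The base R sketch below is an illustration under assumptions, not the package's implementation: it uses a hypothetical grid in which half of the cells fall inside the bias area, gives those cells a weight of bias.strength and all others a weight of 1, and samples with replacement so that the ratio is easy to read.

```r
set.seed(2)

n_cells <- 1000
# Hypothetical layout: the first 500 cells lie inside the bias area
in_bias_area <- seq_len(n_cells) <= 500

bias.strength <- 50
# Relative weight of each cell: bias.strength inside the area, 1 outside
cell_weights <- ifelse(in_bias_area, bias.strength, 1)

# Draw many samples so the inside/outside ratio is stable
sampled <- sample(n_cells, size = 5000, replace = TRUE, prob = cell_weights)

n_inside  <- sum(in_bias_area[sampled])
n_outside <- sum(!in_bias_area[sampled])
n_inside / n_outside   # close to bias.strength when both areas are equal in size
```

If the two areas differ in size, the overall ratio of samples also depends on how many cells each area contains.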
How to choose where the sampling is biased:
You can choose to bias the sampling in:
a particular country, region or continent (assuming your raster has the WGS84 projection):
Set the argument bias to "country", "region" or "continent", and provide the name(s) of the associated countries, regions or continents to bias.area (see examples).
List of possible bias.area names:
Countries: type unique(rnaturalearth::ne_countries(returnclass = 'sf')$sovereignt) in the console
Regions: "Africa", "Antarctica", "Asia", "Oceania", "Europe", "Americas"
Continents: "Africa", "Antarctica", "Asia", "Europe", "North America", "Oceania", "South America"
a polygon:
Set bias to "polygon", and provide your polygon to bias.area.
an extent object:
Set bias to "extent", and either provide your extent object to bias.area, or leave it NULL to draw an extent on the map.
Otherwise you can enter a raster of sampling probability. This can be useful if you want to increase the likelihood of sampling in areas of high suitability (simply enter the suitability raster in weights; see examples below), or if you want to define sampling biases manually, e.g. to create biases along roads. In that case you have to provide to weights a raster layer in which each cell contains the probability of being sampled.
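The idea of a manual weights layer can be sketched with a plain base R matrix standing in for the raster. Everything here is hypothetical (the grid size, the "road" along row 10, the weight values); in the package itself you would pass a raster layer to weights rather than a matrix.

```r
set.seed(3)

n_row <- 20
n_col <- 20

# Low baseline probability of being sampled everywhere...
weights <- matrix(0.05, nrow = n_row, ncol = n_col)
# ...and a much higher probability along a hypothetical road (row 10)
weights[10, ] <- 1

# Sample 30 distinct cells, using each cell's value as its sampling weight
cell_ids <- sample(length(weights), size = 30, prob = as.vector(weights))

# Convert linear cell indices back to row/column positions
positions <- arrayInd(cell_ids, dim(weights))
on_road <- positions[, 1] == 10

table(on_road)
```

The road covers only 20 of the 400 cells, yet it receives a large share of the samples because its cells carry higher weights.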
The .Random.seed and RNGkind are stored as attributes when the function is called, and can be used to reproduce the results as shown in the examples (though it is preferable to set the seed with set.seed before calling sampleOccurrences() and to then use the same value in set.seed to reproduce results later). Note that reproducing the sampling will only work if the same original distribution map is used.
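Both routes to reproducibility can be sketched in base R. The function toy_sampler() below is a hypothetical stand-in for sampleOccurrences(), and storing the state under the attribute names "seed" and "RNGkind" is an assumption made for this illustration.

```r
# Hypothetical stand-in for a sampling function that records the RNG state
toy_sampler <- function(n_cells, n) {
  # .Random.seed only exists once the RNG has been used or seeded
  if (!exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
    runif(1)
  }
  seed_before <- get(".Random.seed", envir = globalenv())

  picked <- sample.int(n_cells, n)

  # Store the RNG state and kind alongside the result
  attr(picked, "seed") <- seed_before
  attr(picked, "RNGkind") <- RNGkind()
  picked
}

# Preferred route: set the seed explicitly before each call
set.seed(123)
first <- toy_sampler(1000, 10)
set.seed(123)
second <- toy_sampler(1000, 10)
identical(first, second)

# Alternative route: restore the RNG state saved in the attributes
assign(".Random.seed", attr(first, "seed"), envir = globalenv())
third <- toy_sampler(1000, 10)
identical(first, third)
```

Setting the seed explicitly is easier to read and to share, which is why it is the preferred route; the saved state is a fallback when no seed was set beforehand.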
a list with 8 elements:
type: type of occurrence sampled (presence-absence or presence-only)
sample.points: data.frame containing the coordinates of samples, true and sampled observations (i.e., 1, 0 or NA), and, if asked, the true environmental suitability in sampled locations
detection.probability: the chosen probability of detection of the virtual species
error.probability: the chosen probability to assign presence in cells where the species is absent
bias: if a bias was chosen, then the type of bias and the associated area will be included.
replacement: indicates whether multiple samples could occur in the same cells
original.distribution.raster: the distribution raster from which samples were drawn
sample.plot: a recorded plot showing the sampled points overlaying the original distribution.
Setting sample.prevalence may at least partly override bias. For example, if bias is specified with extent to an area that contains no presences, but sample.prevalence is set to > 0, then cells outside of the biased sampling extent will be sampled until the number of presences required by sample.prevalence is obtained, after which the sampling of absences will proceed according to the specified bias.
Boris Leroy leroy.boris@gmail.com Willson Gaul wgaul@hotmail.com
with help from C. N. Meynard, C. Bellard & F. Courchamp
# Load the packages used in these examples (rast() and ext() come from terra)
library(virtualspecies)
library(terra)
# Create an example stack with six environmental variables
a <- matrix(rep(dnorm(1:100, 50, sd = 25)),
            nrow = 100, ncol = 100, byrow = TRUE)
env <- c(rast(a * dnorm(1:100, 50, sd = 25)),
         rast(a * 1:100),
         rast(a * logisticFun(1:100, alpha = 10, beta = 70)),
         rast(t(a)),
         rast(exp(a)),
         rast(log(a)))
names(env) <- paste("Var", 1:6, sep = "")
# Six variables or more: by default a PCA approach will be used
sp <- generateRandomSp(env, niche.breadth = "wide")
# Sampling of 25 presences
sampleOccurrences(sp, n = 25)
# Sampling of 30 presences and absences
sampleOccurrences(sp, n = 30, type = "presence-absence")
# Reducing the probability of detection
sampleOccurrences(sp, n = 30, type = "presence-absence",
                  detection.probability = 0.5)
# Further reducing detection in relation to environmental suitability
sampleOccurrences(sp, n = 30, type = "presence-absence",
                  detection.probability = 0.5,
                  correct.by.suitability = TRUE)
# Creating sampling errors (far too many)
sampleOccurrences(sp, n = 30, type = "presence-absence",
                  error.probability = 0.5)
# Introducing a sampling bias (oversampling)
biased.area <- ext(1, 50, 1, 50)
sampleOccurrences(sp, n = 50, type = "presence-absence",
                  bias = "extent",
                  bias.area = biased.area)
# Showing the area in which the sampling is biased
plot(biased.area, add = TRUE)
# Introducing a sampling bias (no sampling at all in the chosen area)
biased.area <- ext(1, 50, 1, 50)
sampleOccurrences(sp, n = 50, type = "presence-absence",
                  bias = "extent",
                  bias.strength = 0,
                  bias.area = biased.area)
# Showing the area in which the sampling is biased
plot(biased.area, add = TRUE)
# Create a sampling bias so that more presences are sampled in areas with
# higher suitability
samps <- sampleOccurrences(sp, n = 50,
                           bias = "manual",
                           weights = sp$suitab.raster)
plot(sp$suitab.raster)
points(samps$sample.points[, c("x", "y")])
# Reproduce sampling based on the saved .Random.seed from a previous result
samps <- sampleOccurrences(sp, n = 100,
                           type = "presence-absence",
                           detection.probability = 0.7,
                           bias = "extent",
                           bias.strength = 50,
                           bias.area = biased.area)
# Reset the random seed using the value saved in the attributes
.Random.seed <- attr(samps, "seed")
reproduced_samps <- sampleOccurrences(sp, n = 100,
                                      type = "presence-absence",
                                      detection.probability = 0.7,
                                      bias = "extent",
                                      bias.strength = 50,
                                      bias.area = biased.area)
identical(samps$sample.points, reproduced_samps$sample.points)