simDataDK: Simulate data for an integrated species distribution model...

View source: R/simDataDK_AHM2_10.R

simDataDKR Documentation

Simulate data for an integrated species distribution model (SDM) of Dorazio-Koshkina

Description

The function generates a population represented as a point pattern in a heterogeneous landscape and simulates data from two different sources: (1) opportunistic presence-only data, and (2) replicate counts or detection/nondetection data in randomly-placed quadrats. This is the scenario for the integrated models described by Dorazio (2014) and Koshkina et al. (2017). The former assumes counts as data source (2) while the latter assume detection/nondetection data.

A Poisson point pattern (PPP) with intensity a function of a covariate X and intercept and coefficient beta is simulated on a discrete (pixel-based) approximation of a continuous landscape.

This PPP is first thinned with a pixel-wise thinning probability controlled by a covariate W and coefficients alpha, and second, with a landscape-wise random drop-out process to produce a first data set of presence-only kind.

A second data set is simulated by imagining replicated counts conducted in randomly-selected quadrats within the landscape. Detection of individuals is imperfect, with probability of detection controlled by the covariate W and coefficients gamma. These counts can be quantized to detection/nondetection data for use in a model as in Koshkina et al. (2017).

For simDataDK1 animals are limited to one individual per pixel; this is not the case for simDataDK.

To recreate the data sets used in the book with R 3.6.0 or later, include sample.kind="Rounding" in the call to set.seed. This should only be used for reproduction of old results.

Usage

simDataDK(sqrt.npix = 100, alpha = c(-1,-1), beta = c(6,0.5),
  drop.out.prop.pb = 0.7, quadrat.size = 4, gamma = c(0,-1.5),
  nquadrats = 250, nsurveys = 3, show.plot = TRUE)

simDataDK1(sqrt.npix = 100, alpha = c(-1,-1), beta = c(6,0.5),
  drop.out.prop.pb = 0.7, quadrat.size = 4, gamma = c(0,-1.5),
  nquadrats = 250, nsurveys = 3, show.plot = TRUE)

Arguments

sqrt.npix

number of pixels along each side of square state space (the 'landscape'); the total number of pixels is then npix = sqrt.npix^2. For simDataDK1 this limits the population size to 1 per pixel.

alpha

coefficients for the relationship: logit(b) = alpha[1] + alpha[2] * W, where b is the sampling detection bias in the presence-only observations.

beta

coefficients for the relationship: log(lambda) = beta[1] + beta[2] * X, where lambda is the intensity of the Poisson point process. If the values of beta result in very large numbers of animals, an error will occur.

drop.out.prop.pb

proportion of presence-only points at the end that are discarded.

quadrat.size

length of the side of quadrats for conducting replicate counts in pixel units; note that sqrt.npix / quadrat.size must yield an integer.

gamma

coefficients for the relationship: logit(p) = gamma[1] + gamma[2] * W, where p is the probability of detecting an individual during the count surveys in the quadrats.

nquadrats

the number of quadrats selected for the count survey.

nsurveys

the number of replicate counts in each quadrat.

show.plot

if TRUE, summary plots are displayed.

Value

A list with the values of the input arguments and the following additional elements:

npix

the number of pixels in the landscape

s.area

the area of the whole landscape = 4

s.loc

2-column vector with the location of each pixel

xcov

values of the 'X' (intensity) covariate

wcov

values of the 'W' (detection) covariate

N.ipp

true number of individuals in the landscape

pixel.id.ipp

pixel ID for each individual in the population

loc.ipp

coordinates for each individual in the population

pTrue.ipp

probability of detection for each individual for presence-only data

pixel.id.det

pixel ID for each individual detected opportunistically

N.det

number of detections

loc.det

coordinates of each individual detected opportunistically

pcount

probability of detection during count surveys, varies by quadrat

fullCountData

matrix with rows for each quadrat, columns for ID, x and w coords, true N, and 3 replicate counts

countData

as above, but rows for quadrats sampled only

s

a Raster Stack with layers for 'X', 'W', and number in each pixel, 'n'

squad

a Raster Stack corresponding to the quadrats, with mean 'X' and 'W' and true abundance, 'N'

Author(s)

Marc Kéry, Andy Royle & Mike Meredith, based on the code written by Dorazio (2014) and adapted by Koshkina et al. (2017).

References

Dorazio, R.M. (2014) Accounting for imperfect detection and survey bias in statistical analysis of presence-only data. Global Ecology and Biogeography, 23, 1472-1484.

Koshkina, V., Wang, Y., Gordon, A., Dorazio, R.M., White, M., & Stone, L. (2017) Integrated species distribution models: combining presence-background data and site-occupany data with imperfect detection. Methods in Ecology and Evolution, 8, 420-430.

Kéry, M. & Royle, J.A. (2021) Applied Hierarchical Modeling in Ecology AHM2 - 10.

Examples

# Run the function with default values and look at the output
str(tmp <- simDataDK(), 1)  # use str(., max.level=1) to limit the amount of output.

str(tmp <- simDataDK(show.plot=FALSE), 1)  # no plots

str(tmp <- simDataDK(sqrt.npix = 500), 1)  # much larger landscape

str(tmp <- simDataDK(alpha = c(-1,1)), 1)  # positive effect of W on bias rate parameter b

str(tmp <- simDataDK(beta = c(6, 0.5)), 1) # lower density

str(tmp <- simDataDK(drop.out.prop = 0), 1)# No final uniform thinning ("drop out")

str(tmp <- simDataDK(beta = c(6, 1)), 1)   # steeper gradient of habitat suitability

AHMbook documentation built on Aug. 24, 2023, 1:07 a.m.