knitr::opts_chunk$set( eval = FALSE, collapse = TRUE, fig.width = 7, fig.height = 7, comment = "#>" )
In this vignette, I describe how spatial capture-recapture models can be used within an R6
object of the ScrModel
class. The folder inst/examples/ in the repository contains example code.
library(openpopscr) library(secr)
data("scrmodel_example")
The openpopscr
package uses the ScrData
class to store data. See the vignette on the ScrData
class to see how to create the data object from your own raw data. In this section, I simulate a SCR survey. The simulation function is useful when you want to try out a new method where you know what the correct parameter estimates are, you want to test how robust a method is to violations of its assumptions, or you want to see how data simulated from a model compare with real data.
To simulate a survey, I require true detection and density parameters, a detector layout, and a mesh.
# set true parameters true_par <- list(D = 1000, lambda0 = 0.5, sigma = 20) # make detectors array detectors <- make.grid(nx = 7, ny = 7, spacing = 20, detector = "count") # make mesh mesh <- make.mask(detectors, buffer = 100, nx = 64, ny = 64, type = "trapbuffer") # set number of occasions to simulate n_occasions <- 5
The simulation uses a hazard half-normal detection function as defined in Borchers and Efford (2008). This detection function has two parameters: $\lambda_0$ and $\sigma$. Together they define the encounter rate between an individual with activity centre a distance $r$ from a detector as $\lambda_0\exp(-\frac{r^2}{2\sigma^2})$. If the encounter rate is $e$, then $Te$ is the mean number of encounters one would expect in a time interval of duration $T$. This means that $\lambda_0$ is the encounter rate between an individual and a detector when the distance between the detector and the individual's activity centre is zero, it is a kind of intercept. The parameter $\sigma$ is the scale parameter that determines how quickly the encounter rate decreases as the distance between detector and activity centre increases.
I have specified the detectors to be of type "count". This means that each detector records how many times each individual was seen by that detector. An alternative detector type is "proximity"; these detectors only record whether an individual was seen at least once or not. Camera traps are examples of count detectors; they can be used to determine whether an individual was seen once, twice, thrice, and so on. DNA hair snares are examples of proximity detectors; one can only determine whether an individual's DNA is present or not, you cannot determine how many times the individual deposited hair. Multi-catch
detectors can also be used where individuals may only be detected on a single detector
for each occasion, but a detector can detect multiple individuals. openpopscr
does
not implement methods for traps that can detect only one individual per occasion, termed single catch traps.
To simulate the data, I set a random seed so that the results will be the same every time the code is run. This is so that you, the reader, can re-run the code and produce the same output as me. In practice, you may not want to set the seed if you want the output from the simulation to change randomly.
scrdat <- simulate_scr(true_par, n_occasions, detectors, mesh, seed = 15483)
You will notice the simulate_scr
function produces useful output to let the user know what is going on. If you want to switch this off, for example if you are running lots of simulations and do not want this output being produced over and over, then you can set print = FALSE
just as I have set the seed above.
In the simulated survey, r scrdat$n()
unique individual were detected at least once. More summary information can be found by typing the name of the object into the console:
scrdat
Models of class ScrModel
can be used to fit SCR Models, no surprises. To create a ScrModel, you need to specify a formula for each detection parameter and a starting value for all parameters, both as lists. It is best to choose starting values for the parameters that are biologically reasonable and
match your data. You could simulate some data using proposed starting values as true parameters to
see if they lead to similar data. Poor starting values can lead to spurious convergence
in complicated models, models with many parameters. The function get_start_value
can
be used to produce a reasonable guess for each parameter.
# set each parameter to be a constant form <- list(lambda0 ~ 1, sigma ~ 1, D ~ 1) # get some starting values based on data start <- get_start_values(scrdat)
# check starting values are reasonable, they seem ok
start
Before fitting the model, I create the model object:
mod <- ScrModel$new(form, scrdat, start)
Once the model object has been created, you can fit the model using the fit function:
mod$fit()
If you have the output printed, the "checking convergence" message reports
whether convergence has been reached and if it has not it reports the code
output by optim
that tells you why it failed to converge.
The current value of the parameters is stored inside the model object:
mod$par()
This is one benefit of using R6
objects; they can store
information as computations progress. For example, if I was fitting the model
using the fit()
and for some reason I wanted to stop the fitting
or the fitting failed because of convergence issues or computational errors; the
last value used by the numerical optimiser for each parameter is stored in par()
.
If I wanted to restart the fitting from this point again, I just need to call fit()
again. As you fit more complicated models, it is common for the optimisation to not
converge because it has reached the maximum number of iterations in the algorithm.
If this occurs, simply call fit()
again and the optimiser will
continue where it left off.
Once a model has been fitted, you can type its name into the console and some summary results are printed:
mod
The parameters are reported on the link scale; this is because when covariates
are involved, it is easier to interpret their effects. The package openpopscr
uses the log-link function for all of these parameters. So, you would take
the exponential of these reported numbers to obtain the parameter values on the
response scale.
Alternatively, you can use the get_par
function:
mod$get_par("lambda0", k = 1, j = 1) mod$get_par("sigma", k = 1, j = 1) mod$get_par("D")
I specify $k = 1, j = 1$ in the functions above for $\sigma, \lambda_0$ because by default
get_par
returns the value of these parameters for every occasion and detector. This is
useful when they vary, but here they are constants, so I only want
their value on a single occasion for a single detector.
If you want to access the reported results, they are stored as a list which is
returned by the function estimates
:
mod$estimates()
The model AIC can be computed in the usual way:
AIC(mod)
The detection parameters can change with observed covariate values. Covariates
can change by detector, by occasion, or by mesh point. Individual-level
covariates are not implemented in openpopscr
.
As an example, suppose the simulated survey was conducted with two types of detector: old and new. It is possible that the old detectors are less likely to detect an individual than a new detector. This can be captured by the model by allowing this covariate to change the value of $\lambda_0$, the encounter intercept.
# set seed so random covariate values will always be the same set.seed(58285) # Simulate, for each detector, an age class: old or new. In reality, this # covariate would be observed. age <- as.factor(sample(c("new", "old"), size = scrdat$n_traps(), replace = T)) # Add the covariate to the ScrData scrdat$add_covariate("age", age, "j")
When adding the covariate to the data, I specify the covariate is of type "j" because it is a detector-level covariate. See the vignette on the ScrData class for further information.
I can now fit the model with detector age included.
# copy formulae from previous model form_detage <- form # update formula for lambda0 to include age form_detage[[1]]<- lambda0 ~ age
# create new model object mod_detage <- ScrModel$new(form_detage, scrdat, start) # fit model mod_detage$fit()
# look at results mod_detage # get lambda0 for an old detector mod_detage$get_par("lambda0", k = 1, j = 1) # get lambda0 for a new detector mod_detage$get_par("lambda0", k = 1, j = 11)
I see here that the $95\%$ confidence interval for age covers zero and so there is no evidence that detector age has an effect on detectability.
Covariates that change over time, type $k$, and detector-temporal covariates can be included in the same way. See ScrData vignette for more details on loading covariates into the ScrData object.
Spatial covariates can be placed on density, but not on any other parameters. Here a simulate a true spatially-varying density surface with formula: (D = 0.2x^2 + 0.3y - 0.05xy) and try to recover this from fitting a model to simulated data. Notice that the "x" and "y" coordinates are scaled, meaning their mean is substracted and they are divided by their standard deviation: this is for numerical stability, but also affects the interpretation of the magnitude of the effects for x and y.
# simulate habitat covariate x <- scale(scrdat$covs()$x) y <- scale(scrdat$covs()$y) habitat <- 0.2 * x^2 + 0.3 * y - 0.05 * x * y # create true density surface logD <- log(true_par$D) + habitat ihp <- exp(logD) / true_par$D # ihp argument allows for spatially varying density scrdat_D <- simulate_scr(true_par, n_occasions, detectors, mesh, ihp = ihp, seed = 18528)
Now that the data are simulated, I fit the model. Notice, I now specify a formula for parameter D and use the built-in scaled x and y covariates. I will try two models, a misspeficied one and the true one:
form <- list(lambda0 ~ 1, sigma ~ 1, D ~ x*y) start_D <- get_start_values(scrdat_D) mod_D <- ScrModel$new(form, scrdat_D, start_D) mod_D$fit() form2 <- list(lambda0 ~ 1, sigma ~ 1, D ~ I(x^2) + y + x:y) mod_D2 <- ScrModel$new(form2, scrdat_D, start_D) mod_D2$fit()
mod_D
We can plot the estimated density surface:
# get density surface at each mesh point D <- mod_D2$get_par("D", m = 1:scrdat$n_meshpts()) * scrdat_D$cell_area() scrdat$plot_mesh(D)
Sometimes an individual's activity range within each occasion differs from their range when viewed across all occasions. There can be several reasons for this; two reasons are that the individual only explores part of their entire range during each occasion, so the range within each occasion is only partial; or, it may be that an individual does explore their entire range during each occasion, but the range shifts over time.
You can investigate this by looking at how the encounter range compares over occasions as opposed to within occasions:
# get encounter range statistic for occasion 1 scrdat$encrange(k = 1) # get encounter range for each occasion scrdat$encrange(each = TRUE) mean(scrdat$encrange(each = TRUE)) # You can get encounter range for any subset of occasions, either each or together scrdat$encrange(k = c(1, 3, 5)) scrdat$encrange(k = c(1, 3, 5), each = TRUE)
In this case, the mean range within each occasion is r round(mean(scrdat$encrange(each = TRUE)), 2)
and is similar to the range over all occasions r round(scrdat$encrange(), 2)
. So, for these data, transience does not appear to be a problem. This is unsurprising given the data were simulated with stationary activity ranges.
I will now simulate some data where activity centres move. In particular, the activity centre of each individual will move by Brownian motion with standard deviation $sd$.
# set truth true_par <- list(D = 1000, lambda0 = 2, sigma = 10, sd = 20) # make detectors array detectors <- make.grid(nx = 7, ny = 7, spacing = 20, detector = "count") # make mesh mesh <- make.mask(detectors, buffer = 100, nx = 64, ny = 64, type = "trapbuffer") # set number of occasions to simulate n_occasions <- 5 # simulate ScrData scrtransdat <- simulate_scr(true_par, n_occasions, detectors, mesh, move = TRUE, seed = 13854)
Notice the move = TRUE
argument to the simulate_scr
function and the specified true value of sd as r true_par$sd
.
I now imagine these data were collected in a survey and I want to investigate whether transience may be an issue using the encounter range statistic:
scrtransdat$encrange() mean(scrtransdat$encrange(each = TRUE))
There is a difference between the two measures of encounter range, where range across occasions is greater than range within, indicative of transience.
Nevertheless, I will first fit a stationary SCR model to these data:
# set formulae and start values stat_form <- list(lambda0 ~ 1, sigma ~ 1, D ~ 1) start_trans <- get_start_values(scrtransdat)
# create model object stat <- ScrModel$new(stat_form, scrtransdat, start_trans) # fit model stat$fit()
# look at results stat # look at parameters on response scale stat$get_par("lambda0", k = 1, j = 1) stat$get_par("sigma", k = 1, j = 1) stat$get_par("D")
Given we know the truth, the estimates are a little off: $\lambda_0$ is far too low, $\sigma$ is too high. The density parameter $D$ is acceptable but still a little large. In general, density estimation tends to be robust to transience as the biases in the detection parameters cancel out: $\lambda_0$ leads to density underestimation and $\sigma$ to density overestimation.
I now fit the transient model. The transient model class is called ScrTransientModel
. We must also now specify a start value
for the movement parameter.
# specify formulas for parameters form <- list(lambda0 ~ 1, sigma ~ 1, sd ~ 1, D ~ 1) start_trans <- get_start_values(scrtransdat, model = "ScrTransientModel")
trans <- ScrTransientModel$new(form, scrtransdat, start_trans)
Notice that you must specify model = "ScrTransientModel"
in
the get_start_values
function so that a start value for the movement
parameter is supplied. This function assumes by default that you are fitting
a ScrModel.
Also, note from the output given when creating the model object that a rectangular mesh is created based on the mesh supplied by the user. This is because the current algorithm that implements the movement model requires a rectangular study space. Note, however, that individuals may only reside and move around within the mesh supplied by the user; the extra mesh points created to produce a rectangular study space are not accessible. Also note, that the rectangular mesh will be larger and contain more mesh points than the user supplied mesh, so you may be required to reduce the resolution of the user supplied mesh in order to make computations with the rectangular mesh feasible.
I now fit the model as usual:
# fit model trans$fit()
# look at results trans # look at parameters on response scale trans$get_par("lambda0", k = 1, j = 1) trans$get_par("sigma", k = 1, j = 1) trans$get_par("sd", k = 1) trans$get_par("D")
The density estimation is not affected much by fitting a transience model, however, the value of the other parameters are affected; activity range in particular has been overestimated in the stationary model compared to the transient. Fitting the transience model has provided improved inference on activity.
AIC supports the use of the transience model.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.