simNmix: Simulate data for binomial and multinomial mixture models
In AHMbook: Functions and Data for the Book 'Applied Hierarchical Modeling in Ecology' Vols 1 and 2

View source: R/simNmix_AHM1_6-5_Simulate_binom-multinom_mixtures.R

simNmix

R Documentation

Simulate data for binomial and multinomial mixture models

Description

This very general function generates single-season count data under variants of the binomial N-mixture model of Royle (2004) and of the multinomial N-mixture model of Royle et al (2007).

Usage

simNmix(nsites = 267, nvisits = 3, mean.theta = 1, mean.lam = 2, mean.p = 0.6,
  area = FALSE, beta1.theta = 0, beta2.theta = 0, beta3.theta = 0,
  beta2.lam = 0, beta3.lam = 0, beta4.lam = 0, beta3.p = 0, beta5.p = 0,
  beta6.p = 0, beta.p.survey = 0, beta.p.N = 0, sigma.lam = 0, dispersion = 10,
  sigma.p.site = 0, sigma.p.visit = 0, sigma.p.survey = 0, sigma.p.ind = 0,
  Neg.Bin = FALSE, open.N = FALSE, show.plots = TRUE, verbose = TRUE)

Arguments

`nsites`	number of sites
`nvisits`	number of visits per site
`mean.theta`	proportion of sites that can have non-zero abundance in principle: suitability model for zero-inflation
`mean.lam`	Expected abundance at the average value of all abundance covariates (and ignoring random site effects): abundance model
`mean.p`	Expected detection at the average value of all detection covariates (and ignoring all random effects): detection model
`area`	determines area of sites (A), defaults to A=1 (i.e., all identical), but you can supply a vector of site areas of length nsites instead.
`beta1.theta`	coefficient of site covariate 1 in suitability model
`beta2.theta`	coefficient of site covariate 2 in suitability model
`beta3.theta`	coefficient of site covariate 3 in suitability model
`beta2.lam`	coefficient of site covariate 2 in abundance model
`beta3.lam`	coefficient of site covariate 3 in abundance model
`beta4.lam`	coefficient of site covariate 4 in abundance model
`beta3.p`	coefficient of site covariate 3 in detection model
`beta5.p`	coefficient of site covariate 5 in detection model
`beta6.p`	coefficient of site covariate 6 in detection model
`beta.p.survey`	coefficient of survey ('observational') covariate on p
`beta.p.N`	coefficient of centered local population size (log(N+1)) in detection model (i.e., coef. for density-dependent detection prob.)
`sigma.lam`	"Overdispersion SD" in linear predictor of abundance
`dispersion`	'size' or extra-Poisson dispersion of Negative binomial
`sigma.p.site`	"Overdispersion SD" in linear predictor of detection coming from random site effects
`sigma.p.visit`	"Overdispersion SD" in linear predictor of detection coming from random visit effects
`sigma.p.survey`	"Overdispersion SD" in linear predictor of detection coming from random site-by-survey effects
`sigma.p.ind`	"Overdispersion SD" in linear predictor of detection coming from random site-by-individual effects
`Neg.Bin`	if FALSE, any overdispersion in abundance is modeled by a Poisson log-normal; if TRUE, abundance overdispersion is modeled by adoption of a Negative binomial distribution for latent N
`open.N`	if TRUE, data are simulated under one specific form of an open population, where N in the first occasion is drawn from the specified mixture distribution and for all further occasions j, we have N_ij ~ Poisson(N_i(j-1)). With open.N = TRUE, we must have sigma.p.ind = 0, show.plot = FALSE and nvisits >1.
`show.plots`	if TRUE, plots of the data will be displayed; set to FALSE if you are running simulations.
`verbose`	if TRUE, output will be written to the console.

Details

Data are simulated at the level of each individual and individual-specific detection heterogeneity can be included. As a side-effect, individual-specific detection histories are generated and hence, data are also be simulated under the corresponding multinomial N-mixture model.

Broadly, the function can generate data under this most general model:

'Suitability' (zero-inflation) ~ cov1 + cov2 + cov3

Abundance ~ offset + cov2 + cov3 + cov4 + overdispersion

Detection ~ cov3 + cov5 + cov6 + survey.covariate + log(N+1) + eps.site + eps.visit + eps.survey + eps.individual

Overdispersion in abundance is modeled either as a Poisson-log-normal with a normal random site effect in lambda or with a Negative binomial with mean lambda and a 'size', or dispersion, parameter. Variable site areas can be specified to affect abundance as in an offset.

Abundance can be zero-inflated (this is the 'suitability' model). Note that the zero-inflation parameter is called theta here (in unmarked it is called psi). mean.phi is the probability that a site is suitable (i.e., 1 minus the expected proportion of sites with structural zero abundance.

Site covariate 2 can affect both suitability and abundance, while covariate 3 may affect all three levels. Hence, the function permits to simulate the case where a single site covariate affects different levels in the process (e.g., abundance and detection) in opposing directions (as for instance in Kéry, 2008)

Density-dependent detection can be modeled as a logistic-linear effect of local abundance (centered and log(x+1) transformed). Overdispersion in detection is modeled via normal random effects (the eps terms above) specific to sites, visits, surveys or individuals.

Effects of covariates and random-effects factors are modeled as additive on the link scale (log for abundance and logit for suitability and detection).

Data may be generated under one specific open-population model when argument 'open.N' is set to TRUE.

Value

A list with the arguments input and the following additional elements:

`nobs`	The total number of visits
`site.cov`	An nsites x 6 matrix of values for 6 site covariates
`survey.cov`	An nsites x nvisits matrix of values for a survey covariate
`log.lam`	Linear predictor of PLN abundance model including random effects, a vector of length nsites
`s`	Site suitability indicator, a vector of length nsites
`N`	Number of individuals at each site, a vector of length nsites
`p`	Probability of detection, an array with dimensions sites occupied x visits x max(N)
`DH`	Detection history (1/0), an array with dimensions sites occupied x visits x max(N)
`N.open`	Number of individuals at each site for open model, a sites x visits matrix
`C`	Summary of DH: number of individuals detected for each site and visit
`eta.lam`	Random site effects in lambda, a vector of length nsites, zero if Neg.Bin == TRUE
`eta.p.site`	Random site effects in p, a vector of length nsites
`eta.p.visit`	Random visit effects in p, a vector of length nvisits
`eta.p.survey`	Random survey (= site-by-survey) effects in p, a nsites x nvisits matrix
`eta.p.ind`	Random individual (= site-by-ind) effects in p (NOT site-ind-visit !), a nsites x max(N) matrix
`odcN`	Naive overdispersion measure (var/mean) for true abundance (N)
`odcC`	Naive overdispersion measure (var/mean) for observed counts (C)
`Ntotal`	Total abundance summed over all sites
`summax`	The sum of maximum counts over all sites

Author(s)

Marc Kéry & Andy Royle

References

Royle, J.A. (2004) N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60, 108-115.

Royle, J.A., et al (2007) Hierarchical spatial models of abundance and occurrence from imperfect survey data. Ecological Monographs, 77, 465-481.

Kéry, M. (2008) Estimating abundance from bird counts: binomial mixture models uncover complex covariate relationships, Auk, 125, 336-345

Kéry, M. & Royle, J.A. (2016) Applied Hierarchical Modeling in Ecology AHM1 - 6.5.

Examples

# Generate data with the default arguments and look at the structure:
tmp <- simNmix()
str(tmp)


str(data <- simNmix())                  # Null data-generating model
str(data <- simNmix(mean.theta = 0.60)) # ZIP with 40% structural zeroes
str(data <- simNmix(sigma.lam = 1))     # Poisson-lognormal (PLN) mixture
str(data <- simNmix(Neg.Bin = TRUE))    # Negative-binomial mixture
str(data <- simNmix(mean.theta = 0.6, sigma.lam = 1))  # Zero-inflated PLN
str(data <- simNmix(mean.theta = 0.6, Neg.Bin = TRUE)) # Zero-infl. NegBin
str(data <- simNmix(mean.p = 1))        # Perfect detection (p = 1)
str(data <- simNmix(mean.theta = 0.6, mean.p = 1))     # ZIP with p = 1
str(data <- simNmix(sigma.lam = 1, mean.p = 1))        # PLN with p = 1

AHMbook documentation built on Sept. 12, 2024, 6:37 a.m.