| stIntPGOcc | R Documentation | 
Function for fitting single-species multi-season spatial integrated occupancy models using Polya-Gamma latent variables. Data integration is done using a joint likelihood framework, assuming distinct detection models for each data source that are each conditional on a single latent occurrence process. Models are fit using Nearest Neighbor Gaussian Processes.
stIntPGOcc(occ.formula, det.formula, data, inits, priors, tuning, 
           cov.model = 'exponential', NNGP = TRUE, n.neighbors = 15, 
           search.type = 'cb', n.batch, batch.length, accept.rate = 0.43, 
           n.omp.threads = 1, verbose = TRUE, ar1 = FALSE, n.report = 100, 
           n.burn = round(.10 * n.batch * batch.length), n.thin = 1, n.chains = 1, ...)
| occ.formula | a symbolic description of the model to be fit for the occurrence portion of the model using R's model syntax. Only right-hand side of formula is specified. See example below. Random intercepts are allowed using lme4 syntax (Bates et al. 2015). | 
| det.formula | a list of symbolic descriptions of the models to be fit for the detection portion of the model using R's model syntax for each data set. Each element in the list is a formula for the detection model of a given data set. Only right-hand side of formula is specified. See example below. Random intercepts are allowed using lme4 syntax (Bates et al. 2015). | 
| data | a list containing data necessary for model fitting.
Valid tags are  | 
| inits | a list with each tag corresponding to a parameter name.
Valid tags are  | 
| priors | a list with each tag corresponding to a parameter name. 
Valid tags are  | 
| tuning | a list with each tag corresponding to a parameter
name. Valid tags are  | 
| cov.model | a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations.  Supported covariance model key words are:
 | 
| NNGP | if  | 
| n.neighbors | number of neighbors used in the NNGP. Only used if 
 | 
| search.type | a quoted keyword that specifies the type of nearest
neighbor search algorithm. Supported method key words are:  | 
| n.batch | the number of MCMC batches in each chain to run for the Adaptive MCMC sampler. See Roberts and Rosenthal (2009) for details. | 
| batch.length | the length of each MCMC batch in each chain to run for the Adaptive MCMC sampler. See Roberts and Rosenthal (2009) for details. | 
| accept.rate | target acceptance rate for Adaptive MCMC. Default is 0.43. See Roberts and Rosenthal (2009) for details. | 
| n.omp.threads | a positive integer indicating the number of threads
to use for SMP parallel processing within chains. The package must be compiled for 
OpenMP support. For most Intel-based machines, we recommend setting 
 | 
| verbose | if  | 
| ar1 | logical value indicating whether to include an AR(1) zero-mean 
temporal random effect in the model. If  | 
| n.report | the interval to report MCMC progress. Note this is specified in terms of batches, not MCMC samples. | 
| n.burn | the number of samples out of the total  | 
| n.thin | the thinning interval for collection of MCMC samples. The
thinning occurs after the  | 
| n.chains | the number of chains to run in sequence. | 
| ... | currently no additional arguments | 
An object of class stIntPGOcc that is a list comprised of: 
| beta.samples | a  | 
| alpha.samples | a  | 
| z.samples | a three-dimensional array of posterior samples for the latent occupancy values, with dimensions corresponding to posterior sample, site, and primary time period. Note this object will contain predicted occupancy values for sites/primary time periods that were not sampled. | 
| psi.samples | a three-dimensional array of posterior samples for the latent occupancy probability values, with dimensions corresponding to posterior sample, site, and primary time period. Note this object will contain predicted occupancy probabilities for sites/primary time periods that were not sampled. | 
| sigma.sq.psi.samples | a  | 
| sigma.sq.p.samples | a  | 
| beta.star.samples | a  | 
| alpha.star.samples | a  | 
| theta.samples | a  | 
| w.samples | a  | 
| eta.samples | a  | 
| p.samples | a list of four-dimensional arrays consisting of the posterior samples of detection probability for each data source. For each data source, the dimensions of the four-dimensional array correspond to MCMC sample, site, season, and replicate within season. | 
| like.samples | a two-dimensional array of posterior samples for the likelihood values associated with each site and primary time period, for each individual data source. Used for calculating WAIC. | 
| rhat | a list of Gelman-Rubin diagnostic values for some of the model parameters. | 
| ESS | a list of effective sample sizes for some of the model parameters. | 
| run.time | execution time reported using  | 
The return object will include additional objects used for subsequent prediction and/or model fit evaluation.
Some of the underlying code used for generating random numbers from the Polya-Gamma distribution is taken from the pgdraw package written by Daniel F. Schmidt and Enes Makalic. Their code implements Algorithm 6 in PhD thesis of Jesse Bennett Windle (2013) https://repositories.lib.utexas.edu/handle/2152/21842.
Jeffrey W. Doser doserjef@msu.edu
Polson, N.G., J.G. Scott, and J. Windle. (2013) Bayesian Inference for Logistic Models Using Polya-Gamma Latent Variables. Journal of the American Statistical Association, 108:1339-1349.
Bates, Douglas, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v067.i01")}.
set.seed(332)
# Simulate Data -----------------------------------------------------------
# Number of locations in each direction. This is the total region of interest
# where some sites may or may not have a data source.
J.x <- 15 
J.y <- 15
J.all <- J.x * J.y
# Number of data sources.
n.data <- 3
# Sites for each data source.
J.obs <- sample(ceiling(0.2 * J.all):ceiling(0.4 * J.all), n.data, replace = TRUE)
# Maximum number of years for each data set
n.time.max <- c(4, 8, 10)
# Number of years each site in each data set is sampled
n.time <- list()
for (i in 1:n.data) {
  n.time[[i]] <- sample(1:n.time.max[i], J.obs[i], replace = TRUE)
}
# Replicates for each data source.
n.rep <- list()
for (i in 1:n.data) {
  n.rep[[i]] <- matrix(NA, J.obs[i], n.time.max[i])
  for (j in 1:J.obs[i]) {
    n.rep[[i]][j, sample(1:n.time.max[i], n.time[[i]][j], replace = FALSE)] <- 
      sample(1:4, n.time[[i]][j], replace = TRUE)
  }
}
# Total number of years across all data sets
n.time.total <- 10
# List denoting the specific years each data set was sampled during. 
data.seasons <- list()
for (i in 1:n.data) {
  data.seasons[[i]] <- sort(sample(1:n.time.total, n.time.max[i], replace = FALSE))
}
# Occupancy covariates
beta <- c(0, 0.4, 0.3)
trend <- TRUE
# Random occupancy covariates
psi.RE <- list(levels = c(20),
               sigma.sq.psi = c(0.6))
p.occ <- length(beta)
# Detection covariates
alpha <- list()
alpha[[1]] <- c(0, 0.2, -0.5)
alpha[[2]] <- c(-1, 0.5, 0.3, -0.8)
alpha[[3]] <- c(-0.5, 1)
p.RE <- list()
p.det.long <- sapply(alpha, length)
p.det <- sum(p.det.long)
# Spatial parameters
sigma.sq <- 0.9
phi <- 3 / .5
# Simulate occupancy data.
dat <- simTIntOcc(n.data = n.data, J.x = J.x, J.y = J.y, J.obs = J.obs,
                  n.time = n.time, data.seasons = data.seasons, n.rep = n.rep,
                  beta = beta, alpha = alpha, trend = trend, 
                  psi.RE = psi.RE, p.RE = p.RE, sp = TRUE, 
                  sigma.sq = sigma.sq, phi = phi, cov.model = 'exponential')
y <- dat$y
X <- dat$X.obs
X.re <- dat$X.re.obs
X.p <- dat$X.p
sites <- dat$sites
coords <- dat$coords.obs
# Package all data into a list
occ.covs <- list(trend = X[, , 2], 
                 occ.cov.1 = X[, , 3], 
                 occ.factor.1 = X.re[, , 1])
det.covs <- list()
# Add covariates one by one
det.covs[[1]] <- list(det.cov.1.1 = X.p[[1]][, , , 2],
                      det.cov.1.2 = X.p[[1]][, , , 3])
det.covs[[2]] <- list(det.cov.2.1 = X.p[[2]][, , , 2],
                      det.cov.2.2 = X.p[[2]][, , , 3],
                      det.cov.2.3 = X.p[[2]][, , , 4])
det.covs[[3]] <- list(det.cov.3.1 = X.p[[3]][, , , 2])
data.list <- list(y = y, occ.covs = occ.covs, det.covs = det.covs,
                  sites = sites, seasons = data.seasons, coords = coords)
# Testing
occ.formula <- ~ trend + occ.cov.1 + (1 | occ.factor.1)
# Note that the names are not necessary.
det.formula <- list(f.1 = ~ det.cov.1.1 + det.cov.1.2,
                    f.2 = ~ det.cov.2.1 + det.cov.2.2 + det.cov.2.3,
                    f.3 = ~ det.cov.3.1)
# NOTE: this is a short run of the model, in reality we would run the 
#       model for much longer.
out <- stIntPGOcc(occ.formula = occ.formula,
                 det.formula = det.formula,
                 data = data.list,
                 NNGP = TRUE, 
                 n.neighbors = 15, 
                 cov.model = 'exponential',
                 n.batch = 3,
                 batch.length = 25, 
                 n.report = 1,
                 n.burn = 25,
                 n.thin = 1,
                 n.chains = 1)
summary(out)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.