View source: R/PPC.residuals.R
| PPC.residuals | R Documentation |
Posterior Predictive Check (PPC) function that calculates Freeman-Tukey (FT) residuals, Pearson's Chi-squared residuals, or deviance from MCMC output of occupancy models. This function compares observed data to simulated data from the posterior distribution to assess model fit.
PPC.residuals(
y,
p,
psi,
model = c("Occupancy", "RN"),
type = c("FT", "PearChi2", "Deviance"),
K = NULL,
z.cond = TRUE,
zhat = NULL,
nmax = 20,
return.residuals = TRUE,
return.z = TRUE
)
y |
Observations as either a site vector or site by occasion matrix. For matrix format, use NA for unsampled occasions. |
p |
Array of posterior samples for detection probability (p). Dimensions should be iterations by sites (by occasion optionally). For RN models, p should represent individual-level detection (not conditional on local abundance). |
psi |
Array of posterior samples for occupancy probability (psi). Dimensions should be iterations by sites. For RN models, psi should represent expected abundance. |
model |
Character indicating model type: either "Occupancy" or "RN" (Royle-Nichols). |
type |
Type of residual to calculate: "FT" (Freeman-Tukey), "PearChi2" (Pearson Chi-squared), or "Deviance" (not technically a residual). |
K |
Number of occasions as either a scalar or site vector. Calculated automatically if y is a matrix. |
z.cond |
Logical. If TRUE, new data are conditioned on estimated z (testing only detection model fit). If FALSE, a new z is generated for each posterior sample (testing the complete model). |
zhat |
Optional matrix with same dimensions as psi containing estimates of z from the same model. If not provided, will be generated internally. |
nmax |
Maximum site-level abundance (default = 20). Only used if model="RN". Higher values increase computation time. A warning is issued if the value is set too low. |
return.residuals |
Logical. If TRUE (default), returns residuals along with Bayesian p-value. |
return.z |
Logical. If TRUE, returns z values conditional on y, and unconditional
z's if |
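As a rough illustration of the input shapes described above, the following sketch builds stand-in objects with the documented dimensions. The sizes and the Beta draws are hypothetical placeholders for real MCMC output, and the final call is left commented out because it assumes the package providing PPC.residuals() is loaded.

```r
set.seed(1)

n.iter  <- 100  # number of posterior samples (hypothetical)
n.sites <- 5    # number of sites (hypothetical)
n.occ   <- 3    # number of occasions (hypothetical)

# Site-by-occasion detection history; NA marks an unsampled occasion
y <- matrix(rbinom(n.sites * n.occ, 1, 0.4), nrow = n.sites, ncol = n.occ)
y[2, 3] <- NA

# Stand-ins for posterior samples, iterations by sites
psi <- matrix(rbeta(n.iter * n.sites, 6, 4), nrow = n.iter, ncol = n.sites)
p   <- matrix(rbeta(n.iter * n.sites, 4, 6), nrow = n.iter, ncol = n.sites)

# Number of sampled occasions per site, counted from the non-missing cells
K <- rowSums(!is.na(y))

# With the package loaded, a call might then look like:
# out <- PPC.residuals(y = y, p = p, psi = psi, model = "Occupancy",
#                      type = "FT", z.cond = TRUE)
```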
This function helps assess model fit for occupancy models using various types of residuals:
Freeman-Tukey (FT):
R_j = (\sqrt{y_j} - \sqrt{E(y_j)})^2
Measures the squared difference between the square root of observed detections and the square root of expected detections at each site.
Pearson Chi-squared:
R_j = \left(\frac{y_j - E(y_j)}{\sqrt{Var(y_j)}}\right)^2
Measures the squared difference between observed and expected detections, standardized by the theoretical variance calculated from the model parameters.
Deviance:
R_j = -2\log[y_j|\theta_j, K_j]
Measures the contribution of each site to the overall model likelihood, quantifying the discrepancy between observed data and model predictions based on likelihood ratios.
Where:
y_j is the number of detections of the species at site j, out of K_j repeated surveys
E(y_j) = K_j p_j z_j, with p_j = species detection probability and z_j = occupancy state (1 if occupied, 0 otherwise)
Var(y_j) = p_j z_j (1 - p_j z_j) K_j
For Royle-Nichols occupancy models, the term p_j z_j is replaced with 1 - (1 - r_j)^{N_j}, where r_j = individual detection probability and N_j = local abundance
For Deviance, \theta_j is either occupancy and species detection probability at site j (\psi_j, p_j) for regular occupancy models, or expected abundance and individual detection probability (\lambda_j, r_j) for Royle-Nichols occupancy models
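The formulas above can be worked through by hand for a single posterior draw. The sketch below uses made-up values for four sites. Two choices are assumptions rather than documented behaviour of PPC.residuals(): sites with zero variance (z_j = 0 and y_j = 0) are given a Pearson residual of 0, and the occupancy deviance uses the usual marginal likelihood over z_j.

```r
# Hypothetical values for four sites and one posterior draw
y   <- c(0, 2, 1, 0)          # detections per site
K   <- c(4, 4, 3, 4)          # surveys per site
p   <- c(0.3, 0.5, 0.4, 0.3)  # species detection probability
z   <- c(0, 1, 1, 1)          # occupancy state
psi <- c(0.6, 0.6, 0.7, 0.5)  # occupancy probability

# Expected value and variance of y_j as defined above
Ey   <- K * p * z
Vary <- p * z * (1 - p * z) * K

# Freeman-Tukey residual
ft <- (sqrt(y) - sqrt(Ey))^2

# Pearson Chi-squared residual
# Assumption: zero-variance sites (z = 0, y = 0) are set to 0 instead of 0/0
chi2 <- ifelse(Vary > 0, (y - Ey)^2 / Vary, 0)

# Deviance contribution for the regular occupancy model
# Assumption: likelihood of y_j given (psi_j, p_j, K_j), marginal over z_j
lik <- psi * dbinom(y, K, p) + (1 - psi) * (y == 0)
dev <- -2 * log(lik)

# Royle-Nichols: p_j z_j is replaced by 1 - (1 - r_j)^N_j
r <- 0.2  # individual detection probability (hypothetical)
N <- 3    # local abundance (hypothetical)
p.eff <- 1 - (1 - r)^N
```

In this example the second site has y_j equal to E(y_j), so both its FT and Pearson residuals are 0, while the fourth site (occupied but never detected) contributes the largest residuals.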
The function calculates Bayesian p-values as a measure of model fit. These values:
Range from 0 to 1
Suggest good model fit when close to 0.5
Suggest poor fit when close to 0 or 1
Are calculated by comparing observed residuals to residuals from simulated data
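A common way to turn such residuals into a Bayesian p-value is to sum them over sites within each posterior sample and report the proportion of samples in which the simulated discrepancy exceeds the observed one. The sketch below follows that convention with random stand-in matrices. The iterations-by-sites layout and the direction of the comparison are assumptions, not confirmed details of PPC.residuals().

```r
set.seed(42)

n.iter  <- 1000  # posterior samples (hypothetical)
n.sites <- 30    # sites (hypothetical)

# Stand-ins for residual matrices, assumed to be iterations by sites
res.obs <- matrix(rchisq(n.iter * n.sites, df = 1), n.iter, n.sites)
res.new <- matrix(rchisq(n.iter * n.sites, df = 1), n.iter, n.sites)

# Discrepancy statistic per posterior sample: sum of site-level residuals
T.obs <- rowSums(res.obs)
T.new <- rowSums(res.new)

# Proportion of samples where simulated data fit worse than observed data
BP <- mean(T.new > T.obs)
```

Because both stand-in matrices here come from the same distribution, BP should land near 0.5, which is the pattern expected under good fit.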
The z.cond parameter allows for two types of model assessment:
z.cond = TRUE: Tests only the detection component of the model, fixing occupancy/abundance to estimates from the model rather than generating them anew.
z.cond = FALSE: Tests the complete model, including both occupancy and detection components.
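The difference between the two settings can be sketched as two ways of simulating replicate data for one posterior draw of a regular occupancy model. All values below are hypothetical, and the code is only an illustration of the idea, not the function's internal implementation.

```r
set.seed(7)

K    <- c(4, 4, 3, 4)          # surveys per site
p    <- c(0.3, 0.5, 0.4, 0.3)  # detection probability for this draw
psi  <- c(0.6, 0.6, 0.7, 0.5)  # occupancy probability for this draw
zhat <- c(0, 1, 1, 1)          # estimated occupancy state for this draw

# z.cond = TRUE: keep the estimated z and simulate detections only
y.new.cond <- rbinom(length(K), K, p * zhat)

# z.cond = FALSE: draw a fresh z from psi, then simulate detections
z.new      <- rbinom(length(K), 1, psi)
y.new.full <- rbinom(length(K), K, p * z.new)
```

Under the conditional scheme, a site with zhat = 0 can never produce detections, so any misfit that shows up is attributable to the detection component alone.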
If return.residuals=TRUE (default), returns a list containing:
res.obs - residuals for observed data
res.new - residuals for newly generated data
BP - Bayesian p-value
If return.residuals=FALSE, returns only the Bayesian p-value.
This is a beta version of the function. While it has been tested extensively, not all possible data configurations may have been captured in testing. This is particularly true for:
Deviance calculations (type = "Deviance")
Royle-Nichols models (model = "RN")
If you encounter issues with the function, please contact the package developers.
FT and Chi-squared residuals have been extensively tested. Deviance calculations have undergone less testing and are only available for scenarios with constant detection probability across occasions. FT and Chi-squared residuals can handle varying detection probabilities.
Rahel Sollmann
Sollmann, Rahel. Occupancy models and the "good fit, bad prediction" dilemma. Ecology (submitted)