likelihoods.fit: Computing Likelihoods for Occupancy Detection Models


View source: R/likelihood.R

Description

Computing Likelihoods for Occupancy Detection Models

Usage

lppd.newdata(
  fit,
  Xocc,
  yXobs,
  ModelSite,
  chains = 1,
  numlvsims = 1000,
  cl = NULL
)

likelihoods.fit(
  fit,
  Xocc = NULL,
  yXobs = NULL,
  ModelSite = NULL,
  chains = NULL,
  numlvsims = 1000,
  cl = NULL
)

likelihood_joint_marginal.ModelSite(
  Xocc,
  Xobs,
  y,
  u.b_arr,
  v.b_arr,
  lv.coef_arr,
  lvsim
)

likelihood_joint_marginal.ModelSite.theta(
  Xocc,
  Xobs,
  y,
  u.b,
  v.b,
  lv.coef,
  lvsim
)

Arguments

Xocc

A matrix of occupancy covariates. Must have a single row. Columns correspond to covariates.

chains

A vector indicating which MCMC chains to extract draws from. If NULL, all chains are used.

numlvsims

The number of simulated latent variable values to use for computing likelihoods.

cl

A cluster created by parallel::makeCluster().

Xobs

A matrix of detection covariates, each row is a visit.

y

A matrix of detection data for a given model site. 1 corresponds to detected. Each row is visit, each column is a species.

u.b_arr

Occupancy covariate loadings. Each row is a species, each column an occupancy covariate, each layer (dim = 3) is a draw

v.b_arr

Detection covariate loadings. Each row is a species, each column a detection covariate, each layer (dim = 3) is a draw

lv.coef_arr

LV loadings. Each row is a species, each column a LV, each layer (dim = 3) is a draw

lvsim

A matrix of simulated LV values. Columns correspond to latent variables, each row is a simulation

u.b

A vector of model parameters, labelled according to the BUGS labelling convention seen in runjags

v.b

Covariate loadings. Each row is a species, each column a detection covariate

lv.coef

Loadings for the latent variables. Each row is a species, each column corresponds to a LV.

data_i

A row of a data frame created by prep_data_by_modelsite. Each row contains data for a single ModelSite.
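The array and matrix layouts described above can be sketched with made-up values. The sizes and random numbers below are hypothetical and serve only to show which dimension holds species, covariates, latent variables, draws, and simulations; they are not output of the package.

```r
# Hypothetical sizes, chosen only for illustration
nspecies <- 3; noccvar <- 2; ndetvar <- 2; nlv <- 2; ndraws <- 4; numlvsims <- 1000
set.seed(1)

# Rows are species, columns are covariates (or LVs), layers (dim = 3) are draws
u.b_arr <- array(rnorm(nspecies * noccvar * ndraws),
                 dim = c(nspecies, noccvar, ndraws))
v.b_arr <- array(rnorm(nspecies * ndetvar * ndraws),
                 dim = c(nspecies, ndetvar, ndraws))
lv.coef_arr <- array(runif(nspecies * nlv * ndraws, min = -0.5, max = 0.5),
                     dim = c(nspecies, nlv, ndraws))

# Simulated LV values: one row per simulation, one column per latent variable
lvsim <- matrix(rnorm(numlvsims * nlv), nrow = numlvsims, ncol = nlv)

# Taking one layer gives the per-draw matrices (species x covariates or LVs)
v.b <- v.b_arr[, , 1]
lv.coef <- lv.coef_arr[, , 1]
```

Taking a single layer of an `_arr` argument gives a species-by-column matrix of the kind described for v.b and lv.coef.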

Details

Any predictive accuracy measure requires a choice of

  1. the part of the model that is considered the 'likelihood', and

  2. the factorisation of the likelihood into 'data points' (Vehtari et al. 2017).

On 1: New data will look like a new location, or a visit in a new season, in our existing region, with only the species included in the model observed. This means we have zero knowledge of the latent variable value at the new ModelSite. The likelihood is therefore:

  * conditional on the covariate loadings u.b and v.b (not using the fitted values of mu.u.b, tau.u.b etc.),

  * conditional on the lv.coef values of each species, and

  * conditional on the latent variable value for each new ModelSite being drawn from a standard Gaussian distribution.

On 2: Factoring the likelihood using the inbuilt independence properties of the model means a single 'data point' is all the data for all visits of a single ModelSite. The likelihood could also be partitioned by each visit, but then data points are dependent (they have the same occupancy value).
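The two choices above can be illustrated with a minimal sketch of the likelihood of one ModelSite for a single parameter draw, averaged over latent variable values simulated from a standard Gaussian. The function name site_likelihood_sketch is hypothetical, and the link functions (probit occupancy with residual standard deviation sqrt(1 - rowSums(lv.coef^2)), logistic detection) are assumptions made for illustration; the package's own parameterisation may differ.

```r
# Sketch: likelihood of all visits to ONE ModelSite for ONE parameter draw.
# ASSUMPTIONS (not taken from the package source): probit occupancy with
# residual sd sqrt(1 - rowSums(lv.coef^2)), and logistic detection.
site_likelihood_sketch <- function(Xocc, Xobs, y, u.b, v.b, lv.coef, lvsim) {
  # Detection probability given occupancy: visits x species
  pdet <- plogis(Xobs %*% t(v.b))
  # Probability of each species' detection history, given the site is occupied
  p_y_occ <- apply(pdet^y * (1 - pdet)^(1 - y), 2, prod)
  # A species that was never detected may also be absent from the site
  nodetections <- as.numeric(colSums(y) == 0)

  # Occupancy linear predictor for each simulated LV value: simulations x species
  eta_fixed <- as.vector(Xocc %*% t(u.b))
  eta <- sweep(lvsim %*% t(lv.coef), 2, eta_fixed, "+")
  sd_resid <- sqrt(1 - rowSums(lv.coef^2))
  pocc <- pnorm(sweep(eta, 2, sd_resid, "/"))

  # Per-species likelihood for each simulated LV value
  lik_species <- sweep(pocc, 2, p_y_occ, "*") +
    sweep(1 - pocc, 2, nodetections, "*")
  # Species are independent given the LV: multiply across species,
  # then average over the simulated LV values (Monte Carlo integration)
  mean(apply(lik_species, 1, prod))
}

# Toy inputs (made up): 1 ModelSite, 2 visits, 3 species, 2 LVs
set.seed(42)
Xocc <- matrix(c(1, 0.3), nrow = 1)              # intercept and one covariate
Xobs <- cbind(1, c(-0.5, 0.5))                   # one row per visit
y <- matrix(c(1, 0, 0, 0, 1, 1), nrow = 2)       # visits x species
u.b <- matrix(c(0.2, -0.4, 0.1, 0.5, 0.0, -0.3), nrow = 3)
v.b <- matrix(c(0.0, 0.3, -0.2, 0.4, 0.1, 0.0), nrow = 3)
lv.coef <- matrix(c(0.3, -0.2, 0.1, 0.2, 0.4, -0.3), nrow = 3)
lvsim <- matrix(rnorm(1000 * 2), ncol = 2)

lik <- site_likelihood_sketch(Xocc, Xobs, y, u.b, v.b, lv.coef, lvsim)
```

The result is a single number for the whole ModelSite, which is the 'data point' in the factorisation described above: the visits are not scored separately because they share the same occupancy state.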

The output of likelihoods.fit() can be passed to loo::waic() and loo::loo() after taking logs, as in the Examples.
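To show why logs are taken and what is then computed, here is a base R sketch of WAIC following the definitions in Vehtari et al. 2017. The matrix lik is a random toy stand-in for the output of likelihoods.fit(), not real model output, and in practice loo::waic() should be preferred because it also reports standard errors and diagnostics.

```r
set.seed(7)
# Toy stand-in for a likelihood matrix: 200 draws (rows) x 5 ModelSites (columns)
lik <- matrix(runif(200 * 5, min = 0.01, max = 0.5), nrow = 200)
loglik <- log(lik)

# Per-ModelSite log predictive density: log of the mean likelihood over draws
lppd_i <- log(colMeans(lik))
# Per-ModelSite effective number of parameters: variance of the log likelihood
p_waic_i <- apply(loglik, 2, var)

# Expected log pointwise predictive density and WAIC on the deviance scale
elpd_waic <- sum(lppd_i - p_waic_i)
waic <- -2 * elpd_waic
```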

Value

lppd.newdata returns a list with components

  * lpds: a list of the log likelihood of the observations for each ModelSite in the supplied data

  * lppd: the computed log pointwise predictive density (the sum of the lpds). This is equation (5) in Gelman et al. 2014.

likelihoods.fit returns a matrix. Each row corresponds to a draw of the parameters from the posterior, and each column to a ModelSite. Each entry is the likelihood of that ModelSite's observations given that draw of the parameters.
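As a small worked example of how the two return values relate, the computed log pointwise predictive density can be obtained from a draws-by-ModelSite likelihood matrix by averaging each column over draws, taking logs, and summing. The numbers below are made up for illustration.

```r
# Toy likelihood matrix: 4 posterior draws (rows) x 3 ModelSites (columns)
lik <- matrix(c(0.10, 0.20, 0.05,
                0.12, 0.18, 0.07,
                0.08, 0.22, 0.06,
                0.10, 0.20, 0.06),
              nrow = 4, byrow = TRUE)

# Log predictive density of each ModelSite: log of the mean likelihood over draws
lpds <- log(colMeans(lik))

# Computed log pointwise predictive density: sum over ModelSites
lppd <- sum(lpds)
```

Here the column means are 0.10, 0.20 and 0.06, so lppd equals log(0.10) + log(0.20) + log(0.06).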

Functions

References

A. Vehtari, A. Gelman, and J. Gabry, "Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC," Stat Comput, vol. 27, pp. 1413-1432, Sep. 2017, doi: 10.1007/s11222-016-9696-4.

Examples

# simulate data
covars <- simulate_covar_data(nsites = 50, nvisitspersite = 2)
y <- simulate_iid_detections(3, nrow(covars$Xocc))

fittedmodel <- run.detectionoccupancy(
  Xocc = covars$Xocc,
  yXobs = cbind(covars$Xobs, y),
  species = colnames(y),
  ModelSite = "ModelSite",
  OccFmla = "~ UpSite + Sine1",
  ObsFmla = "~ UpVisit + Step",
  nlv = 2,
  MCMCparams = list(n.chains = 1, adapt = 0, burnin = 0, sample = 3, thin = 1)
)

# run likelihood computations, waic, and psis-loo
# likelihoods.fit() returns likelihoods, so take logs before passing to loo
insamplell <- likelihoods.fit(fittedmodel)
waic <- loo::waic(log(insamplell))
looest <- loo::loo(log(insamplell), cores = 2)

# log pointwise predictive density for new (out-of-sample) data
outofsample_covars <- simulate_covar_data(nsites = 10, nvisitspersite = 2)
outofsample_y <- simulate_iid_detections(3, nrow(outofsample_covars$Xocc))
outofsample_lppd <- lppd.newdata(fittedmodel,
                                 Xocc = outofsample_covars$Xocc,
                                 yXobs = cbind(outofsample_covars$Xobs, outofsample_y),
                                 ModelSite = "ModelSite")

# Using multiple cores is recommended:
cl <- parallel::makeCluster(2)
insamplell <- likelihoods.fit(fittedmodel, cl = cl)

outofsample_lppd <- lppd.newdata(fittedmodel,
                                 Xocc = outofsample_covars$Xocc,
                                 yXobs = cbind(outofsample_covars$Xobs, outofsample_y),
                                 ModelSite = "ModelSite",
                                 cl = cl)
parallel::stopCluster(cl)

sustainablefarms/linking-data documentation built on Oct. 28, 2020, 2:41 a.m.