# dDHMM: Dynamic Hidden Markov Model distribution for use in 'nimble'... In nimbleEcology: Distributions for Ecological Models in 'nimble'

## Description

`dDHMM` and `dDHMMo` provide dynamic hidden Markov model distributions for `nimble` models.

## Usage

```
dDHMM(x, init, probObs, probTrans, len, checkRowSums = 1, log = 0)

dDHMMo(x, init, probObs, probTrans, len, checkRowSums = 1, log = 0)

rDHMM(n, init, probObs, probTrans, len, checkRowSums = 1)

rDHMMo(n, init, probObs, probTrans, len, checkRowSums = 1)
```

## Arguments

- `x`: vector of observations, each one a positive integer corresponding to an observation state (one value of which can correspond to "not observed", and another value of which can correspond to "dead" or "removed from system").
- `init`: vector of initial state probabilities. Must sum to 1.
- `probObs`: time-independent matrix (`dDHMM` and `rDHMM`) or time-dependent 3D array (`dDHMMo` and `rDHMMo`) of observation probabilities. The first two dimensions of `probObs` are of size (number of possible system states) x (number of possible observation classes). `dDHMMo` and `rDHMMo` expect an additional third dimension of size (number of observation times). `probObs[i, j (,t)]` is the probability that an individual in the ith latent state is recorded as being in the jth detection state (at time t). See Details for more information.
- `probTrans`: time-dependent array of system state transition probabilities. The dimension of `probTrans` is (number of possible system states) x (number of possible system states) x (number of observation times). `probTrans[i, j, t]` is the probability that an individual truly in state i at time t will be in state j at time t+1. See Details for more information.
- `len`: length of observations (needed for `rDHMM`).
- `checkRowSums`: should the validity of `probObs` and `probTrans` be checked? Both of these are required to have each set of probabilities sum to 1 (over each row, or second dimension). If `checkRowSums` is non-zero (or `TRUE`), these conditions will be checked within a tolerance of 1e-6. If it is 0 (or `FALSE`), they will not be checked. Not checking should result in faster execution, but whether that is appreciable will be case-specific.
- `log`: `TRUE` or 1 to return the log probability. `FALSE` or 0 to return the probability.
- `n`: number of random draws, each returning a vector of length `len`. Currently only `n = 1` is supported, but the argument exists for standardization of "`r`" functions.
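As a rough illustration of what the `checkRowSums` condition amounts to, the base-R sketch below tests whether every row of a probability matrix, or of every time slice of a 3D array, sums to 1 within a tolerance of 1e-6. The helper name `rows_sum_to_one` is hypothetical and is not part of `nimbleEcology`; the package's internal check may differ in detail.

```r
# Hypothetical helper (not part of nimbleEcology): check that each row of a
# matrix, or of each slice along the third dimension of an array, sums to 1.
rows_sum_to_one <- function(p, tol = 1e-6) {
  if (length(dim(p)) == 2) {
    return(all(abs(rowSums(p) - 1) < tol))
  }
  # For a 3D array, sum over the second dimension for each (row, time) pair.
  sums <- apply(p, c(1, 3), sum)
  all(abs(sums - 1) < tol)
}

# Illustrative inputs: 3 system states, 2 observation classes, 3 transitions
probObs <- matrix(c(1,   0,
                    0,   1,
                    0.8, 0.2), nrow = 3, byrow = TRUE)
probTrans <- array(1/3, dim = c(3, 3, 3))

rows_sum_to_one(probObs)
rows_sum_to_one(probTrans)
```

Running a check like this once before building a model can make it reasonable to pass `checkRowSums = 0` afterwards, provided the probabilities are fixed and not updated during sampling.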

## Details

These nimbleFunctions provide distributions that can be used directly in R or in `nimble` hierarchical models (via `nimbleCode` and `nimbleModel`).

The probability (or likelihood) of observation `x[t]` depends on the previous true latent state, the time-dependent probability of transitioning to a new state `probTrans`, and the probability of observation states given the true latent state `probObs`.

The distribution has two forms, `dDHMM` and `dDHMMo`. `dDHMM` takes a time-independent observation probability matrix with dimension S x O, while `dDHMMo` expects a three-dimensional array of time-dependent observation probabilities with dimension S x O x T, where S is the number of true latent states, O is the number of possible observation classes, and T is the number of observation times.
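To make the two shapes concrete, the following base-R sketch expands a time-independent S x O matrix into an S x O x T array of the kind `dDHMMo` expects, by repeating the same matrix at every observation time. The values and the object names `probObs_t` and `T_len` are illustrative assumptions.

```r
S <- 3; O <- 2; T_len <- 4   # illustrative sizes (T_len stands in for T)

# Time-independent observation matrix (S x O), the shape dDHMM expects
probObs <- matrix(c(1,   0,
                    0,   1,
                    0.8, 0.2), nrow = S, byrow = TRUE)

# Time-dependent version (S x O x T), the shape dDHMMo expects:
# array() recycles the matrix, so every time slice is a copy of probObs
probObs_t <- array(probObs, dim = c(S, O, T_len))

dim(probObs_t)
probObs_t[, , 2]   # observation probabilities at the second observation time
```

In practice the slices would differ from one another, for example when detection effort varies between occasions.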

`probTrans` has dimension S x S x (T - 1). `probTrans[i, j, t]` is the probability that an individual in state `i` at time `t` takes on state `j` at time `t+1`. The length of the third dimension may be greater than (T - 1), but all values indexed greater than T - 1 will be ignored.

`init` has length S. `init[i]` is the probability of being in state `i` at the first observation time. That means that the first observations arise from the initial state probabilities.

For more explanation, see package vignette (`vignette("Introduction_to_nimbleEcology")`).

Compared to writing `nimble` models with a discrete true latent state and a separate scalar datum for each observation, use of these distributions allows one to directly sum (marginalize) over the discrete latent state and calculate the probability of all observations from one site jointly.
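The marginalization described above can be sketched in base R with the standard forward algorithm. This is a minimal illustration under the conventions stated in this section (the first observation arises from `init`, and `probTrans[, , t]` moves the state from time `t` to `t+1`); the function name `forward_loglik` is hypothetical and this is not the package's source code.

```r
# Hypothetical base-R sketch of the forward algorithm (not the package source).
# x: observations (length T); init: length-S vector; probObs: S x O matrix;
# probTrans: S x S x (T - 1) array (any extra slices are never read).
forward_loglik <- function(x, init, probObs, probTrans) {
  pi_t <- init   # state probabilities at time t, given x[1:(t-1)]
  logL <- 0
  for (t in seq_along(x)) {
    joint <- pi_t * probObs[, x[t]]   # P(state = i, x[t] | x[1:(t-1)])
    p_obs <- sum(joint)               # P(x[t] | x[1:(t-1)])
    logL <- logL + log(p_obs)
    if (t < length(x)) {
      # Condition on x[t], then propagate through the transition matrix
      pi_t <- as.vector((joint / p_obs) %*% probTrans[, , t])
    }
  }
  logL
}

x <- c(1, 2, 1, 1)
init <- c(0.4, 0.2, 0.4)
probObs <- matrix(c(1, 0, 0, 1, 0.8, 0.2), nrow = 3, byrow = TRUE)
probTrans <- array(1/3, dim = c(3, 3, 3))

forward_loglik(x, init, probObs, probTrans)
```

With these inputs the per-occasion probabilities are 0.72, 0.4, 0.6 and 0.6, so the returned value is the log of their product, 0.10368. The cost grows linearly in the number of occasions, whereas summing over every possible latent path grows exponentially.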

These are `nimbleFunction`s written in the format of user-defined distributions for NIMBLE's extension of the BUGS model language. More information can be found in the NIMBLE User Manual at https://r-nimble.org.

When using these distributions in a `nimble` model, the left-hand side will be used as `x`, and the user should not provide the `log` argument.

For example, in a NIMBLE model,

```observedStates[1:T] ~ dDHMM(initStates[1:S], observationProbs[1:S, 1:O], transitionProbs[1:S, 1:S, 1:(T-1)], len = T, checkRowSums = 1)```

declares that the `observedStates[1:T]` vector follows a dynamic hidden Markov model distribution with parameters as indicated, assuming all the parameters have been declared elsewhere in the model. In this case, `S` is the number of system states, `O` is the number of observation classes, and `T` is the number of observation occasions. This will invoke (something like) the following call to `dDHMM` when `nimble` uses the model, such as for MCMC:

```dDHMM(observedStates[1:T], initStates[1:S], observationProbs[1:S, 1:O], transitionProbs[1:S, 1:S, 1:(T-1)], len = T, checkRowSums = 1, log = TRUE)```

If an algorithm using a `nimble` model with this declaration needs to generate a random draw for `observedStates[1:T]`, it will make a similar invocation of `rDHMM`, with `n = 1`.
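To show what such a random draw involves, here is a minimal base-R sketch that simulates one detection history: draw the first latent state from `init`, record an observation given the state, then move the state with the transition matrix for that time step. The function name `simulate_dhmm` is hypothetical, and this is an illustration of the idea only, not the code behind `rDHMM`.

```r
# Hypothetical base-R sketch of simulating one detection history
# (not the package source). Arguments mirror those of rDHMM, without n.
simulate_dhmm <- function(init, probObs, probTrans, len) {
  S <- length(init)
  O <- ncol(probObs)
  x <- integer(len)
  state <- sample.int(S, 1, prob = init)   # latent state at the first time
  for (t in seq_len(len)) {
    # Observation class given the current latent state
    x[t] <- sample.int(O, 1, prob = probObs[state, ])
    if (t < len) {
      # Latent state at time t + 1 given the state at time t
      state <- sample.int(S, 1, prob = probTrans[state, , t])
    }
  }
  x
}

set.seed(1)
init <- c(0.4, 0.2, 0.4)
probObs <- matrix(c(1, 0, 0, 1, 0.8, 0.2), nrow = 3, byrow = TRUE)
probTrans <- array(1/3, dim = c(3, 3, 3))

simulate_dhmm(init, probObs, probTrans, len = 4)
```

The result is an integer vector of length `len` whose entries are observation classes between 1 and the number of columns of `probObs`.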

If the observation probabilities are time-dependent, one would use:

```observedStates[1:T] ~ dDHMMo(initStates[1:S], observationProbs[1:S, 1:O, 1:T], transitionProbs[1:S, 1:S, 1:(T-1)], len = T, checkRowSums = 1)```

## Value

For `dDHMM` and `dDHMMo`: the probability (or likelihood) or log probability of observation vector `x`. For `rDHMM` and `rDHMMo`: a simulated detection history, `x`.

## Author(s)

Perry de Valpine, Daniel Turek, and Ben Goldstein

## References

D. Turek, P. de Valpine and C. J. Paciorek. 2016. Efficient Markov chain Monte Carlo sampling for hierarchical hidden Markov models. Environmental and Ecological Statistics 23:549–564. DOI 10.1007/s10651-016-0353-z

## Examples

```
# Load the packages that provide nimbleCode, nimbleModel and dDHMM
library(nimble)
library(nimbleEcology)

# Set up constants and initial values for defining the model
dat <- c(1, 2, 1, 1)             # A vector of observations
init <- c(0.4, 0.2, 0.4)         # A vector of initial state probabilities
probObs <- t(array(              # A 3 x 2 matrix of observation probabilities
  c(1, 0, 0, 1, 0.8, 0.2), c(2, 3)))
probTrans <- array(rep(1/3, 27), # A 3 x 3 x 3 array of time-indexed transition probabilities
  c(3, 3, 3))

# Define code for a nimbleModel
nc <- nimbleCode({
  x[1:4] ~ dDHMM(init[1:3], probObs = probObs[1:3, 1:2],
                 probTrans = probTrans[1:3, 1:3, 1:3], len = 4, checkRowSums = 1)

  for (i in 1:3) {
    init[i] ~ dunif(0, 1)

    for (j in 1:3) {
      for (t in 1:3) {
        probTrans[i, j, t] ~ dunif(0, 1)
      }
    }

    probObs[i, 1] ~ dunif(0, 1)
    probObs[i, 2] <- 1 - probObs[i, 1]
  }
})

# Build the model, providing data and initial values
DHMM_model <- nimbleModel(nc,
                          data = list(x = dat),
                          inits = list(init = init,
                                       probObs = probObs,
                                       probTrans = probTrans))

# Calculate log probability of x from the model
DHMM_model$calculate()

# Use the model for a variety of other purposes...
```