# dHMM: Hidden Markov Model distribution for use in 'nimble' models In nimbleEcology: Distributions for Ecological Models in 'nimble'

## Description

`dHMM` and `dHMMo` provide hidden Markov model distributions that can be used directly from R or in `nimble` models.

## Usage

 ```1 2 3 4 5 6 7``` ```dHMM(x, init, probObs, probTrans, len = 0, checkRowSums = 1, log = 0) dHMMo(x, init, probObs, probTrans, len = 0, checkRowSums = 1, log = 0) rHMM(n, init, probObs, probTrans, len = 0, checkRowSums = 1) rHMMo(n, init, probObs, probTrans, len = 0, checkRowSums = 1) ```

## Arguments

 `x` vector of observations, each one a positive integer corresponding to an observation state (one value of which could can correspond to "not observed", and another value of which can correspond to "dead" or "removed from system"). `init` vector of initial state probabilities. Must sum to 1 `probObs` time-independent matrix (`dHMM` and `rHMM`) or time-dependent array (`dHMMo` and `rHMMo`) of observation probabilities. First two dimensions of `probObs` are of size x (number of possible system states) x (number of possible observation classes). `dDHMMo` and `rDHMMo` expects an additional third dimension of size (number of observation times). probObs[i, j (,t)] is the probability that an individual in the ith latent state is recorded as being in the jth detection state (at time t). See Details for more information. `probTrans` time-independent matrix of state transition probabilities. probTrans[i,j] is the probability that an individual in latent state i transitions to latent state j at the next timestep. See Details for more information. `len` length of `x` (see below). `checkRowSums` should validity of `probObs` and `probTrans` be checked? Both of these are required to have each set of probabilities sum to 1 (over each row, or second dimension). If `checkRowSums` is non-zero (or `TRUE`), these conditions will be checked within a tolerance of 1e-6. If it is 0 (or `FALSE`), they will not be checked. Not checking should result in faster execution, but whether that is appreciable will be case-specific. `log` TRUE or 1 to return log probability. FALSE or 0 to return probability. `n` number of random draws, each returning a vector of length `len`. Currently only `n = 1` is supported, but the argument exists for standardization of "`r`" functions.

## Details

These nimbleFunctions provide distributions that can be used directly in R or in `nimble` hierarchical models (via `nimbleCode` and `nimbleModel`).

The distribution has two forms, `dHMM` and `dHMMo`. Define S as the number of latent state categories (maximum possible value for elements of `x`), O as the number of possible observation state categories, and T as the number of observation times (length of `x`). In `dHMM`, `probObs` is a time-independent observation probability matrix with dimension S x O. In `dHMMo`, `probObs` is a three-dimensional array of time-dependent observation probabilities with dimension S x O x T. The first index of `probObs` indexes the true latent state. The second index of `probObs` indexes the observed state. For example, in the time-dependent case, `probObs[i, j, t]` is the probability at time `t` that an individual in state `i` is observed in state `j`.

`probTrans` has dimension S x S. `probTrans`[i, j] is the time-independent probability that an individual in state `i` at time `t` transitions to state `j` time `t+1`.

`init` has length S. `init[i]` is the probability of being in state `i` at the first observation time. That means that the first observations arise from the initial state probabilities.

For more explanation, see package vignette (`vignette("Introduction_to_nimbleEcology")`).

Compared to writing `nimble` models with a discrete latent state and a separate scalar datum for each observation time, use of these distributions allows one to directly sum (marginalize) over the discrete latent state and calculate the probability of all observations for one individual (or other HMM unit) jointly.

These are `nimbleFunction`s written in the format of user-defined distributions for NIMBLE's extension of the BUGS model language. More information can be found in the NIMBLE User Manual at https://r-nimble.org.

When using these distributions in a `nimble` model, the left-hand side will be used as `x`, and the user should not provide the `log` argument.

For example, in `nimble` model code,

```observedStates[i, 1:T] ~ dHMM(initStates[1:S], observationProbs[1:S, 1:O], transitionProbs[1:S, 1:S], 1, T)```

declares that the `observedStates[i, 1:T]` (observation history for individual `i`, for example) vector follows a hidden Markov model distribution with parameters as indicated, assuming all the parameters have been declared elsewhere in the model. As above, `S` is the number of system state categories, `O` is the number of observation state categories, and `T` is the number of observation occasions. This will invoke (something like) the following call to `dHMM` when `nimble` uses the model such as for MCMC:

```dHMM(observedStates[1:T], initStates[1:S], observationProbs[1:S, 1:O], transitionProbs[1:S, 1:S], 1, T, log = TRUE)```

If an algorithm using a `nimble` model with this declaration needs to generate a random draw for `observedStates[1:T]`, it will make a similar invocation of `rHMM`, with `n = 1`.

If the observation probabilities are time-dependent, one would use:

```observedStates[1:T] ~ dHMMo(initStates[1:S], observationProbs[1:S, 1:O, 1:T], transitionProbs[1:S, 1:S], 1, T)```

## Value

For `dHMM` and `dHMMo`: the probability (or likelihood) or log probability of observation vector `x`.

For `rHMM` and `rHMMo`: a simulated detection history, `x`.

## Author(s)

Ben Goldstein, Perry de Valpine, and Daniel Turek

## References

D. Turek, P. de Valpine and C. J. Paciorek. 2016. Efficient Markov chain Monte Carlo sampling for hierarchical hidden Markov models. Environmental and Ecological Statistics 23:549–564. DOI 10.1007/s10651-016-0353-z

For dynamic hidden Markov models with time-dependent transitions, see `dDHMM` and `dDHMMo`. For simple capture-recapture, see `dCJS`.
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37``` ```# Set up constants and initial values for defining the model len <- 5 # length of dataset dat <- c(1,2,1,1,2) # A vector of observations init <- c(0.4, 0.2, 0.4) # A vector of initial state probabilities probObs <- t(array( # A matrix of observation probabilities c(1, 0, 0, 1, 0.2, 0.8), c(2, 3))) probTrans <- t(array( # A matrix of transition probabilities c(0.6, 0.3, 0.1, 0, 0.7, 0.3, 0, 0, 1), c(3,3))) # Define code for a nimbleModel nc <- nimbleCode({ x[1:5] ~ dHMM(init[1:3], probObs = probObs[1:3,1:2], probTrans = probTrans[1:3, 1:3], len = 5, checkRowSums = 1) for (i in 1:3) { for (j in 1:3) { probTrans[i,j] ~ dunif(0,1) } probObs[i, 1] ~ dunif(0,1) probObs[i, 2] <- 1 - probObs[i, 1] } }) # Build the model HMM_model <- nimbleModel(nc, data = list(x = dat), inits = list(init = init, probObs = probObs, probTrans = probTrans)) # Calculate log probability of data from the model HMM_model\$calculate() # Use the model for a variety of other purposes... ```