#' Hidden Markov Model distribution for use in \code{nimble} models
#'
#' \code{dHMM} and \code{dHMMo} provide hidden Markov model distributions that
#' can be used directly from R or in \code{nimble} models.
#'
#' @name dHMM
#'
#' @aliases dHMM dHMMo rHMM rHMMo
#'
#' @author Ben Goldstein, Perry de Valpine, and Daniel Turek
#'
#' @param x vector of observations, each one a positive integer corresponding to
#' an observation state (one value of which could can correspond to "not
#' observed", and another value of which can correspond to "dead" or "removed
#' from system").
#' @param init vector of initial state probabilities. Must sum to 1
#' @param probObs time-independent matrix (\code{dHMM} and \code{rHMM}) or
#' time-dependent array (\code{dHMMo} and \code{rHMMo}) of observation
#' probabilities. First two dimensions of \code{probObs} are of size x (number
#' of possible system states) x (number of possible observation classes).
#' \code{dDHMMo} and \code{rDHMMo} expects an additional third dimension of
#' size (number of observation times). probObs[i, j (,t)] is the probability
#' that an individual in the ith latent state is recorded as being in the jth
#' detection state (at time t). See Details for more information.
#' @param probTrans time-independent matrix of state transition probabilities.
#' probTrans[i,j] is the probability that an individual in latent state i
#' transitions to latent state j at the next timestep. See Details for more
#' information.
#' @param len length of \code{x} (see below).
#' @param checkRowSums should validity of \code{probObs} and \code{probTrans} be
#' checked? Both of these are required to have each set of probabilities sum
#' to 1 (over each row, or second dimension). If \code{checkRowSums} is
#' non-zero (or \code{TRUE}), these conditions will be checked within a
#' tolerance of 1e-6. If it is 0 (or \code{FALSE}), they will not be checked.
#' Not checking should result in faster execution, but whether that is
#' appreciable will be case-specific.
#' @param log TRUE or 1 to return log probability. FALSE or 0 to return
#' probability.
#' @param n number of random draws, each returning a vector of length
#' \code{len}. Currently only \code{n = 1} is supported, but the argument
#' exists for standardization of "\code{r}" functions.
#'
#' @details These nimbleFunctions provide distributions that can be used
#' directly in R or in \code{nimble} hierarchical models (via
#' \code{\link[nimble]{nimbleCode}} and \code{\link[nimble]{nimbleModel}}).
#'
#' The distribution has two forms, \code{dHMM} and \code{dHMMo}. Define S as
#' the number of latent state categories (maximum possible value for elements
#' of \code{x}), O as the number of possible observation state categories, and
#' T as the number of observation times (length of \code{x}). In \code{dHMM},
#' \code{probObs} is a time-independent observation probability matrix with
#' dimension S x O. In \code{dHMMo}, \code{probObs} is a three-dimensional
#' array of time-dependent observation probabilities with dimension S x O x T.
#' The first index of \code{probObs} indexes the true latent state. The
#' second index of \code{probObs} indexes the observed state. For example, in
#' the time-dependent case, \code{probObs[i, j, t]} is the probability at time
#' \code{t} that an individual in state \code{i} is observed in state
#' \code{j}.
#'
#' \code{probTrans} has dimension S x S. \code{probTrans}[i, j] is the
#' time-independent probability that an individual in state \code{i} at time
#' \code{t} transitions to state \code{j} time \code{t+1}.
#'
#' \code{init} has length S. \code{init[i]} is the probability of being in
#' state \code{i} at the first observation time. That means that the first
#' observations arise from the initial state probabilities.
#'
#' For more explanation, see package vignette
#' (\code{vignette("Introduction_to_nimbleEcology")}).
#'
#' Compared to writing \code{nimble} models with a discrete latent state and a
#' separate scalar datum for each observation time, use of these distributions
#' allows one to directly sum (marginalize) over the discrete latent state and
#' calculate the probability of all observations for one individual (or other
#' HMM unit) jointly.
#'
#' These are \code{nimbleFunction}s written in the format of user-defined
#' distributions for NIMBLE's extension of the BUGS model language. More
#' information can be found in the NIMBLE User Manual at
#' \href{https://r-nimble.org}{https://r-nimble.org}.
#'
#' When using these distributions in a \code{nimble} model, the left-hand side
#' will be used as \code{x}, and the user should not provide the \code{log}
#' argument.
#'
#' For example, in \code{nimble} model code,
#'
#' \code{observedStates[i, 1:T] ~ dHMM(initStates[1:S], observationProbs[1:S,
#' 1:O], transitionProbs[1:S, 1:S], 1, T)}
#'
#' declares that the \code{observedStates[i, 1:T]} (observation history for
#' individual \code{i}, for example) vector follows a hidden Markov model
#' distribution with parameters as indicated, assuming all the parameters have
#' been declared elsewhere in the model. As above, \code{S} is the number of
#' system state categories, \code{O} is the number of observation state
#' categories, and \code{T} is the number of observation occasions. This will
#' invoke (something like) the following call to \code{dHMM} when
#' \code{nimble} uses the model such as for MCMC:
#'
#' \code{dHMM(observedStates[1:T], initStates[1:S], observationProbs[1:S,
#' 1:O], transitionProbs[1:S, 1:S], 1, T, log = TRUE)}
#'
#' If an algorithm using a \code{nimble} model with this declaration needs to
#' generate a random draw for \code{observedStates[1:T]}, it will make a
#' similar invocation of \code{rHMM}, with \code{n = 1}.
#'
#' If the observation probabilities are time-dependent, one would use:
#'
#' \code{observedStates[1:T] ~ dHMMo(initStates[1:S], observationProbs[1:S,
#' 1:O, 1:T], transitionProbs[1:S, 1:S], 1, T)}
#'
#' @return For \code{dHMM} and \code{dHMMo}: the probability (or likelihood) or
#' log probability of observation vector \code{x}.
#'
#' For \code{rHMM} and \code{rHMMo}: a simulated detection history, \code{x}.
#'
#' @seealso For dynamic hidden Markov models with time-dependent transitions,
#' see \code{\link{dDHMM}} and \code{\link{dDHMMo}}. For simple
#' capture-recapture, see \code{\link{dCJS}}.
#'
#' @references D. Turek, P. de Valpine and C. J. Paciorek. 2016. Efficient
#' Markov chain Monte Carlo sampling for hierarchical hidden Markov models.
#' Environmental and Ecological Statistics 23:549–564. DOI
#' 10.1007/s10651-016-0353-z
#'
#' @examples
#' # Set up constants and initial values for defining the model
#' len <- 5 # length of dataset
#' dat <- c(1,2,1,1,2) # A vector of observations
#' init <- c(0.4, 0.2, 0.4) # A vector of initial state probabilities
#' probObs <- t(array( # A matrix of observation probabilities
#' c(1, 0,
#' 0, 1,
#' 0.2, 0.8), c(2, 3)))
#' probTrans <- t(array( # A matrix of transition probabilities
#' c(0.6, 0.3, 0.1,
#' 0, 0.7, 0.3,
#' 0, 0, 1), c(3,3)))
#'
#' # Define code for a nimbleModel
#' nc <- nimbleCode({
#' x[1:5] ~ dHMM(init[1:3], probObs = probObs[1:3,1:2],
#' probTrans = probTrans[1:3, 1:3], len = 5, checkRowSums = 1)
#'
#' for (i in 1:3) {
#' for (j in 1:3) {
#' probTrans[i,j] ~ dunif(0,1)
#' }
#'
#' probObs[i, 1] ~ dunif(0,1)
#' probObs[i, 2] <- 1 - probObs[i, 1]
#' }
#' })
#'
#' # Build the model
#' HMM_model <- nimbleModel(nc,
#' data = list(x = dat),
#' inits = list(init = init,
#' probObs = probObs,
#' probTrans = probTrans))
#' # Calculate log probability of data from the model
#' HMM_model$calculate()
#' # Use the model for a variety of other purposes...
NULL
#' @export
#' @rdname dHMM
dHMM <- nimbleFunction(
run = function(x = double(1), ## Observed capture (state) history
init = double(1),
probObs = double(2),
probTrans = double(2),
len = double(0, default = 0),## length of x (needed as a separate param for rDHMM)
checkRowSums = double(0, default = 1),
log = integer(0, default = 0)) {
if (length(x) != len) stop("In dHMM: Argument len must be length of x or 0.")
if (dim(probObs)[1] != dim(probTrans)[1]) stop("In dHMM: Length of dimension 1 in probObs must equal length of dimension 1 in probTrans.")
if (dim(probTrans)[1] != dim(probTrans)[2]) stop("In dHMM: probTrans must be a square matrix.")
if (abs(sum(init) - 1) > 1e-6) stop("In dHMM: Initial probabilities must sum to 1.")
if (checkRowSums) {
transCheckPasses <- TRUE
for (i in 1:dim(probTrans)[1]) {
thisCheckSum <- sum(probTrans[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
## Compilation doesn't support more than a simple string for stop()
## so we provide more detail using a print().
print("In dHMM: Problem with sum(probTrans[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
transCheckPasses <- FALSE
}
}
obsCheckPasses <- TRUE
for (i in 1:dim(probObs)[1]) {
thisCheckSum <- sum(probObs[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
print("In dHMM: Problem with sum(probObs[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
obsCheckPasses <- FALSE
}
}
if(!(transCheckPasses | obsCheckPasses))
stop("In dHMM: probTrans and probObs were not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!transCheckPasses)
stop("In dHMM: probTrans was not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!obsCheckPasses)
stop("In dHMM: probObs was not specified correctly. Probabilities in each row must sum to 1.")
}
pi <- init # State probabilities at time t=1
logL <- 0
nObsClasses <- dim(probObs)[2]
for (t in 1:len) {
if (x[t] > nObsClasses | x[t] < 1) stop("In dHMM: Invalid value of x[t].")
Zpi <- probObs[, x[t]] * pi # Vector of P(state) * P(observation class x[t] | state)
sumZpi <- sum(Zpi) # Total P(observed as class x[t])
logL <- logL + log(sumZpi) # Accumulate log probabilities through time
if (t != len) pi <- ((Zpi %*% probTrans) / sumZpi)[1, ] # State probabilities at t+1
}
returnType(double())
if (log) return(logL)
return(exp(logL))
}
)
#' @export
#' @rdname dHMM
dHMMo <- nimbleFunction(
run = function(x = double(1), ## Observed capture (state) history
init = double(1),##
probObs = double(3),
probTrans = double(2),
len = double(0, default = 0),## length of x (needed as a separate param for rDHMM)
checkRowSums = double(0, default = 1),
log = integer(0, default = 0)) {
if (length(x) != len) stop("In dHMMo: Argument len must be length of x or 0.")
if (dim(probObs)[1] != dim(probTrans)[1]) stop("In dHMMo: In dHMM: Length of dimension 1 in probObs must equal length of dimension 1 in probTrans.")
if (dim(probTrans)[1] != dim(probTrans)[2]) stop("In dHMMo: probTrans must be a square matrix.")
if (dim(probObs)[3] != len) {
if (dim(probObs)[3] == 1) stop("In dHMMo: Time dimension of probObs must match length of data. Did you mean dHMM?")
stop("In dHMMo: Length of time dimension of probObs must match length of data.")
}
if (abs(sum(init) - 1) > 1e-6) stop("In dHMMo: Initial probabilities must sum to 1.")
if (checkRowSums) {
transCheckPasses <- TRUE
for (i in 1:dim(probTrans)[1]) {
thisCheckSum <- sum(probTrans[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
## Compilation doesn't support more than a simple string for stop()
## so we provide more detail using a print().
print("In dHMMo: Problem with sum(probTrans[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
transCheckPasses <- FALSE
}
}
obsCheckPasses <- TRUE
for (i in 1:dim(probObs)[1]) {
for (k in 1:dim(probObs)[3]) {
thisCheckSum <- sum(probObs[i,,k])
if (abs(thisCheckSum - 1) > 1e-6) {
print("In dHMMo: Problem with sum(probObs[i,,k]) with i = ", i, " k = " , k, ". The sum should be 1 but is ", thisCheckSum)
obsCheckPasses <- FALSE
}
}
}
if(!(transCheckPasses | obsCheckPasses))
stop("In dHMMo: probTrans and probObs were not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!transCheckPasses)
stop("In dHMMo: probTrans was not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!obsCheckPasses)
stop("In dHMMo: probObs was not specified correctly. Probabilities in each row must sum to 1.")
}
pi <- init # State probabilities at time t=1
logL <- 0
nObsClasses <- dim(probObs)[2]
for (t in 1:len) {
if (x[t] > nObsClasses | x[t] < 1) stop("In dHMMo: Invalid value of x[t].")
Zpi <- probObs[,x[t],t] * pi # Vector of P(state) * P(observation class x[t] | state)
sumZpi <- sum(Zpi) # Total P(observed as class x[t])
logL <- logL + log(sumZpi) # Accumulate log probabilities through timeÍ
if (t != len) pi <- ((Zpi %*% probTrans) / sumZpi)[1, ] # State probabilities at t+1
}
returnType(double())
if (log) return(logL)
return(exp(logL))
}
)
#' @export
#' @rdname dHMM
rHMM <- nimbleFunction(
run = function(n = integer(), ## Observed capture (state) history
init = double(1),
probObs = double(2),
probTrans = double(2),
len = double(0, default = 0),
checkRowSums = double(0, default = 1)) {
returnType(double(1))
if (dim(probObs)[1] != dim(probTrans)[1]) stop("In rHMM: Number of cols in probObs must equal number of cols in probTrans.")
if (dim(probTrans)[1] != dim(probTrans)[2]) stop("In rHMM: probTrans must be a square matrix.")
if (abs(sum(init) - 1) > 1e-6) stop("In rHMM: Initial probabilities must sum to 1.")
if (checkRowSums) {
transCheckPasses <- TRUE
for (i in 1:dim(probTrans)[1]) {
thisCheckSum <- sum(probTrans[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
## Compilation doesn't support more than a simple string for stop()
## so we provide more detail using a print().
print("In rHMM: Problem with sum(probTrans[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
transCheckPasses <- FALSE
}
}
obsCheckPasses <- TRUE
for (i in 1:dim(probObs)[1]) {
thisCheckSum <- sum(probObs[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
print("In rHMM: Problem with sum(probObs[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
obsCheckPasses <- FALSE
}
}
if(!(transCheckPasses | obsCheckPasses))
stop("In rHMM: probTrans and probObs were not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!transCheckPasses)
stop("In rHMM: probTrans was not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!obsCheckPasses)
stop("In rHMM: probObs was not specified correctly. Probabilities in each row must sum to 1.")
}
ans <- numeric(len)
probInit <- init
trueInit <- 0
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probInit[1:j])) j <- j + 1
trueInit <- j
trueState <- trueInit
### QUESTION: Is the "init" probability for the state at time t1 or t0? I'm assuming t0
for (i in 1:len) {
# Transition to a new true state
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probTrans[trueState, 1:j])) j <- j + 1
trueState <- j
# Detect based on the true state
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probObs[trueState, 1:j])) j <- j + 1
ans[i] <- j
}
return(ans)
})
#' @export
#' @rdname dHMM
rHMMo <- nimbleFunction(
run = function(n = integer(), ## Observed capture (state) history
init = double(1),
probObs = double(3),
probTrans = double(2),
len = double(0, default = 0),
checkRowSums = double(0, default = 1)) {
returnType(double(1))
if (dim(probObs)[1] != dim(probTrans)[1]) stop("In rHMMo: Number of cols in probObs must equal number of cols in probTrans.")
if (dim(probTrans)[1] != dim(probTrans)[2]) stop("In rHMMo: probTrans must be a square matrix.")
if (dim(probObs)[3] != len) {
if (dim(probObs)[3] == 1) stop("In rHMMo: Time dimension of probObs must match length of data. Did you mean rHMM?")
stop("In rHMMo: Length of time dimension of probObs must match length of data.")
}
if (abs(sum(init) - 1) > 1e-6) stop("In rHMMo: Initial probabilities must sum to 1.")
if (checkRowSums) {
transCheckPasses <- TRUE
for (i in 1:dim(probTrans)[1]) {
thisCheckSum <- sum(probTrans[i,])
if (abs(thisCheckSum - 1) > 1e-6) {
## Compilation doesn't support more than a simple string for stop()
## so we provide more detail using a print().
print("In rHMMo: Problem with sum(probTrans[i,]) with i = ", i, ". The sum should be 1 but is ", thisCheckSum)
transCheckPasses <- FALSE
}
}
obsCheckPasses <- TRUE
for (i in 1:dim(probObs)[1]) {
for (k in 1:dim(probObs)[3]) {
thisCheckSum <- sum(probObs[i,,k])
if (abs(thisCheckSum - 1) > 1e-6) {
print("In rHMMo: Problem with sum(probObs[i,,k]) with i = ", i, " k = ", k, ". The sum should be 1 but is ", thisCheckSum)
obsCheckPasses <- FALSE
}
}
}
if(!(transCheckPasses | obsCheckPasses))
stop("In rHMMo: probTrans and probObs were not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!transCheckPasses)
stop("In rHMMo: probTrans was not specified correctly. Probabilities in each row (second dimension) must sum to 1.")
if(!obsCheckPasses)
stop("In rHMMo: probObs was not specified correctly. Probabilities in each row must sum to 1.")
}
ans <- numeric(len)
probInit <- init
trueInit <- 0
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probInit[1:j])) j <- j + 1
trueInit <- j
trueState <- trueInit
for (i in 1:len) {
# Transition to a new true state
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probTrans[trueState, 1:j])) j <- j + 1
trueState <- j
# Detect based on the true state
r <- runif(1, 0, 1)
j <- 1
while (r > sum(probObs[trueState, 1:j, i])) j <- j + 1
ans[i] <- j
}
return(ans)
})
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.