prior_emiss_cat: Specifying informative hyper-prior on the categorical...
In mHMMbayes: Multilevel Hidden Markov Models Using Bayesian Estimation

prior_emiss_cat

R Documentation

Specifying informative hyper-prior on the categorical emission distribution(s) of the multilevel hidden Markov model

Description

prior_emiss_cat provides a framework to manually specify an informative hyper-prior on the categorical emission distribution(s). prior_emiss_cat creates an object of class mHMM_prior_emiss used by the function mHMM, and additionally attaches the class cat to signal use for categorical observations. Note that the hyper-prior distribution on the categorical emission probabilities are on the intercepts (and, if subject level covariates are used, regression coefficients) of the Multinomial logit model used to accommodate the multilevel framework of the data, instead of on the probabilities directly. The set of hyper-prior distributions consists of a multivariate Normal hyper-prior distribution on the vector of means (i.e., intercepts and regression coefficients), and an Inverse Wishart hyper-prior distribution on the covariance matrix.

Usage

prior_emiss_cat(
  gen,
  emiss_mu0,
  emiss_K0 = NULL,
  emiss_V = NULL,
  emiss_nu = NULL,
  n_xx_emiss = NULL
)

Arguments

`gen`	List containing the following elements denoting the general model properties: `m`: numeric vector with length 1 denoting the number of hidden states `n_dep`: numeric vector with length 1 denoting the number of dependent variables `q_emiss`: only to be specified if the data represents categorical data. Numeric vector with length `n_dep` denoting the number of observed categories for the categorical emission distribution for each of the dependent variables.
`emiss_mu0`	A list of lists: `emiss_mu0` contains `n_dep` lists, i.e., one list for each dependent variable `k`. Each list `k` contains `m` matrices; one matrix for each set of emission probabilities within a state. The matrices contain the hypothesized hyper-prior mean values of the intercepts of the Multinomial logit model on the categorical emission probabilities. Hence, each matrix consists of one row (when not including covariates in the model) and `q_emiss[k]` - 1 columns. If covariates are used, the number of rows in each matrix in the list is equal to 1 + n_xx (i.e., the first row corresponds to the hyper-prior mean values of the intercepts, the subsequent rows correspond to the hyper-prior mean values of the regression coefficients connected to each of the covariates).
`emiss_K0`	Optional list containing `n_dep` elements corresponding to each dependent variable `k`. Each element `k` is a numeric vector with length 1 (when no covariates are used) denoting the number of hypothetical prior subjects on which the set of hyper-prior mean intercepts specified in `emiss_mu0` are based. When covariates are used: each element is a numeric vector with length 1 + n_xx denoting the number of hypothetical prior subjects on which the set of intercepts (first value) and set of regression coefficients (subsequent values) are based.
`emiss_V`	Optional list containing `n_dep` elements corresponding to each dependent variable `k`, where each element `k` is a matrix of `q_emiss[k]` - 1 by `q_emiss[k]` - 1 containing the variance-covariance of the hyper-prior Inverse Wishart distribution on the covariance of the Multinomial logit intercepts.
`emiss_nu`	Optional list containing `n_dep` elements corresponding to each dependent variable `k`. Each element `k` is a numeric vector with length 1 denoting the degrees of freedom of the hyper-prior Inverse Wishart distribution on the covariance of the Multinomial logit intercepts.
`n_xx_emiss`	Optional numeric vector with length `n_dep` denoting the number of (level 2) covariates used to predict the emission distribution of each of the dependent variables `k`. When omitted, the model assumes no covariates are used to predict the emission distribution(s).

Details

Estimation of the mHMM proceeds within a Bayesian context, hence a hyper-prior distribution has to be defined for the group level parameters. Default, non-informative priors are used unless specified otherwise by the user. For each dependent variable, each row of the categorical emission probability matrix (i.e., the probability to observe each category (columns) within each of the states (rows)) has its own set of Multinomial logit intercepts, which are assumed to follow a multivariate normal distribution. Hence, the hyper-prior distributions for the intercepts consists of a multivariate Normal hyper-prior distribution on the vector of means, and an Inverse Wishart hyper-prior distribution on the covariance matrix. Note that only the general model properties (number of states m, number of dependent variables n_dep, and number of observed categories for each dependent variable q_emiss) and values of the hypothesized hyper-prior mean values of the Multinomial logit intercepts have to be specified by the user, default values are available for all other hyper-prior distribution parameters.

Given that the hyper-priors are specified on the intercepts of the Multinomial logit model intercepts instead of on the categorical emission probabilities directly, specifying a hyper-prior can seem rather daunting. However, see the function prob_to_int and int_to_prob for translating probabilities to a set of Multinomial logit intercepts and vice versa.

Note that emiss_K0, emiss_nu and emiss_V are assumed equal over the states. When the hyper-prior values for emiss_K0, emiss_nu and emiss_V are not manually specified, the default values are as follows. emiss_K0 set to 1, emiss_nu set to 3 + q_emiss[k] - 1, and the diagonal of gamma_V (i.e., the variance) set to 3 + q_emiss[k] - 1 and the off-diagonal elements (i.e., the covariance) set to 0. In addition, when no manual values for the hyper-prior on the categorical emission distribution are specified at all (that is, the function prior_emiss_cat is not used), all elements of the matrices contained in emiss_mu0 are set to 0 in the function mHMM.

Note that in case covariates are specified, the hyper-prior parameter values of the inverse Wishart distribution on the covariance matrix remain unchanged, as the estimates of the regression coefficients for the covariates are fixed over subjects.

Value

prior_emiss_cat returns an object of class mHMM_prior_emiss, containing informative hyper-prior values for the categorical emission distribution(s) of the multilevel hidden Markov model. The object is specifically created and formatted for use by the function mHMM, and thoroughly checked for correct input dimensions. The object contains the following components:

gen: A list containing the elements m, n_dep, and q_emiss, used for checking equivalent general model properties specified under prior_emiss_cat and mHMM.
emiss_mu0: A list of lists containing the hypothesized hyper-prior mean values of the intercepts of the Multinomial logit model on the categorical emission probabilities.
emiss_K0: A list containing n_dep elements denoting the number of hypothetical prior subjects on which the set of hyper-prior mean intercepts specified in emiss_mu0 are based.
emiss_nu: A list containing n_dep elements denoting the degrees of freedom of the hyper-prior Inverse Wishart distribution on the covariance of the Multinomial logit intercepts.
emiss_V: A list containing n_dep elements containing the variance-covariance of the hyper-prior Inverse Wishart distribution on the covariance of the Multinomial logit intercepts.
n_xx_emiss: A numeric vector denoting the number of (level 2) covariates used to predict the emission distribution of each of the dependent variables. When no covariates are used, n_xx_emiss equals NULL.

Examples

###### Example using package example data, see ?nonverbal
# specifying general model properties:
m <- 3
n_dep <- 4
q_emiss <- c(3, 2, 3, 2)

# hypothesized mean emission probabilities
prior_prob_emiss_cat <- list(matrix(c(0.10, 0.80, 0.10,
                                      0.80, 0.10, 0.10,
                                      0.40, 0.40, 0.20), byrow = TRUE,
                                    nrow = m, ncol = q_emiss[1]), # vocalizing patient,
                                    # prior belief: state 1 - much talking, state 2 -
                                    # no talking, state 3 - mixed
                             matrix(c(0.30, 0.70,
                                      0.30, 0.70,
                                      0.30, 0.70), byrow = TRUE, nrow = m,
                                    ncol = q_emiss[2]), # looking patient
                                    # prior belief: all 3 states show frequent looking
                                    # behavior
                             matrix(c(0.80, 0.10, 0.10,
                                      0.10, 0.80, 0.10,
                                      0.40, 0.40, 0.20), byrow = TRUE,
                                    nrow = m, ncol = q_emiss[3]), # vocalizing therapist
                                    # prior belief: state 1 - no talking, state 2 -
                                    # frequent talking, state 3 - mixed
                             matrix(c(0.30, 0.70,
                                      0.30, 0.70,
                                      0.30, 0.70), byrow = TRUE, nrow = m,
                                    ncol = q_emiss[4])) # looking therapist
                                    # prior belief: all 3 states show frequent looking
                                    # behavior

# using the function prob_to_int to obtain intercept values for the above specified
# categorical emission distributions
prior_int_emiss <- sapply(prior_prob_emiss_cat, prob_to_int)
emiss_mu0 <- rep(list(vector(mode = "list", length = m)), n_dep)
for(k in 1:n_dep){
  for(i in 1:m){
  emiss_mu0[[k]][[i]] <- matrix(prior_int_emiss[[k]][i,], nrow = 1)
  }
}

emiss_K0 <- rep(list(c(1)), n_dep)
emiss_nu <- list(c(5), c(4), c(5), c(4))
emiss_V <- list(diag(5, q_emiss[1] - 1),
                diag(4, q_emiss[2] - 1),
                diag(5, q_emiss[3] - 1),
                diag(4, q_emiss[4] - 1))

manual_prior_emiss <- prior_emiss_cat(gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                                  emiss_mu0 = emiss_mu0, emiss_K0 = emiss_K0,
                                  emiss_nu = emiss_nu, emiss_V = emiss_V)


# using the informative hyper-prior in a model

# specifying starting values
start_TM <- diag(.7, m)
start_TM[lower.tri(start_TM) | upper.tri(start_TM)] <- .1
start_EM <- list(matrix(c(0.05, 0.90, 0.05,
                          0.90, 0.05, 0.05,
                          0.55, 0.45, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[1]), # vocalizing patient
                 matrix(c(0.1, 0.9,
                          0.1, 0.9,
                          0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[2]), # looking patient
                 matrix(c(0.90, 0.05, 0.05,
                          0.05, 0.90, 0.05,
                          0.55, 0.45, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[3]), # vocalizing therapist
                 matrix(c(0.1, 0.9,
                          0.1, 0.9,
                          0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[4])) # looking therapist

# Note that for reasons of running time, J is set at a ridiculous low value.
# One would typically use a number of iterations J of at least 1000,
# and a burn_in of 200.

out_3st_infemiss <- mHMM(s_data = nonverbal,
                    gen = list(m = m, n_dep = n_dep, q_emiss = q_emiss),
                    start_val = c(list(start_TM), start_EM),
                    emiss_hyp_prior = manual_prior_emiss,
                    mcmc = list(J = 11, burn_in = 5))

out_3st_infemiss
summary(out_3st_infemiss)

mHMMbayes documentation built on Aug. 8, 2025, 7:29 p.m.