vit_mHMM: Obtain hidden state sequence for each subject using the...

View source: R/vit_mHMM.R

vit_mHMMR Documentation

Obtain hidden state sequence for each subject using the Viterbi algorithm

Description

vit_mHMM obtains the most likely state sequence (for each subject) from an object of class mHMM (generated by the function mHMM()), using (an extended version of) the Viterbi algorithm. This is also known as global decoding. Optionally, the state probabilities themselves at each point in time can also be obtained.

Usage

vit_mHMM(object, s_data, burn_in = NULL, return_state_prob = FALSE)

Arguments

object

An object of class mHMM, generated by the function mHMM.

s_data

A matrix containing the observations to be modeled, where the rows represent the observations over time. In s_data, the first column indicates subject id number. Hence, the id number is repeated over rows equal to the number of observations for that subject. The subsequent columns contain the dependent variable(s). Note that in case of categorical dependent variable(s), input are integers (i.e., whole numbers) that range from 1 to q_emiss (see below) and is numeric (i.e., not be a (set of) factor variable(s)). The total number of rows are equal to the sum over the number of observations of each subject, and the number of columns are equal to the number of dependent variables (n_dep) + 1. The number of observations can vary over subjects.

burn_in

The number of iterations to be discarded from the MCMC algorithm when inferring the transition probability matrix gamma and the emission distribution of (each of) the dependent variable(s) for each subject from s_data. If omitted, defaults to NULL and burn_in specified at the function mHMM will be used.

return_state_prob

A logical scaler. Should the function, in addition to the most likely state sequence for each subject, also return the state probabilities at each point in time for each subject (return_state_prob = TRUE) or not (return_state_prob = FALSE). Defaults to return_state_prob = FALSE.

Details

Note that local decoding is also possible, by inferring the most frequent state at each point in time for each subject from the sampled state path at each iteration of the MCMC algorithm. This information is contained in the output object return_path of the function mHMM().

Value

The function vit_mHMM returns a matrix containing the most likely state at each point in time. The first column indicates subject id number. Hence, the id number is repeated over rows equal to the number of observations for that subject. The second column contains the most likely state at each point in time. If requested, the subsequent columns contain the state probabilities at each point in time for each subject.

References

\insertRef

viterbi1967mHMMbayes

\insertRef

rabiner1989mHMMbayes

See Also

mHMM for analyzing multilevel hidden Markov data and obtaining the input needed for vit_mHMM, and sim_mHMM for simulating multilevel hidden Markov data.

Examples

###### Example on package example categorical data, see ?nonverbal
# First fit the multilevel HMM on the example data

# specifying general model properties:
m <- 2
n_dep <- 4
q_emiss <- c(3, 2, 3, 2)

# specifying starting values
start_TM <- diag(.8, m)
start_TM[lower.tri(start_TM) | upper.tri(start_TM)] <- .2
start_EM <- list(matrix(c(0.05, 0.90, 0.05, 0.90, 0.05, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[1]), # vocalizing patient
                 matrix(c(0.1, 0.9, 0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[2]), # looking patient
                 matrix(c(0.90, 0.05, 0.05, 0.05, 0.90, 0.05), byrow = TRUE,
                        nrow = m, ncol = q_emiss[3]), # vocalizing therapist
                 matrix(c(0.1, 0.9, 0.1, 0.9), byrow = TRUE, nrow = m,
                        ncol = q_emiss[4])) # looking therapist

# Fit the multilevel HMM model:
# Note that for reasons of running time, J is set at a ridiculous low value.
# One would typically use a number of iterations J of at least 1000,
# and a burn_in of 200.
out_2st <- mHMM(s_data = nonverbal, gen = list(m = m, n_dep = n_dep,
                q_emiss = q_emiss), start_val = c(list(start_TM), start_EM),
                mcmc = list(J = 3, burn_in = 1))

###### obtain the most likely state sequence with the Viterbi algorithm
states1 <- vit_mHMM(s_data = nonverbal, object = out_2st)


###### Example on simulated multivariate continuous data
# simulating multivariate continuous data
n_t     <- 100
n       <- 10
m       <- 3
n_dep   <- 2

gamma   <- matrix(c(0.8, 0.1, 0.1,
                    0.2, 0.7, 0.1,
                    0.2, 0.2, 0.6), ncol = m, byrow = TRUE)

emiss_distr <- list(matrix(c( 50, 10,
                              100, 10,
                              150, 10), nrow = m, byrow = TRUE),
                    matrix(c(5, 2,
                             10, 5,
                             20, 3), nrow = m, byrow = TRUE))

data_cont <- sim_mHMM(n_t = n_t, n = n, gen = list(m = m, n_dep = n_dep), data_distr = 'continuous',
                  gamma = gamma, emiss_distr = emiss_distr, var_gamma = .1, var_emiss = c(.5, 0.01))

# Specify hyper-prior for the continuous emission distribution
manual_prior_emiss <- prior_emiss_cont(
                        gen = list(m = m, n_dep = n_dep),
                        emiss_mu0 = list(matrix(c(30, 70, 170), nrow = 1),
                                         matrix(c(7, 8, 18), nrow = 1)),
                        emiss_K0 = list(1, 1),
                        emiss_V =  list(rep(100, m), rep(25, m)),
                        emiss_nu = list(1, 1),
                        emiss_a0 = list(rep(1, m), rep(1, m)),
                        emiss_b0 = list(rep(1, m), rep(1, m)))

# Run the model on the simulated data:
# Note that for reasons of running time, J is set at a ridiculous low value.
# One would typically use a number of iterations J of at least 1000,
# and a burn_in of 200.
out_3st_cont_sim <- mHMM(s_data = data_cont$obs,
                    data_distr = "continuous",
                    gen = list(m = m, n_dep = n_dep),
                    start_val = c(list(gamma), emiss_distr),
                    emiss_hyp_prior = manual_prior_emiss,
                    mcmc = list(J = 11, burn_in = 5))

###### obtain the most likely state sequence with the Viterbi algorithm,
# including state probabilities at each time point.
states2 <- vit_mHMM(s_data = data_cont$obs, object = out_3st_cont_sim,
                    return_state_prob = TRUE)




mHMMbayes documentation built on Oct. 2, 2023, 5:06 p.m.