simulate_mhmm: Simulate Mixture Hidden Markov Models
In seqHMM: Mixture Hidden Markov Models for Social Sequence Data and Other Multivariate, Multichannel Categorical Time Series

simulate_mhmm

R Documentation

Description

Simulate sequences of observed and hidden states given the parameters of a mixture hidden Markov model.

Usage

simulate_mhmm(
  n_sequences,
  initial_probs,
  transition_probs,
  emission_probs,
  sequence_length,
  formula = NULL,
  data = NULL,
  coefficients = NULL
)

Arguments

`n_sequences`	The number of sequences to simulate.
`initial_probs`	A list containing vectors of initial state probabilities for the submodel of each cluster.
`transition_probs`	A list of matrices of transition probabilities for the submodel of each cluster.
`emission_probs`	A list with one element per cluster. Each element is either a matrix of emission probabilities (single-channel data) or a list of such matrices, one for each channel (multichannel data). Each matrix must have dimensions `s x m`, where `s` is the number of hidden states and `m` is the number of unique symbols (observed states) in the data.
`sequence_length`	The length of the simulated sequences.
`formula`	Covariates for the mixture probabilities, given as a one-sided `formula` object (left-hand side omitted), for example `~ covariate_1 + covariate_2`.
`data`	An optional data frame, list, or environment containing the variables in the model. Variables not found in `data` are taken from `environment(formula)`.
`coefficients`	An optional `k x l` matrix of regression coefficients for the time-constant covariates of the mixture probabilities, where `k` is the number of covariates (one row per model term, including the intercept, as in the example below) and `l` is the number of clusters. A logit link is used for the mixture probabilities. The first column, which corresponds to the first cluster, is set to zero.
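
As a rough illustration of how `formula`, `data`, and `coefficients` fit together, the base R sketch below turns a coefficient matrix into per-sequence mixture probabilities. It assumes the usual multinomial-logit (softmax) form with the first cluster as the reference; the covariate values are made up for illustration, and the code is not taken from seqHMM itself.

```r
# Hypothetical covariate data for three sequences (illustration only).
dataf <- data.frame(
  covariate_1 = c(0.2, 0.5, 0.9),
  covariate_2 = factor(c("A", "B", "B"))
)

# k x l coefficient matrix: k = 3 model terms, l = 2 clusters.
# The first column (first cluster) is fixed to zero.
coefs <- cbind(cluster_1 = c(0, 0, 0), cluster_2 = c(-1.5, 3, -0.7))

# Design matrix with columns (Intercept), covariate_1, covariate_2B.
X <- model.matrix(~ covariate_1 + covariate_2, data = dataf)

# Linear predictors (one column per cluster), then a row-wise softmax.
eta <- X %*% coefs
mixture_probs <- exp(eta) / rowSums(exp(eta))

# One row per sequence, one column per cluster; each row sums to 1.
round(mixture_probs, 3)
```

With two clusters this reduces to an ordinary logistic regression for membership in the second cluster.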

Value

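A list of state sequence objects of class `stslist`, with components `observations` (the simulated observed sequences) and `states` (the simulated hidden state sequences).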
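
To make the relationship between the hidden and the observed sequences concrete, the following base R sketch simulates one sequence from a single submodel. It illustrates the generative process only and is not seqHMM's implementation; `simulate_one_sequence` and its return format are hypothetical. In a mixture model, a cluster would first be drawn for each sequence from its mixture probabilities, and the sequence would then be generated from that cluster's submodel.

```r
# Hypothetical helper (not part of seqHMM): simulate one hidden path and
# the corresponding observations from a single-channel HMM.
simulate_one_sequence <- function(initial_probs, transition_probs,
                                  emission_probs, sequence_length) {
  n_states <- length(initial_probs)
  n_symbols <- ncol(emission_probs)
  states <- integer(sequence_length)
  obs <- integer(sequence_length)

  # Draw the first hidden state from the initial distribution,
  # then an observation from that state's emission distribution.
  states[1] <- sample.int(n_states, 1, prob = initial_probs)
  obs[1] <- sample.int(n_symbols, 1, prob = emission_probs[states[1], ])

  # Each later state depends only on the previous state; each
  # observation depends only on the current state.
  for (t in seq_len(sequence_length)[-1]) {
    states[t] <- sample.int(n_states, 1,
      prob = transition_probs[states[t - 1], ]
    )
    obs[t] <- sample.int(n_symbols, 1, prob = emission_probs[states[t], ])
  }

  list(
    states = rownames(emission_probs)[states],
    observations = colnames(emission_probs)[obs]
  )
}

# Usage with the first submodel from the Examples section.
emission_probs <- matrix(c(0.75, 0.05, 0.25, 0.95), 2, 2,
  dimnames = list(c("coin 1", "coin 2"), c("heads", "tails"))
)
transition_probs <- matrix(c(9, 0.1, 1, 9.9) / 10, 2, 2)

set.seed(1)
one <- simulate_one_sequence(c(1, 0), transition_probs, emission_probs, 20)
```

Because the initial probabilities are `c(1, 0)`, the simulated path always starts in `"coin 1"`.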

Examples

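library(seqHMM)

# Emission probabilities for the two clusters
# (rows: hidden states, columns: observed symbols)
emission_probs_1 <- matrix(c(0.75, 0.05, 0.25, 0.95), 2, 2)
emission_probs_2 <- matrix(c(0.1, 0.8, 0.9, 0.2), 2, 2)
colnames(emission_probs_1) <- colnames(emission_probs_2) <-
  c("heads", "tails")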

transition_probs_1 <- matrix(c(9, 0.1, 1, 9.9) / 10, 2, 2)
transition_probs_2 <- matrix(c(35, 1, 1, 35) / 36, 2, 2)
rownames(emission_probs_1) <- rownames(transition_probs_1) <-
  colnames(transition_probs_1) <- c("coin 1", "coin 2")
rownames(emission_probs_2) <- rownames(transition_probs_2) <-
  colnames(transition_probs_2) <- c("coin 3", "coin 4")

initial_probs_1 <- c(1, 0)
initial_probs_2 <- c(1, 0)

# Simulate covariates for n sequences
n <- 30
set.seed(123)
covariate_1 <- runif(n)
covariate_2 <- sample(c("A", "B"),
  size = n, replace = TRUE,
  prob = c(0.3, 0.7)
)
dataf <- data.frame(covariate_1, covariate_2)

# Regression coefficients for cluster membership (first column fixed to zero)
coefs <- cbind(cluster_1 = c(0, 0, 0), cluster_2 = c(-1.5, 3, -0.7))
rownames(coefs) <- c("(Intercept)", "covariate_1", "covariate_2B")

sim <- simulate_mhmm(
  n_sequences = n, initial_probs = list(initial_probs_1, initial_probs_2),
  transition_probs = list(transition_probs_1, transition_probs_2),
  emission_probs = list(emission_probs_1, emission_probs_2),
  sequence_length = 20, formula = ~ covariate_1 + covariate_2,
  data = dataf, coefficients = coefs
)

# Plot the simulated sequences, sorted from the start of the hidden states
stacked_sequence_plot(sim,
  sort_by = "start", sort_channel = "states", type = "i"
)

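# Build a mixture HMM for the simulated observations,
# using the true parameters as starting values
hmm <- build_mhmm(sim$observations,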
  initial_probs = list(initial_probs_1, initial_probs_2),
  transition_probs = list(transition_probs_1, transition_probs_2),
  emission_probs = list(emission_probs_1, emission_probs_2),
  formula = ~ covariate_1 + covariate_2,
  data = dataf
)

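# Estimate the model and print the result
fit <- fit_model(hmm)
fit$model

# Most probable hidden state paths under the estimated model
paths <- hidden_paths(fit$model, as_stslist = TRUE)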

# Compare the estimated paths with the true simulated hidden states
stacked_sequence_plot(
  list(
    "estimated paths" = paths,
    "true (simulated)" = sim$states
  ),
  sort_by = "start",
  sort_channel = "true (simulated)",
  type = "i"
)
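
Before passing parameters such as those above to a simulation, it can help to confirm that they are valid probability distributions with matching dimensions. The base R sketch below shows one way to do this for a single-channel submodel; `check_submodel` is a hypothetical helper and not part of seqHMM, and the tolerance is an arbitrary choice.

```r
# Hypothetical helper (not part of seqHMM): check one cluster's parameters.
check_submodel <- function(initial_probs, transition_probs, emission_probs,
                           tol = 1e-8) {
  s <- length(initial_probs) # number of hidden states
  stopifnot(
    # Initial probabilities sum to one.
    abs(sum(initial_probs) - 1) < tol,
    # Transition matrix is s x s, emission matrix is s x m.
    all(dim(transition_probs) == c(s, s)),
    nrow(emission_probs) == s,
    # Every row is a probability distribution.
    all(abs(rowSums(transition_probs) - 1) < tol),
    all(abs(rowSums(emission_probs) - 1) < tol)
  )
  invisible(TRUE)
}

# Usage with the parameter values of the first cluster in the example.
ok <- check_submodel(
  initial_probs = c(1, 0),
  transition_probs = matrix(c(9, 0.1, 1, 9.9) / 10, 2, 2),
  emission_probs = matrix(c(0.75, 0.05, 0.25, 0.95), 2, 2)
)
```

The call stops with an error if any check fails, for example when a matrix has been filled in the wrong order so that its columns rather than its rows sum to one.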

seqHMM documentation built on June 8, 2025, 10:16 a.m.