estimate_mnhmm: Estimate a Mixture Non-homogeneous Hidden Markov Model

View source: R/estimate_mnhmm.R

estimate_mnhmm R Documentation

Description

Function estimate_mnhmm estimates a mixture version of the non-homogeneous hidden Markov model (MNHMM), in which the initial, transition, emission, and mixture probabilities can depend on covariates. See estimate_nhmm() for further details.

Usage

estimate_mnhmm(
  n_states,
  n_clusters,
  emission_formula,
  initial_formula = ~1,
  transition_formula = ~1,
  cluster_formula = ~1,
  data,
  time,
  id,
  lambda = 0,
  prior_obs = "fixed",
  state_names = NULL,
  cluster_names = NULL,
  inits = "random",
  init_sd = 2,
  restarts = 0L,
  method = "EM-DNM",
  bound = Inf,
  control_restart = list(),
  control_mstep = list(),
  ...
)

Arguments

n_states

An integer > 1 defining the number of hidden states.

n_clusters

A positive integer defining the number of clusters (mixtures).

emission_formula

An object of class formula() for the state emission probabilities, or a list of such formulas in the case of multiple response variables. The left-hand side of each formula defines the response. For multiple responses sharing the same formula, you can use the form c(y1, y2) ~ x, where y1 and y2 are the response variables.

initial_formula

An object of class formula() for the initial state probabilities. The left-hand side of the formula should be empty.

transition_formula

An object of class formula() for the state transition probabilities. The left-hand side of the formula should be empty.

cluster_formula

An object of class formula() for the mixture probabilities.

data

A data frame containing the variables used in the model formulas.

time

Name of the time index variable in data.

id

Name of the id variable in data identifying different sequences.

lambda

Penalization factor lambda for the penalized log-likelihood, where the penalization is 0.5 * lambda * sum(eta^2). Note that with method = "L-BFGS" both the objective function (log-likelihood) and the penalization term are scaled by the number of non-missing observations. Default is 0, but small values such as 1e-4 can help to ensure the numerical stability of L-BFGS by avoiding extreme probabilities. See also argument bound for hard constraints.

prior_obs

Either "fixed" or a list of vectors giving the prior distributions for the responses at time "zero". See details.

state_names

A vector of optional labels for the hidden states. If this is NULL (the default), numbered states are used.

cluster_names

A vector of optional labels for the clusters. If this is NULL (the default), numbered clusters are used.

inits

If inits = "random" (default), random initial values are used. Otherwise inits should be a list of initial values. If coefficients are given using list components eta_pi, eta_A, eta_B, and eta_omega, these are used as is. Alternatively, initial values can be given in terms of the initial state, transition, emission, and mixture probabilities using list components initial_probs, transition_probs, emission_probs, and cluster_probs. The two forms can also be mixed, e.g., you can give only initial_probs and eta_A.

init_sd

Standard deviation of the normal distribution used to generate random initial values. Default is 2. If you want to fix the initial values of the regression coefficients to zero, use init_sd = 0.

restarts

Number of times to run optimization using random starting values (in addition to the final run). Default is 0.

method

Optimization method used. Option "EM" uses the EM algorithm with L-BFGS in the M-step. Option "DNM" uses direct maximization of the log-likelihood, by default using L-BFGS. Option "EM-DNM" (the default) first runs a maximum of 10 iterations of EM and then switches to L-BFGS (other NLopt algorithms can be used instead).

bound

Positive value defining the hard lower and upper bounds for the working parameters eta. The bounds are used to avoid extreme probabilities and the corresponding numerical issues, especially in the M-step of the EM algorithm. Default is Inf, i.e., no bounds. Note that the bounds are not enforced in the M-step in the intercept-only case with lambda = 0.

control_restart

Controls for restart steps, see details.

control_mstep

Controls for M-step of EM algorithm, see details.

...

Additional arguments to nloptr::nloptr() and the EM algorithm. See details.
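
As a rough illustration of the lambda argument above, the penalty 0.5 * lambda * sum(eta^2) can be computed by hand in base R. The vector eta below is made up for the example and is not extracted from a fitted model. The sketch only shows the arithmetic of the penalty term, not how estimate_mnhmm applies it internally.

```r
# Hypothetical working parameters (illustrative values only,
# not taken from any fitted model)
eta <- c(0.5, -1.2, 2.0)
lambda <- 1e-4

# Ridge-type penalty as described for the lambda argument
penalty <- 0.5 * lambda * sum(eta^2)
penalty
```

Because the penalty grows with the squared coefficients, a larger lambda pulls eta more strongly towards zero, which is how it discourages extreme probabilities.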

Value

Object of class mnhmm.

See Also

estimate_nhmm() for further details.

Examples

data("mvad", package = "TraMineR")

d <- reshape(mvad, direction = "long", varying = list(15:86), 
  v.names = "activity")

## Not run: 
set.seed(1)
fit <- estimate_mnhmm(
  n_states = 3, n_clusters = 2,
  data = d, time = "time", id = "id",
  cluster_formula = ~ male + catholic + gcse5eq + Grammar +
    funemp + fmpr + livboth + Belfast +
    N.Eastern + Southern + S.Eastern + Western,
  emission_formula = activity ~ male + catholic + gcse5eq,
  initial_formula = ~ 1,
  transition_formula = ~ male + gcse5eq
)

## End(Not run)
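
The forms of emission_formula described under Arguments can be written down with base R alone. The variable names y1, y2, x, and z below are placeholders rather than columns of mvad, and the sketch only constructs and inspects the formula objects; it does not call estimate_mnhmm.

```r
# Single response, as in the example above
f_single <- activity ~ male + catholic + gcse5eq

# Two responses with different covariates: a list of formulas
# (y1, y2, x, z are hypothetical variable names)
f_list <- list(y1 ~ x, y2 ~ x + z)

# Two responses sharing the same right-hand side
f_shared <- c(y1, y2) ~ x

# The left-hand side of the shared form names the response variables
lhs_vars <- all.vars(f_shared[[2]])
lhs_vars
```

Formulas are not evaluated when they are created, so these objects can be built and checked before any data are supplied.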

seqHMM documentation built on June 8, 2025, 10:16 a.m.