hmm: Fitting Hidden Markov Models
In asbjornholk/HiddenStateModels: Fit or Create Hidden State Models


'hmm' is used to fit (or simply define) hidden Markov models. Most commonly, it is used to fit a hidden Markov model to a vector of observed emissions that come from a known, common distribution family. The parameters of the underlying Markov chain and of the emission distributions are then estimated using the EM algorithm.

Alternatively, one can use custom distribution families by providing various functions (see details).

Finally, one can also disable optimization altogether to simply obtain an 'hmm'-object which can then be assessed through various standard methods (see below).

hmm(x, Gamma, delta, dist = "custom", ..., estimate = !is.null(x))

`x`	Numeric vector of observed emissions. Can be 'NULL' if no estimation is desired.
`Gamma`	Initial value of transition probability matrix for underlying Markov chain.
`delta`	Numeric vector of probabilities of the initial distribution of the Markov chain.
`dist`	Distribution family of emissions. Can be one of 'poisson', 'normal', 'binom', 'exponential' or 'custom'. If 'custom', the user must provide their own functions for densities, MLEs and random generation (see details).
`...`	Parameters of emission distribution as well as parameters for EM-algorithm. See details.
`estimate`	Logical indicating whether the parameters of the model (Gamma, delta and emission parameters) should be estimated by the EM algorithm. Defaults to 'TRUE' when data is available, i.e. when 'x' is not 'NULL'.
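Since 'Gamma' must be a transition probability matrix and 'delta' a probability vector, it can be useful to check both before calling 'hmm'. The following is a minimal sketch of such a check in base R. The helper name 'check_chain' and its tolerance argument are hypothetical and not part of the package, which may perform its own validation.

```r
# Hypothetical helper (not part of the package): checks that Gamma and delta
# describe a valid Markov chain on the states 1, ..., m.
check_chain <- function(Gamma, delta, tol = 1e-8) {
  stopifnot(is.matrix(Gamma), nrow(Gamma) == ncol(Gamma))  # square matrix
  stopifnot(length(delta) == nrow(Gamma))                  # one entry per state
  stopifnot(all(Gamma >= 0), all(delta >= 0))              # probabilities are non-negative
  stopifnot(all(abs(rowSums(Gamma) - 1) < tol))            # each row sums to 1
  stopifnot(abs(sum(delta) - 1) < tol)                     # delta sums to 1
  invisible(TRUE)
}

Gamma <- rbind(c(0.9, 0.1), c(0.1, 0.9))
delta <- c(1, 1) / 2
check_chain(Gamma, delta)
```

The function stops with an error at the first violated condition and returns 'TRUE' invisibly otherwise.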

## Theory

This function is used to fit or define a hidden Markov model, i.e. a model where the distributions of X_1, ..., X_n (the emissions) depend on a hidden sequence Y_1, ..., Y_n (the hidden states). In this model, Y_1, ..., Y_n constitute a Markov chain on the finite state space {1, ..., m}, and the distribution of X_i depends only on the value of Y_i, i.e. X_i | Y_i=k ~ X_j | Y_j=k for all i, j.
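To make the model structure concrete, here is a minimal base-R sketch that simulates such a model with Poisson emissions. It assumes an initial distribution 'delta' and a transition matrix 'Gamma' as described in the arguments. The function name 'simulate_hmm_sketch' is hypothetical, and this is not how the package's own 'simulate' method is implemented.

```r
# Illustrative simulation of a Poisson hidden Markov model in base R.
# simulate_hmm_sketch is a hypothetical name, not a function of the package.
simulate_hmm_sketch <- function(n, Gamma, delta, lambda) {
  m <- length(delta)
  y <- integer(n)
  y[1] <- sample.int(m, 1, prob = delta)                 # Y_1 is drawn from delta
  for (k in seq_len(n)[-1]) {
    # The next state is drawn from the row of Gamma for the previous state
    y[k] <- sample.int(m, 1, prob = Gamma[y[k - 1], ])
  }
  x <- rpois(n, lambda[y])                               # X_k depends only on Y_k
  list(states = y, emissions = x)
}

set.seed(1)
sim <- simulate_hmm_sketch(200,
                           Gamma  = rbind(c(0.9, 0.1), c(0.1, 0.9)),
                           delta  = c(0.5, 0.5),
                           lambda = c(10, 30))
table(sim$states)
```

The hidden states are generated first, and each emission is then drawn from the distribution belonging to its state.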

As such, to estimate parameters in this model, one must estimate not only the parameters of the m emission distributions, but also the parameters of the Markov chain, i.e. the initial distribution delta (that is, Y_1 ~ delta) and the transition matrix Gamma (that is, P(Y_k=j | Y_(k-1)=i)=Gamma_i,j).
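The likelihood of such a model, as a function of delta, Gamma and the emission parameters, can be evaluated with the standard forward recursion. The sketch below does this for Poisson emissions, rescaling at each step to avoid numerical underflow. The name 'hmm_loglik_sketch' is hypothetical, and the package may compute its log-likelihood differently.

```r
# Scaled forward recursion for the log-likelihood of a Poisson HMM.
# hmm_loglik_sketch is a hypothetical name, not a function of the package.
hmm_loglik_sketch <- function(x, Gamma, delta, lambda) {
  # dens[t, k] = P(X_t = x[t] | Y_t = k)
  dens <- outer(x, lambda, dpois)
  alpha <- delta * dens[1, ]              # joint probability of x[1] and Y_1 = k
  loglik <- log(sum(alpha))
  alpha <- alpha / sum(alpha)             # rescale to avoid underflow
  for (t in seq_along(x)[-1]) {
    alpha <- as.vector(alpha %*% Gamma) * dens[t, ]
    loglik <- loglik + log(sum(alpha))
    alpha <- alpha / sum(alpha)
  }
  loglik
}

# Toy counts chosen for illustration only
x_toy <- c(12, 9, 31, 28, 11, 10, 33)
hmm_loglik_sketch(x_toy,
                  Gamma  = rbind(c(0.9, 0.1), c(0.1, 0.9)),
                  delta  = c(0.5, 0.5),
                  lambda = c(10, 30))
```

With a single state, or with identical emission parameters in every state, the result reduces to the ordinary i.i.d. Poisson log-likelihood, which gives a simple sanity check.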

## Argument details

The 'dist' and '...' arguments determine the distribution of the emissions in the different states. If 'dist' is one of 'poisson', 'normal', 'binom' or 'exponential', distribution parameters must be passed to the '...' argument. In particular they must have the following format:

If 'dist' is 'custom', the user must provide the following:

| Argument | Description |
|---|---|
| 'lls' | Function or list of functions, where each function is on the form 'f(x, param)', and returns the conditional density of X_i given the parameter 'param'. |
| 'params_lls' | List of parameters, one for each state. Must fit with 'lls' in the sense that 'lls(x, param[[i]])' (or 'lls[[i]](x, param[[i]])' if 'lls' is a list) denotes the density in point x of X_j given Y_j=i. |
| 'lls_mle' | Function or list of functions, which return the maximum likelihood estimates given the provided data 'x' and a vector (of equal length) of scalars 'u' where each scalar is between 0 and 1. That is, the function(s) must be on the form 'h(x, u)' and must return the value of 'param' maximizing 'sum(u * log(f(x, param)))'. |
| 'rdist' | Function or list of functions that generate random emissions from the different hidden states. The function(s) must be on the form 'r(n, param)' and must return a vector of 'n' random realizations given the parameter 'param'. |

For each of 'lls', 'lls_mle' and 'rdist', if the distribution family itself (and not just the parameters) depends on the hidden state, a list of functions must be provided, one for each state.
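As an illustration of the 'lls_mle' requirement, the sketch below takes an exponential emission density and checks that a closed-form 'h(x, u)' agrees with a numerical maximization of 'sum(u * log(f(x, param)))'. The names 'lls_exp', 'mle_exp' and 'objective' are chosen here for illustration, and the weights 'u' are random stand-ins for the state probabilities that the EM algorithm would supply.

```r
# Exponential density on the form f(x, param) and its weighted MLE on the form h(x, u).
# Setting the derivative of sum(u * (log(rate) - rate * x)) to zero gives
# rate = sum(u) / sum(u * x).
lls_exp <- function(x, param) dexp(x, rate = param)
mle_exp <- function(x, u) sum(u) / sum(u * x)

set.seed(42)
x <- rexp(50, rate = 2)
u <- runif(50)   # stand-in weights in [0, 1]; an assumption for this sketch

# Weighted log-likelihood that 'lls_mle' is required to maximize
objective <- function(param) sum(u * log(lls_exp(x, param)))
numeric_opt <- optimize(objective, interval = c(1e-3, 50), maximum = TRUE)$maximum

c(closed_form = mle_exp(x, u), numeric = numeric_opt)
```

The same kind of check can be run on any custom 'lls' and 'lls_mle' pair before passing them to 'hmm'.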

Finally, '...' also takes arguments passed to the EM-algorithm, namely: 'epsilon', .

An object of class 'hmm'. The 'hmm'-class is equipped with a variety of default methods, see 'See also' section for details. An object of class 'hmm' is a list containing at least the following components:

If 'x' is not 'NULL', it will also include:

Finally, if estimation is performed, it will also include the following:

The 'hmm' class has methods for the following generic functions: 'AIC', 'BIC', 'fitted.values', 'logLik', 'plot', 'print', 'residuals', 'simulate' and 'summary'. Most of these are only available when data is provided, i.e. when 'x' is not 'NULL'.

# Annual counts of earthquakes magnitude 7 or greater, 1900-2006.
# Source:
# Earthquake Data Base System of the U.S. Geological Survey, National
# Earthquake Information Center, Golden CO

quakes <- read.table("http://www.hmms-for-time-series.de/second/data/earthquakes.txt")$V2
Gamma <- rbind(c(0.9, 0.1), c(0.1, 0.9))
delta <- c(1, 1)/2
lambda <- c(10, 30)
hmm.EQ <- hmm(quakes, Gamma, delta, dist='poisson', lambda=lambda)
hmm.EQ

# If one does not want estimation by EM algorithm (e.g. for comparison of summary statistics), it can be disabled
hmm.EQ_no_opt <- hmm(quakes, Gamma, delta, dist='poisson', lambda=lambda, estimate=FALSE)
hmm.EQ_no_opt

# Creating 'empty' hmm object for sake of simulation (see simulate for further details)
# Here, all emission distributions are normal
Gamma <- rbind(c(0.5, 0.25, 0.25),
               c(0.1, 0.8 , 0.1),
               c(  0, 0.2 , 0.8))
delta <- c(1, 0, 0)
mean <- c(0, 5, 10)
sd <- rep(1, 3)

hmm.normal <- hmm(NULL, Gamma=Gamma, delta=delta, dist='normal', mean=mean, sd=sd)
hmm.normal

# Here, the emission distributions are custom (Uniform[0, theta])
Gamma <- rbind(c(0.5, 0.25, 0.25),
               c(0.1, 0.8 , 0.1),
               c(  0, 0.2 , 0.8))
delta <- c(1, 0, 0)
theta <- list(1, 5, 10)
lls <- function(x, param){dunif(x, 0, param)}
# Weighted MLE of theta: largest observation with positive weight
lls_mle <- function(x, u){max(x[u > 0])}
rdist <- function(n, param){runif(n, 0, param)}
hmm.unif <- hmm(NULL, Gamma=Gamma, delta=delta, lls=lls, param_lls=theta, lls_mle=lls_mle, rdist=rdist)
hmm.unif

# Here, the emission distribution is either normal(0, 1) or exponential(1), depending on the state
Gamma <- rbind(c(0.2, 0.8),
               c(0.8, 0.2))
delta <- c(1, 1)/2
param <- list(c(0, 1), 1)

lls <- list(function(x, param){dnorm(x, param[1], param[2])},
            function(x, param){dexp(x, param)})

lls_mle <- list(function(x, u){mean_hat <- sum(u*x) / sum(u); c(mean_hat, sqrt(sum(u*(x-mean_hat)^2) / sum(u)))},
                function(x, u){sum(u)/sum(u*x)})

rdist <- list(function(n, param){rnorm(n, param[1], param[2])},
              function(n, param){rexp(n, param)})

hmm.mixture <- hmm(NULL, Gamma=Gamma, delta=delta, lls=lls, param_lls=param, lls_mle=lls_mle, rdist=rdist)
hmm.mixture