HMM | R Documentation |
Encapsulates the observation and hidden state models for a hidden Markov model.
new()
Create new HMM object
HMM$new(obs = NULL, hid = NULL, file = NULL, init = NULL, fixpar = NULL)
obs
Observation object, created with Observation$new()
.
This contains the formulation for the observation model.
hid
MarkovChain object, created with MarkovChain$new()
.
This contains the formulation for the state process model.
file
Path to specification file for HMM. If this argument is
used, then obs
and hid
are unnecessary.
init
HMM object, used to initialise the parameters for this model.
If init
is passed, then all parameters that are included in init
and in the present model are copied. This may be useful when fitting
increasingly complex models: start from a simple model, then pass it as
init to create a more complex model, and so on.
fixpar
Named list, with optional elements: 'hid', 'obs', 'delta0',
'lambda_obs', and 'lambda_hid'. Each element is a named vector of
parameters in coeff_fe that should either be fixed (if the corresponding
element is set to NA) or estimated to a common value (using integers or
factor levels).don See examples in the vignettes, and check the TMB
documentation to understand the inner workings (argument map
of TMB::MakeADFun()
).
A new HMM object
# Load data set (included with R) data(nottem) data <- data.frame(temp = as.vector(t(nottem))) # Create hidden state and observation models hid <- MarkovChain$new(data = data, n_states = 2) par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5))) obs <- Observation$new(data = data, n_states = 2, dists = list(temp = "norm"), par = par0) # Create HMM hmm <- HMM$new(hid = hid, obs = obs)
obs()
Observation object for this model
HMM$obs()
hid()
MarkovChain object for this model
HMM$hid()
out()
Output of optimiser after model fitting
HMM$out()
tmb_obj()
Model object created by TMB. This is the output of
the TMB function MakeADFun
, and it is a list including elements
fn
Objective function
gr
Gradient function of fn
par
Vector of initial parameters on working scale
HMM$tmb_obj()
tmb_obj_joint()
Model object created by TMB for the joint likelihood of
the fixed and random effects. This is the output of the TMB function
MakeADFun
, and it is a list including elements
fn
Objective function
gr
Gradient function of fn
par
Vector of initial parameters on working scale
HMM$tmb_obj_joint()
tmb_rep()
Output of the TMB function sdreport
, which includes
estimates and standard errors for all model parameters.
HMM$tmb_rep()
states()
Vector of estimated states, after viterbi
has
been run
HMM$states()
coeff_fe()
Coefficients for fixed effect parameters
HMM$coeff_fe()
coeff_re()
Coefficients for random effect parameters
HMM$coeff_re()
coeff_list()
List of all model coefficients
These are the parameters estimated by the model, including fixed and random effect parameters for the observation parameters and the transition probabilities, (transformed) initial probabilities, and smoothness parameters.
HMM$coeff_list()
fixpar()
Fixed parameters
HMM$fixpar(all = FALSE)
all
Logical. If FALSE, only user-specified fixed parameters are returned, but not parameters that are fixed by definition (e.g., size of binomial distribution).
coeff_array()
Array of working parameters
HMM$coeff_array()
lambda()
Smoothness parameters
HMM$lambda()
update_par()
Update parameters stored inside model object
HMM$update_par(par_list = NULL, iter = NULL)
par_list
List with elements for coeff_fe_obs, coeff_fe_hid, coeff_re_obs, coeff_re_hid, log_delta0, log_lambda_hid, and log_lambda_obs
iter
Optional argument to update model parameters based on MCMC iterations (if using rstan). Either the index of the iteration to use, or "mean" if the posterior mean should be used.
sd_re()
Standard deviation of smooth terms (or random effects)
This function transforms the smoothness parameter of each smooth term into a standard deviation, given by SD = 1/sqrt(lambda). It is particularly helpful to get the standard deviations of independent normal random effects.
HMM$sd_re()
List of standard deviations for observation model and hidden state model.
par()
Model parameters
HMM$par(t = 1)
t
returns parameters at time t, default is t = 1
A list with elements:
obspar
Parameters of observation model
tpm
Transition probability matrix of hidden state model
set_priors()
Set priors for coefficients
HMM$set_priors(new_priors = NULL)
new_priors
is a named list of matrices with optional elements coeff_fe_obs, coeff_fe_hid, log_lambda_obs, andlog_lambda_hid. Each matrix has two columns (first col = mean, second col = sd) specifying parameters for normal priors.
priors()
Extract stored priors
HMM$priors()
iters()
Iterations from stan MCMC fit
HMM$iters(type = "response")
type
Either "response" for parameters on the response (natural) scale, or "raw" for parameters on the linear predictor scale.
see output of as.matrix in stan
out_stan()
fitted stan object from MCMC fit
HMM$out_stan()
the stanfit object
llk()
Log-likelihood at current parameters
HMM$llk()
Log-likelihood
edf()
Effective degrees of freedom
HMM$edf()
Number of effective degrees of freedom (accounting for flexibility in non-parametric terms implied by smoothing)
suggest_initial()
Suggest initial parameter values
Uses K-means clustering to split the data into naive "states", and estimates observation parameters within each of these states. This is meant to help with selecting initial parameter values before model fitting, but users should still think about the values carefully, and try multiple set of initial parameter values to ensure convergence to the global maximum of the likelihood function.
HMM$suggest_initial()
List of initial parameters
setup()
TMB setup
This creates an attribute tmb_obj
, which can be used to
evaluate the negative log-likelihood function.
HMM$setup(silent = TRUE)
silent
Logical. If TRUE, all tracing outputs are hidden (default).
fit_stan()
Fit model using tmbstan
Consult documentation of the tmbstan package for more information.
After this method has been called, the Stan output can be accessed
using the method out_stan()
. This Stan output can for example
be visualised using functions from the rstan package. The parameters
stored in this HMM object are automatically updated to the mean
posterior estimate, although this can be changed using
update_par()
.
HMM$fit_stan(..., silent = FALSE)
...
Arguments passed to tmbstan
silent
Logical. If FALSE, all tracing outputs are shown (default).
fit()
Model fitting
The negative log-likelihood of the model is minimised using the
function nlminb()
. TMB uses the Laplace approximation to integrate
the random effects out of the likelihood.
After the model has been fitted, the output of nlminb()
can be
accessed using the method out()
. The estimated parameters can
be accessed using the methods par()
(for the HMM parameters,
possibly dependent on covariates), predict()
(for uncertainty
quantification and prediction of the HMM parameters for new covariate
values), coeff_fe()
(for estimated fixed effect coefficients on
the linear predictor scale), and coeff_re()
(for estimated random
effect coefficients on the linear predictor scale).
HMM$fit(silent = FALSE, ...)
silent
Logical. If FALSE, all tracing outputs are shown (default).
...
Other arguments to nlminb which is used to optimise the
likelihood. This currently only supports the additional argument
control
, which is a list of control parameters such as
eval.max
and iter.max
(see ?nlminb
)
# Load data set (included with R) data(nottem) data <- data.frame(temp = as.vector(t(nottem))) # Create hidden state and observation models hid <- MarkovChain$new(data = data, n_states = 2) par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5))) obs <- Observation$new(data = data, n_states = 2, dists = list(temp = "norm"), par = par0) # Create HMM hmm <- HMM$new(hid = hid, obs = obs) # Fit HMM hmm$fit(silent = TRUE)
mle()
Get maximum likelihood estimates once model fitted
HMM$mle()
list of maximum likelihood estimates as described as input for the function update_par()
forward_backward()
Forward-backward algorithm
The forward probability for time step t and state j
is the joint pdf/pmf of observations up to time t and of being in
state j at time t, p(Z[1], Z[2], ..., Z[t], S[t] = j).
The backward probability for time t and state j is the
conditional pdf/pmf of observations between time t + 1 and n,
given state j at time t, p(Z[t+1], Z[t+2], ..., Z[n] | S[t] = j).
This function returns their logarithm, for use in other methods
state_probs
, and sample_states
.
HMM$forward_backward()
Log-forward and log-backward probabilities
pseudores()
Pseudo-residuals
Compute pseudo-residuals for the fitted model. If the fitted model is the "true" model, the pseudo-residuals follow a standard normal distribution. Deviations from normality suggest lack of fit.
HMM$pseudores()
List (of length the number of variables), where each element is a vector of pseudo-residuals (of length the number of data points)
viterbi()
Viterbi algorithm
HMM$viterbi()
Most likely state sequence
sample_states()
Sample posterior state sequences using forward-filtering backward-sampling
The forward-filtering backward-sampling algorithm returns a sequence of states, similarly to the Viterbi algorithm, but it generates it from the posterior distribution of state sequences, i.e., accounting for uncertainty in the state classification. Multiple generated sequences will therefore generally not be the same.
HMM$sample_states(nsamp = 1, full = FALSE)
nsamp
Number of samples to produce
full
If TRUE and model fit by fit_stan then parameter estimates are sampled from the posterior samples before simulating each sequence
Matrix where each column is a different sample of state sequences, and each row is a time of observation
state_probs()
Compute posterior probability of being in each state
HMM$state_probs()
matrix with a row for each observation and a column for each state
post_coeff()
Posterior sampling for model coefficients
HMM$post_coeff(n_post)
n_post
Number of posterior samples
Matrix with one column for each coefficient and one row for each posterior draw
post_linpred()
Posterior sampling for linear predictor
HMM$post_linpred(n_post)
n_post
Number of posterior samples
List with elements obs and hid, where each is a matrix with one column for each predictor and one row for each posterior draw
post_fn()
Create posterior simulations of a function of a model component
HMM$post_fn(fn, n_post, comp = NULL, ..., level = 0, return_post = FALSE)
fn
Function which takes a vector of linear predictors as input and produces either a scalar or vector output
n_post
Number of posterior simulations
comp
Either "obs" for observation model linear predictor, or "hid" for hidden model linear predictor
...
Arguments passed to fn
level
Confidence interval level if required (e.g., 0.95 for 95 confidence intervals). Default is 0, i.e., confidence intervals are not returned.
return_post
Logical indicating whether to return the posterior samples. If FALSE (default), only mean estimates and confidence intervals are returned
A list with elements:
If return_post = TRUE, this is a vector (for scalar outputs of fn) or matrix (for vector outputs) with a column for each simulation
Mean over posterior samples
Lower confidence interval bound (if level !=0)
Upper confidence interval bound (if level !=0)
predict()
Predict estimates from a fitted model
By default, this returns point estimates of the HMM parameters for a new data frame of covariates. See the argument 'n_post' to also get confidence intervals.
HMM$predict( what, t = 1, newdata = NULL, n_post = 0, level = 0.95, return_post = FALSE, as_list = TRUE )
what
Which estimates to predict? Options include transition probability matrices "tpm", stationary distributions "delta", or observation distribution parameters "obspar"
t
Time points to predict at
newdata
New dataframe to use for prediction
n_post
If greater than zero then n_post posterior samples are produced, and used to create confidence intervals.
level
Level of the confidence intervals, e.g. CI = 0.95 will produce 95% confidence intervals (default)
return_post
Logical. If TRUE, a list of posterior samples is returned.
as_list
Logical. If confidence intervals are required for the
transition probabilities or observation parameters, this
argument determines whether the MLE, lower confidence limit and upper
confidence limit are returned as separate elements in a list (if
TRUE; default), or whether they are combined into a single array (if
FALSE). Ignored if what = "delta"
or if n_post = 0
.
...
Other arguments to the respective functions for hid$tpm, hid$delta, obs$par
Maximum likelihood estimates (mle
) of predictions,
and confidence limits (lcl
and ucl
) if requested. The
format of the output depends on whether confidence intervals are
required (specified through n_post
), and on the argument
as_list
.
# Load data set (included with R) data(nottem) data <- data.frame(temp = as.vector(t(nottem))) # Create hidden state and observation models hid <- MarkovChain$new(data = data, n_states = 2) par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5))) obs <- Observation$new(data = data, n_states = 2, dists = list(temp = "norm"), par = par0) # Create HMM hmm <- HMM$new(hid = hid, obs = obs) # Fit HMM hmm$fit(silent = TRUE) # Get transition probability matrix with confidence intervals hmm$predict(what = "tpm", n_post = 1000)
confint()
Confidence intervals for working parameters
This function computes standard errors for all fixed effect model parameters based on the diagonal of the inverse of the Hessian matrix, and then derives Wald-type confidence intervals.
HMM$confint(level = 0.95)
level
Level of confidence intervals. Defaults to 0.95, i.e., 95% confidence intervals.
List of matrices with three columns: mle (maximum likelihood estimate), lcl (lower confidence limit), ucl (upper confidence limit), and se (standard error). One such matrix is produced for the working parameters of the observation model, the working parameters of the hidden state model, the smoothness parameters of the observation model, and the smoothness parameters of the hidden state model.
simulate()
Simulate from hidden Markov model
HMM$simulate(n, data = NULL, silent = FALSE)
n
Number of time steps to simulate
data
Optional data frame including covariates
silent
if TRUE then no messages are printed
Data frame including columns of data (if provided), and simulated data variables
check()
Compute goodness-of-fit statistics using simulation
Many time series are simulated from the fitted model, and the statistic(s) of interest are calculated for each. A histogram of those values can for example be used to compare to the observed value of the statistic. An observation far in the tails of the distribution of simulated statistics suggests lack of fit.
HMM$check(check_fn, nsims = 100, full = FALSE, silent = FALSE)
check_fn
Goodness-of-fit function which accepts "data" as input and returns a statistic (either a vector or a single number) to be compared between observed data and simulations.
nsims
Number of simulations to perform
full
If model fitted with 'fit_stan', then full = TRUE will sample from posterior for each simulation
silent
Logical. If FALSE, simulation progress is shown. (Default: TRUE)
List with elements:
Vector of values of goodness-of-fit statistics for the observed data
Matrix of values of goodness-of-fit statistics for the simulated data sets (one row for each statistic, and one column for each simulation)
ggplot object
plot_ts()
Time series plot coloured by states
Creates a plot of the data coloured by the most likely state sequence, as estimated by the Viterbi algorithm. If one variable name is passed as input, it is plotted against time. If two variables are passed, they are plotted against each other.
HMM$plot_ts(var, var2 = NULL, line = TRUE)
var
Name of the variable to plot.
var2
Optional name of a second variable, for 2-d plot.
line
Logical. If TRUE (default), lines are drawn between successive data points. Can be set to FALSE if another geom is needed (e.g., geom_point).
A ggplot object
plot_dist()
Plot observation distributions weighted by frequency in Viterbi
This is a wrapper around Observation$plot_dist, where the distribution for each state is weighted by the proportion of time spent in that state (according to the Viterbi state sequence).
HMM$plot_dist(var = NULL)
var
Name of data variable. If NULL, a list of plots are returned (one for each observation variable)
Plot of distributions with data histogram
plot()
Plot a model component
HMM$plot( what, var = NULL, covs = NULL, i = NULL, j = NULL, n_grid = 50, n_post = 1000 )
what
Name of model component to plot: should be one of "tpm" (transition probabilities), "delta" (stationary state probabilities), or "obspar" (state-dependent observation parameters)
var
Name of covariate to plot on x-axis
covs
Optional named list for values of covariates (other than 'var') that should be used in the plot (or dataframe with single row). If this is not specified, the mean value is used for numeric variables, and the first level for factor variables.
i
If plotting tpm then rows of tpm; if plotting delta then indices of states to plot; if plotting obspar then full names of parameters to plot (e.g., obsvar.mean)
j
If plotting tpm then columnss of tpm to plot; if plotting delta then this is ignored,; if plotting obspar then indices of states to plot
n_grid
Number of points in grid over x-axis (default: 50)
n_post
Number of posterior simulations to use when computing
confidence intervals; default: 1000. See predict
function for
more detail.
A ggplot object
AIC_marginal()
Marginal Akaike Information Criterion
The marginal AIC is for example defined by Wood (2017), as AIC = - 2L + 2k where L is the maximum marginal log-likelihood (of fixed effects), and k is the number of degrees of freedom of the fixed effect component of the model
HMM$AIC_marginal()
Marginal AIC
AIC_conditional()
Conditional Akaike Information Criterion
The conditional AIC is for example defined by Wood (2017), as AIC = - 2L + 2k where L is the maximum joint log-likelihood (of fixed and random effects), and k is the number of effective degrees of freedom of the model (accounting for flexibility in non-parametric terms implied by smoothing)
HMM$AIC_conditional()
Conditional AIC
print_obspar()
Print observation parameters at t = 1
HMM$print_obspar()
print_tpm()
Print observation parameters at t = 1
HMM$print_tpm()
formulation()
Print model formulation
HMM$formulation()
print()
Print HMM object
HMM$print()
clone()
The objects of this class are cloneable with this method.
HMM$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `HMM$new`
## ------------------------------------------------
# Load data set (included with R)
data(nottem)
data <- data.frame(temp = as.vector(t(nottem)))
# Create hidden state and observation models
hid <- MarkovChain$new(data = data, n_states = 2)
par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5)))
obs <- Observation$new(data = data, n_states = 2,
dists = list(temp = "norm"),
par = par0)
# Create HMM
hmm <- HMM$new(hid = hid, obs = obs)
## ------------------------------------------------
## Method `HMM$fit`
## ------------------------------------------------
# Load data set (included with R)
data(nottem)
data <- data.frame(temp = as.vector(t(nottem)))
# Create hidden state and observation models
hid <- MarkovChain$new(data = data, n_states = 2)
par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5)))
obs <- Observation$new(data = data, n_states = 2,
dists = list(temp = "norm"),
par = par0)
# Create HMM
hmm <- HMM$new(hid = hid, obs = obs)
# Fit HMM
hmm$fit(silent = TRUE)
## ------------------------------------------------
## Method `HMM$predict`
## ------------------------------------------------
# Load data set (included with R)
data(nottem)
data <- data.frame(temp = as.vector(t(nottem)))
# Create hidden state and observation models
hid <- MarkovChain$new(data = data, n_states = 2)
par0 <- list(temp = list(mean = c(40, 60), sd = c(5, 5)))
obs <- Observation$new(data = data, n_states = 2,
dists = list(temp = "norm"),
par = par0)
# Create HMM
hmm <- HMM$new(hid = hid, obs = obs)
# Fit HMM
hmm$fit(silent = TRUE)
# Get transition probability matrix with confidence intervals
hmm$predict(what = "tpm", n_post = 1000)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.