View source: R/initial_parameter_training.R
initial_parameter_training | R Documentation |
The function computes plausible starting values for both the Baum-Welch algorithm and the algorithm for directly maximizing the log-likelihood. Plausible starting values can reduce the risk of (i) numerical instability and (ii) convergence to a local rather than the global optimum.
initial_parameter_training(
x,
m,
distribution_class,
n = 100,
discr_logL = FALSE,
discr_logL_eps = 0.5
)
x |
a vector object containing the time-series of observations that are assumed to be realizations of the (hidden Markov state dependent) observation process of the model. |
m |
a (finite) number of states in the hidden Markov chain. |
distribution_class |
a single character string object with the abbreviated name of
the $m$ observation distributions of the Markov dependent observation process.
The following distributions are supported: Poisson ( |
n |
a single numerical value specifying the number of samples to find the best
starting value for the training algorithm. Default value is |
discr_logL |
a logical object. Default is |
discr_logL_eps |
a single numerical value, used to approximate the discrete
log-likelihood for a hidden Markov model based on normal distributions
(for |
In our experience, parameter estimation for long time-series of observations (T > 1000) or for large observation values (> 1500) tends to be numerically unstable and does not necessarily find a global maximum. Both problems can potentially be diminished with plausible starting values. The idea behind initial_parameter_training is to randomly sample n sets of m observations from the time-series x and to use them as means (E) of the state-dependent distributions. These n samplings of E induce n sets of parameters (distribution_theta) for the HMM without running a (slow) parameter estimation algorithm. initial_parameter_training then calculates the log-likelihood for each of these n sets of parameters. The set of parameters with the largest likelihood is returned as plausible starting values. (In addition to the n sets of randomly chosen observations, the m quantiles of the observations are also checked as plausible means within this algorithm.)
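The sampling idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it assumes Poisson state-dependent distributions, uses evenly spaced quantile levels for the quantile-based candidate, and floors the candidate means at 0.1 to avoid a zero Poisson rate. The helper names pois_hmm_logL and sketch_starting_means are hypothetical.

```r
# Log-likelihood of a Poisson HMM via the scaled forward algorithm.
# (Assumption: Poisson state-dependent distributions with means `lambda`.)
pois_hmm_logL <- function(x, lambda, delta, gamma) {
  phi <- delta * dpois(x[1], lambda)
  logL <- log(sum(phi))
  phi <- phi / sum(phi)
  for (t in seq_along(x)[-1]) {
    phi <- as.vector(phi %*% gamma) * dpois(x[t], lambda)
    logL <- logL + log(sum(phi))   # accumulate the log of the scaling factor
    phi <- phi / sum(phi)          # rescale to avoid numerical underflow
  }
  logL
}

# Hypothetical helper: evaluate n random candidate sets of m means plus one
# quantile-based set, and keep the candidate with the largest log-likelihood.
sketch_starting_means <- function(x, m, n = 100) {
  delta <- rep(1 / m, times = m)
  gamma <- 0.8 * diag(m) + rep(0.2 / m, times = m)
  # n candidate sets: m observations drawn at random from x, sorted
  candidates <- lapply(seq_len(n), function(i) sort(sample(x, size = m)))
  # one extra candidate from quantiles (assumed levels: i / (m + 1))
  candidates[[n + 1]] <- as.numeric(quantile(x, probs = seq_len(m) / (m + 1)))
  logLs <- vapply(
    candidates,
    # floor at 0.1 (assumption) so that no Poisson mean is exactly zero
    function(E) pois_hmm_logL(x, pmax(E, 0.1), delta, gamma),
    numeric(1)
  )
  best <- which.max(logLs)
  list(E = candidates[[best]], logL = logLs[best])
}

set.seed(1)
x_demo <- rpois(200, lambda = rep(c(3, 20), each = 100))
res <- sketch_starting_means(x_demo, m = 2, n = 50)
print(res)
```

Each candidate costs only one forward pass through the data, which is why screening many candidates this way is much cheaper than running a full estimation from each of them.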
The function initial_parameter_training returns a list containing the following components:
input number of states in the hidden Markov chain.
a single numerical value representing the number of parameters of the defined distribution class of the observation process.
log-likelihood of the model evaluated at the HMM with the starting values (delta, gamma, distribution_theta) induced by E.
randomly chosen means of the observation time-series x, used for the observation distributions, for which the induced parameters (delta, gamma, distribution_theta) produce the largest likelihood.
a list object containing the plausible starting values for the parameters of the m observation distributions that are dependent on the hidden Markov state.
a vector object containing plausible starting values for the marginal probability distribution of the m states of the Markov chain at the time point t=1. At the moment: delta = rep(1/m, times=m).
a matrix (nrow=ncol=m) containing the plausible starting values for the transition matrix of the hidden Markov chain. At the moment: gamma = 0.8 * diag(m) + rep(0.2/m, times=m).
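As a quick check of the two default expressions quoted above, the following base-R snippet evaluates them for m = 4 (a value chosen only for illustration). The vector rep(0.2/m, times=m) is recycled over the whole matrix, so 0.2/m is added to every entry.

```r
m <- 4  # illustrative number of states

# Uniform initial distribution over the m states
delta <- rep(1 / m, times = m)

# 0.8 on the diagonal, plus 0.2 / m added to every entry via recycling
gamma <- 0.8 * diag(m) + rep(0.2 / m, times = m)

print(delta)
print(gamma)
print(rowSums(gamma))  # each row of a transition matrix should sum to 1
```

For m = 4 the diagonal entries are 0.85 and the off-diagonal entries are 0.05, so each state is strongly persistent while every transition keeps a positive probability.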
Vitali Witowski (2013).
Baum_Welch_algorithm, direct_numerical_maximization, HMM_training
x <- c(1,16,19,34,22,6,3,5,6,3,4,1,4,3,5,7,9,8,11,11,
14,16,13,11,11,10,12,19,23,25,24,23,20,21,22,22,18,7,
5,3,4,3,2,3,4,5,4,2,1,3,4,5,4,5,3,5,6,4,3,6,4,8,9,12,
9,14,17,15,25,23,25,35,29,36,34,36,29,41,42,39,40,43,
37,36,20,20,21,22,23,26,27,28,25,28,24,21,25,21,20,21,
11,18,19,20,21,13,19,18,20,7,18,8,15,17,16,13,10,4,9,
7,8,10,9,11,9,11,10,12,12,5,13,4,6,6,13,8,9,10,13,13,
11,10,5,3,3,4,9,6,8,3,5,3,2,2,1,3,5,11,2,3,5,6,9,8,5,
2,5,3,4,6,4,8,15,12,16,20,18,23,18,19,24,23,24,21,26,
36,38,37,39,45,42,41,37,38,38,35,37,35,31,32,30,20,39,
40,33,32,35,34,36,34,32,33,27,28,25,22,17,18,16,10,9,
5,12,7,8,8,9,19,21,24,20,23,19,17,18,17,22,11,12,3,9,
10,4,5,13,3,5,6,3,5,4,2,5,1,2,4,4,3,2,1)
# Find plausible starting values for the parameter estimation
# of a generalized-Poisson HMM with m = 4 states
m <- 4
plausible_starting_values <-
  initial_parameter_training(x = x,
                             m = m,
                             distribution_class = "genpois",
                             n = 100)
print(plausible_starting_values)