HMM_simulation: Generating Realizations of a Hidden Markov Model


Description

This function generates a sequence of hidden states of a Markov chain and a corresponding parallel sequence of observations.

Usage

HMM_simulation(size, m, delta = rep(1 / m, times = m), 
               gamma = 0.8 * diag(m) + rep(0.2 / m, times = m), 
               distribution_class, distribution_theta, 
               obs_range = c(NA, NA), obs_round = FALSE, 
               obs_non_neg = FALSE, plotting = 0)

Arguments

size

length of the time-series of hidden states and observations (also T).

m

a (finite) number of states in the hidden Markov chain.

delta

a vector object containing the marginal probability distribution of the m states of the Markov chain at the time point t=1. Default is delta=rep(1/m,times=m).

gamma

a matrix (nrow=ncol=m) containing the transition matrix of the hidden Markov chain. Default is gamma=0.8 * diag(m) + rep(0.2 / m, times = m).

distribution_class

a single character string object with the abbreviated name of the m observation distributions of the Markov dependent observation process. The following distributions are supported by this algorithm: Poisson (pois); generalized Poisson (genpois); normal (norm, discrete log-Likelihood not applicable by this algorithm); geometric (geom).

distribution_theta

a list object containing the parameters of the m observation distributions that are dependent on the hidden Markov state.

obs_range

a vector object specifying the range for the observations to be generated. For instance, the vector c(0,1500) allows only observations between 0 and 1500 to be generated by the HMM. Default value is c(NA,NA), which imposes no restriction. See Note for further details.

obs_round

a logical object. TRUE if all generated observations are to be rounded to whole numbers. Default value is FALSE. See Note for further details.

obs_non_neg

a logical object. TRUE if only non-negative observations are to be generated. Default value is FALSE. See Note for further details.

plotting

a numeric value between 0 and 5 (generates different outputs). NA suppresses graphical output. Default value is 0.
0: output 1-5
1: summary of all results
2: generated time series of states of the hidden Markov chain
3: means (of the observation distributions, which depend on the states of the Markov chain) along the time series of states of the hidden Markov chain
4: observations along the time series of states of the hidden Markov chain
5: simulated observations

Value

The function HMM_simulation returns a list containing the following components:

size

length of the generated time-series of hidden states and observations.

m

input number of states in the hidden Markov chain.

delta

a vector object containing the chosen values for the marginal probability distribution of the m states of the Markov chain at the time point t=1.

gamma

a matrix containing the chosen values for the transition matrix of the hidden Markov chain.

distribution_class

a single character string object with the abbreviated name of the chosen observation distributions of the Markov dependent observation process.

distribution_theta

a list object containing the chosen values for the parameters of the m observation distributions that are dependent on the hidden Markov state.

markov_chain

a vector object containing the generated sequence of states of the hidden Markov chain of the HMM.

means_along_markov_chain

a vector object containing the sequence of means (of the state dependent distributions) corresponding to the generated sequence of states.

observations

a vector object containing the generated sequence of (state dependent) observations of the HMM.
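
To illustrate how the components markov_chain, means_along_markov_chain and observations relate to each other, here is a minimal base-R sketch for the Poisson case. It is not the code used inside HMM_simulation; the values of size, m and lambda are chosen only for illustration.

```r
# Illustrative sketch only: NOT the HMMpa implementation.
set.seed(1)

size   <- 10                       # length of the series
m      <- 3                        # number of hidden states
delta  <- rep(1 / m, times = m)    # distribution of the state at t = 1
gamma  <- 0.8 * diag(m) + rep(0.2 / m, times = m)  # transition matrix
lambda <- c(10, 25, 55)            # hypothetical Poisson means, one per state

# Draw the hidden state sequence: the first state from delta, each later
# state from the row of gamma that belongs to the previous state.
markov_chain <- integer(size)
markov_chain[1] <- sample.int(m, size = 1, prob = delta)
for (t in 2:size) {
  markov_chain[t] <- sample.int(m, size = 1,
                                prob = gamma[markov_chain[t - 1], ])
}

# Mean of the state-dependent distribution at each time point.
means_along_markov_chain <- lambda[markov_chain]

# One Poisson observation per time point, with the mean set by the state.
observations <- rpois(size, lambda = means_along_markov_chain)
```

The three vectors have the same length, and the entry at position t of each describes the same time point.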

Note

Some notes regarding the default values:

gamma:
The default setting assigns higher probabilities for remaining in a state than changing into another.
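
The structure of the default can be checked directly in base R. The sketch below evaluates the default expression for m = 3 (a value chosen only for illustration): the vector rep(0.2 / m, times = m) is recycled over the matrix, so 0.2 / m is added to every entry.

```r
m <- 3
gamma <- 0.8 * diag(m) + rep(0.2 / m, times = m)

# Each diagonal entry is 0.8 + 0.2 / m (probability of staying in a state);
# every off-diagonal entry is 0.2 / m (probability of moving to one
# particular other state).
stay_prob   <- diag(gamma)
switch_prob <- gamma[upper.tri(gamma)]

# Every row is a probability distribution and sums to 1.
row_sums <- rowSums(gamma)
```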

obs_range:
Has to be used with caution, since it manipulates the results of the HMM. If a value for an observation at time t is generated outside the defined range, it will be regenerated until it falls into obs_range. It is possible to define just one boundary, e.g. obs_range=c(NA,2000) for observations no higher than 2000, or obs_range=c(100,NA) for observations no lower than 100.

obs_round:
Has to be used with caution! Rounds each generated observation and hence manipulates the results of the HMM (important for the normal distribution based HMM).

obs_non_neg:
Has to be used with caution, since it manipulates the results of the HMM. If a negative value for an observation at a time t is generated, it will be regenerated until it is non-negative (important for the normal distribution based HMM).
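
Why these three options call for caution can be seen in a small base-R sketch of the regeneration idea for a single normal distribution. The helper draw_restricted is hypothetical and not part of HMMpa; in particular, applying the rounding before the range checks is an assumption made here for illustration.

```r
# Hypothetical helper for illustration; NOT a function of the HMMpa package.
draw_restricted <- function(mean, sd, obs_range = c(NA, NA),
                            obs_round = FALSE, obs_non_neg = FALSE) {
  repeat {
    x <- rnorm(1, mean = mean, sd = sd)
    if (obs_round) x <- round(x)   # assumed order: round first, then check
    too_low  <- !is.na(obs_range[1]) && x < obs_range[1]
    too_high <- !is.na(obs_range[2]) && x > obs_range[2]
    negative <- obs_non_neg && x < 0
    # Keep the draw only if it violates none of the restrictions;
    # otherwise draw again.
    if (!too_low && !too_high && !negative) return(x)
  }
}

set.seed(1)
restricted <- replicate(1000, draw_restricted(mean = 0, sd = 200,
                                              obs_range = c(NA, 2000),
                                              obs_non_neg = TRUE))

# All values respect the restrictions, but their sample mean lies clearly
# above the nominal mean of 0, because negative draws were discarded.
range(restricted)
mean(restricted)
```

The retained values no longer follow the nominal normal distribution, which is the sense in which the restrictions manipulate the results.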

Author(s)

Vitali Witowski (2013).

See Also

AIC_HMM, BIC_HMM, HMM_training

Examples

################################################################
### i.) Generating a HMM with Poisson-distributed data #########
################################################################

Pois_HMM_data <- 
   HMM_simulation(size = 300, 
      m = 5, 
      distribution_class = "pois", 
      distribution_theta = list( lambda=c(10,15,25,35,55)))

print(Pois_HMM_data)

################################################################
### ii.) Generating 6 physical activities with normally ########
###      distributed accelerometer counts using a HMM. #########
################################################################

## Define number of time points (1440 counts equal 6 hours of 
## activity counts assuming an epoch length of 15 seconds).
size <- 1440

## Define 6 possible physical activity ranges
m <- 6

## Start with the lowest possible state 
## (in this case with the lowest physical activity)
delta <- c(1, rep(0, times = (m - 1)))

## Define transition matrix to generate according to a 
## specific activity 
gamma <- 0.935 * diag(m) + rep(0.065 / m, times = m)

## Define parameters 
## (here: means and standard deviations for m=6 normal 
##  distributions that define the distribution in 
##  a physical activity level)
distribution_theta <- list(mean = c(0,100,400,600,900,1200), 
   sd = rep(x = 200, times = 6))

### Assume for each count an upper boundary of 2000
obs_range <- c(NA, 2000)

### Accelerometer counts shall not be negative
obs_non_neg <- TRUE

### Start simulation

accelerometer_data <- 
   HMM_simulation(size = size, 
     m = m, 
     delta = delta, 
     gamma = gamma, 
     distribution_class = "norm", 
     distribution_theta = distribution_theta, 
     obs_range = obs_range, 
     obs_non_neg= obs_non_neg, plotting=0)

print(accelerometer_data)

HMMpa documentation built on May 2, 2019, 7:58 a.m.