options(width = 999)
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)

The stat2003 package provides functions for working with discrete-time Markov chains and linear birth-death-immigration-emigration processes (linear BDIE processes). It is aimed at students taking STAT2003 at UCL and is intended to help them study the course and understand Markov chains more clearly. The package contains simulation functions, which simulate discrete-time Markov chains and linear BDIE processes and return a graph. These are particularly useful because the theorems covered in lectures are quite abstract: students can supply their own Markov chains to the simulation functions and see how those chains behave, and they can also use stat2003 to check their answers to exercises. The package also contains calculation functions, which compute invariant distributions and equilibrium distributions, classify the behaviour of each state, and group states into irreducible classes. In total there are ten functions in the package. Note: this vignette only describes how to use the functions; for details of the arguments required by each function, please refer to the help files, and for definitions of technical terms and theoretical results, please refer to the STAT2003 notes.

Functions for discrete-time finite statespace Markov chains

Classes in R are very useful: a class acts like a storage container. In stat2003, the class stat2003.d is designed to store a homogeneous discrete-time finite statespace Markov chain. It requires the transition matrix, the initial distribution and the statespace of the Markov chain. stat2003.d is the common starting point for every discrete-time Markov chain function in this package, so before using the functions whose names start with dmc_, users should first store their discrete-time Markov chain in an object of class stat2003.d. There are six functions for discrete-time Markov chains in this package.

Function dmc_irreclass() finds the irreducible classes of a given finite discrete-time Markov chain. It returns the number of irreducible classes in the input chain and the states in each class. It also identifies whether each class is closed, calculates the period of each class, and infers whether the class is transient or (positive) recurrent. (Finding the period of each state relies on another function, dmc_period().)

Functions dmc_inv() and dmc_equi() focus on the limiting behaviour of the input chain. As their names suggest, dmc_inv() and dmc_equi() find invariant distributions and equilibrium distributions respectively. For the user's convenience, dmc_inv() also calculates the period of each closed class, similarly to part of dmc_irreclass().
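
As a reminder (see the STAT2003 notes), a row vector $\pi$ is an invariant distribution for the transition matrix $P$ if $$ \pi P = \pi, \qquad \sum_{i} \pi_i = 1, \qquad \pi_i \ge 0, $$ and an equilibrium distribution is the common limit of the transient distributions $p^{(0)} P^{n}$ as $n \to \infty$, when that limit exists and does not depend on the initial distribution $p^{(0)}$.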

Function dmc_tp() calculates the transient probabilities at a specified step of a discrete-time Markov chain, and can also return a black-and-white graph. The graph displays the transient probabilities as grey levels: the darkness of each square is directly proportional to the probability, so a probability of 0 gives a white square and a probability of 1 gives a black square.
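
In symbols, if $p^{(0)}$ is the initial distribution, then the vector of transient probabilities after $n$ steps is $$ p^{(n)} = p^{(0)} P^{n}, $$ which is the quantity that dmc_tp() reports for the requested step.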

Function dmc_simu() simulates the input chain, returning a sequence of possible states at each step, and can also return a plot of states against steps.

Using the code

Consider a discrete-time finite statespace Markov chain with statespace $S = \{A, B, C, D, E, F, G\}$, initial distribution $(1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7)$ and transition matrix $P$
$$ \mathbf{P} = \left(\begin{array}{rrrrrrr} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & \frac13 & 0 & \frac23 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac23 & \frac13 & 0 \\ 0 & \frac23 & 0 & \frac13 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac13 & \frac23 \\ 0 & 0 & 0 & 0 & 0 & \frac14 & \frac34 \end{array}\right). $$

library(stat2003)
P <- matrix(c(0, 1, 0, 0, 0, 0, 0,
              0, 0, 1, 0, 0, 0, 0,
              0, 1/3, 0, 2/3, 0, 0, 0,
              0, 0, 0, 0, 2/3, 1/3, 0,
              0, 2/3, 0, 1/3, 0, 0, 0,
              0, 0, 0, 0, 0, 1/3, 2/3,
              0, 0, 0, 0, 0, 1/4, 3/4), nrow = 7, ncol = 7, byrow = TRUE)

# input the transition matrix, initial distribution and statespace to
# create a new discrete-time finite statespace Markov chain in the class stat2003.d
MC <- new("stat2003.d", p_start = c(1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7), p = P,
          statespace = LETTERS[1:7])

Let us use dmc_irreclass() to find the irreducible classes of MC.

dmc_irreclass(MC)

From the output above, we can see that there are three irreducible classes in MC. In particular, there is only one closed class, which is also ergodic (aperiodic and positive recurrent). Therefore the equilibrium distribution of this Markov chain exists, which also means that there is a unique invariant distribution and it is the same as the equilibrium distribution. Let us find out more by using dmc_inv() and dmc_equi().

dmc_inv(MC, show_mat = FALSE)
dmc_equi(MC)
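
For an independent cross-check without the package, recall that an invariant distribution is a left eigenvector of $P$ for eigenvalue 1, normalised to sum to one. The following is only a minimal base-R sketch for this example, not a stat2003 function.

# Left eigenvectors of P are (right) eigenvectors of t(P).
e <- eigen(t(P))
# Pick the eigenvector whose eigenvalue is numerically closest to 1 ...
v <- Re(e$vectors[, which.min(abs(e$values - 1))])
# ... and normalise it so that its entries sum to one.
v / sum(v)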

The output above confirms the conclusion we just made: the equilibrium distribution of this Markov chain exists and is the same as the unique invariant distribution. Moreover, the information about the closed class returned by dmc_inv() agrees with the result of dmc_irreclass(). Let's use dmc_tp() to calculate and display the transient probabilities at step 10 of this Markov chain.

dmc_tp(MC, 10, graph = TRUE)

From the graph and the output we can conclude that, with initial distribution $(1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7)$, the chain is most likely to be in state G at step ten, while it cannot be in state A (no state has a positive transition probability into A).
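
If you would like to verify this without the package, the step-10 probabilities can be computed directly with base R matrix multiplication (a minimal sketch, not a stat2003 function):

# The distribution after 10 steps is p_start %*% P^10.
p10 <- rep(1/7, 7)
for (i in 1:10) p10 <- p10 %*% P
round(p10, 4)  # the entry for state A is exactly zero; state G has the largest entry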

dmc_simu(MC, 30)

The result above shows one possible path of this Markov chain. Function dmc_simu() can also be used to check whether the output of dmc_equi() is correct: simulate the chain for a large number of steps, keep only the part of the path after a sufficiently long burn-in, and use that part of the path to estimate the probabilities.

simu <- dmc_simu(MC, 10000)
table(simu[5000:10000]) / sum(table(simu[5000:10000]))

We can see that the estimated probabilities are very close to the output of dmc_equi(). (The black area in the simulation graph above appears because a very large number of steps is squeezed onto the x-axis, so the plotted path overlaps with itself.)

An example from a past paper.

EXAM2014 A1 A Markov chain $\{X_0, X_1, \dots\}$ with states $\{1, 2, 3, 4\}$ has transition matrix

$$ \mathbf{P} = \left(\begin{array}{rrrr} \frac13 & 0 & \frac23 & 0 \\ 0 & 0 & \frac12 & \frac12 \\ 1 & 0 & 0 & 0 \\ 0 & \frac14 & 0 & \frac34 \end{array}\right). $$

  1. Find the irreducible classes of intercommunicating states and classify them in terms of positive or null recurrence, transience, periodicity and ergodicity.
  2. Calculate the invariant distribution for this chain.
  3. Calculate $\mu_1$, the mean number of steps for the first return to state 1, given that one starts there.
  4. Calculate $P(X_4 = 2 | X_0 = 2)$.

This is a very straightforward question, and we can use the functions in this package to answer it directly.

# We first store this Markov chain in an object of class stat2003.d.
p <- matrix(c(1/3, 0, 2/3, 0,
              0, 0, 1/2, 1/2,
              1, 0, 0, 0,
              0, 1/4, 0, 3/4), nrow = 4, byrow = TRUE)

# Note that p_start is set to (0, 1, 0, 0), corresponding to P(X_0 = 2) = 1 in part 4 of the question.
A <- new("stat2003.d", p_start = c(0, 1, 0, 0), p = p,
          statespace = as.character(1:4))

#(a)
dmc_irreclass(A)

#(b)
dmc_inv(A, show_mat = FALSE)

#(c)
# From the lecture slides
# we know that, for an irreducible ergodic Markov chain,
# the mean recurrence time of a state is the reciprocal of the
# corresponding element of the invariant distribution.

# From the answer to (a), state 1 lies in the closed ergodic class {1, 3},
# so this result applies to the chain restricted to that class.
# Based on the answer to (b), the mean recurrence time of state 1 is
1/0.6

#(d)
# We have already set the initial distribution to (0, 1, 0, 0),
# so we can calculate this probability directly.
dmc_tp(A, 4)
# The answer is P(X_4 = 2 | X_0 = 2) = 0.0859375.
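
As an extra sanity check of part (d) using only base R (not a package function), note that $P(X_4 = 2 \mid X_0 = 2)$ is the $(2, 2)$ entry of the four-step transition matrix:

# Four-step transition matrix and its (2, 2) entry.
p4 <- p %*% p %*% p %*% p
p4[2, 2]  # 0.0859375, agreeing with dmc_tp(A, 4)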

Functions for linear birth-death-immigration-emigration processes

Class stat2003.bd in the stat2003 package is designed to store a linear BDIE process. Users input the linear birth rate, linear death rate, immigration rate, emigration rate and statespace size to create a linear BDIE process as an object of class stat2003.bd. Similarly to stat2003.d, stat2003.bd is the common starting point for every linear BDIE function in this package, so before using the functions whose names start with bd_, users should first store the process in an object of class stat2003.bd. There are four functions for linear BDIE processes in this package.

First of all, there are two matrix functions: bd_rate() produces the generator matrix (rate matrix) of the input linear BDIE process, and bd_trans() produces the transition matrix of its jump chain.
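
To make the construction concrete, here is a hypothetical base-R sketch of a generator matrix, written independently of the package and assuming the usual linear BDIE parameterisation in which state n has up-jump rate n * lb + im and down-jump rate n * ld + em (for n >= 1); the result can be compared with the output of bd_rate() in the example below.

# Hypothetical helper (not part of stat2003): generator matrix of a linear BDIE
# process on states 0, 1, ..., nstate - 1, assuming birth rate n * lb + im and
# death rate n * ld + em in state n.
bdie_generator <- function(lb, ld, im, em, nstate) {
  Q <- matrix(0, nstate, nstate)
  for (n in 0:(nstate - 1)) {
    if (n < nstate - 1) Q[n + 1, n + 2] <- n * lb + im  # up-jump (birth/immigration)
    if (n > 0)          Q[n + 1, n]     <- n * ld + em  # down-jump (death/emigration)
  }
  diag(Q) <- -rowSums(Q)  # each row of a generator matrix sums to zero
  Q
}
bdie_generator(lb = 2.5, ld = 3, im = 5, em = 1, nstate = 6)

The transition matrix of the jump chain is then obtained by dividing each off-diagonal rate by the total rate out of its state.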

The limiting behaviour of a linear BDIE process depends on, for example, whether the process is irreducible (if the immigration rate is zero, then state 0 is an absorbing state and the process is not irreducible), so finding the equilibrium distribution of a continuous-time Markov process is not as easy as in the discrete-time case. Function bd_equi() helps users to find the equilibrium distribution easily and quickly.
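
Under the same rate assumptions as the sketch above, and for an irreducible finite statespace, the equilibrium distribution can be obtained from the detailed balance relation $\pi_{n} \lambda_{n} = \pi_{n+1} \mu_{n+1}$. A hedged base-R sketch (again, not a package function) is:

# Hypothetical helper: equilibrium distribution of a finite, irreducible linear
# BDIE process via detailed balance, assuming the same rates as above.
bdie_equilibrium <- function(lb, ld, im, em, nstate) {
  n <- 0:(nstate - 1)
  lambda <- n * lb + im  # up-jump rates from states 0, ..., nstate - 1
  mu     <- n * ld + em  # down-jump rates (used for states 1, ..., nstate - 1)
  w <- c(1, cumprod(lambda[-nstate] / mu[-1]))  # pi_n proportional to prod_k lambda_(k-1) / mu_k
  w / sum(w)
}
bdie_equilibrium(lb = 2.5, ld = 3, im = 5, em = 1, nstate = 6)

The result can be compared with bd_equi(BD1) in the example below.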

Function bd_simu() is the simulation function for the continuous-time case. Like dmc_simu(), it simulates the input process, returning a sequence of possible states together with a plot of the simulated path. Function bd_simu() handles not only finite statespace linear BDIE processes, but also infinite statespace ones.

Using the code

Consider an irreducible linear BDIE process, 'BD1', with statespace {0, 1, 2, 3, 4, 5} (so the size of its statespace is 6). The linear birth rate, linear death rate, immigration rate and emigration rate of 'BD1' are 2.5, 3, 5 and 1 respectively. First of all, we need to store this process in an object of class stat2003.bd.

BD1 <- new("stat2003.bd", lb = 2.5, ld = 3, im = 5, em = 1,
            nstate = 6)

Now, let us use bd_rate() and bd_trans() to find the generator matrix of 'BD1' and the transition matrix of its jump chain.

bd_rate(BD1)
bd_trans(BD1)

The equilibrium distribution is:

bd_equi(BD1)

Now, consider another linear BDIE process, 'BD2'. 'BD2' is the same as 'BD1' but with zero immigration rate, so 'BD2' is not irreducible.

BD2 <- new("stat2003.bd", lb = 2.5, ld = 3, im = 0, em = 1,
            nstate = 6)
bd_rate(BD2)
bd_trans(BD2)
bd_equi(BD2)

From the output of bd_trans(), we can see that state 0 is an absorbing state that can be reached from every other state, and the linear birth rate is smaller than the linear death rate; therefore the process is eventually absorbed at 0 and the equilibrium distribution of 'BD2' is (1, 0, 0, 0, 0, 0).

Using bd_simu(), we can obtain a possible path of each of the processes 'BD1' and 'BD2'.

bd_simu(BD1, 15, ini_state = 4, label = 5)
bd_simu(BD2, 15, ini_state = 4, label = 5)

We can see that process 'BD2' is eventually absorbed into state 0, which is consistent with the equilibrium distribution we just calculated using bd_equi(). Now consider another linear BDIE process, 'BD3'. 'BD3' is the same as 'BD1' but with an infinite statespace.

BD3 <- new("stat2003.bd", lb = 2.5, ld = 3, im = 5, em = 1,
            nstate = 0)
simu <- bd_simu(BD3, 15, ini_state = 4, label = 5)

Compared with the simulation graph of process 'BD1', the path of 'BD3' is more variable and the population size is not bounded above.


