This provides a rough indication of the goodness of fit of a multistate model, by estimating the observed numbers of individuals occupying each state at a series of times, and comparing these with forecasts from the fitted model.
x 
A fitted multistate model produced by msm.
times 
Series of times at which to compute the observed and expected prevalences of states. 
timezero 
Initial time of the Markov process. Expected values are forecasted from here. Defaults to the minimum of the observation times given in the data. 
initstates 
Optional vector of the same length as the number of states. Gives the numbers of individuals occupying each state at the initial time, to be used for forecasting expected prevalences. The default is those observed in the data. These should add up to the actual number of people in the study at the start. 
covariates 
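Covariate values for which to forecast expected state occupancy. With the default, covariates="population", expected prevalences are produced by summing model predictions over the covariates observed in the original data, for a fair comparison with the observed prevalences. Predictions for fixed covariates can be obtained by supplying covariate values in the standard way, as in qmatrix.msm.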
misccovariates 
(Misclassification models only)
Values of covariates on the misclassification probability matrix
for converting expected true to expected misclassified states.
Ignored if 
piecewise.times 
Times at which piecewise-constant intensities
change. See 
piecewise.covariates 
Covariates on which the piecewise-constant
intensities depend. See 
ci 
If If If 
cl 
Width of the symmetric confidence interval, relative to 1 
B 
Number of bootstrap replicates 
cores 
Number of cores to use for bootstrapping using parallel
processing. See 
interp 
Suppose an individual was observed in states S_{r-1} and S_r at two consecutive times t_{r-1} and t_r, and we want to estimate 'observed' prevalences at a time t between t_{r-1} and t_r.
censtime 
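If the time is greater than censtime and the subject has reached an absorbing state, then that subject is removed from the risk set. This can be supplied as a single value, or as a vector with one element per subject (after any subset has been taken). It is ignored if it is less than the subject's maximum observation time.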
subset 
Subset of subjects to calculate observed prevalences for. 
plot 
Generate a plot of observed against expected
prevalences. See 
... 
Further arguments to pass to 
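The situation described under interp can be made concrete with a small sketch. The helper `state_at()` below is hypothetical and not part of msm; the two rule names, "start" and "midpoint", are assumptions about the available options, since the option descriptions are missing from this page. Under "start" the earlier state is carried forward; under "midpoint" the state switches halfway between the two observation times.

```r
# Hypothetical helper (not part of msm): assign a state at time t that
# falls between two consecutive observation times t_prev and t_next.
state_at <- function(t, t_prev, t_next, s_prev, s_next,
                     rule = c("start", "midpoint")) {
  rule <- match.arg(rule)
  stopifnot(t_prev <= t, t <= t_next)
  # An exact observation at t_next needs no interpolation
  if (t == t_next) return(s_next)
  # "start": carry the earlier state forward until the next observation
  if (rule == "start") return(s_prev)
  # "midpoint": earlier state up to the midpoint, later state afterwards
  if (t <= (t_prev + t_next) / 2) s_prev else s_next
}

# Observed in state 1 at time 2 and in state 2 at time 3
a <- state_at(2.6, 2, 3, 1, 2, rule = "start")     # 1
b <- state_at(2.6, 2, 3, 1, 2, rule = "midpoint")  # 2
```

Counting the states assigned this way across all subjects at risk gives the 'observed' number in each state at time t.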
The fitted transition probability matrix is used to forecast expected prevalences from the state occupancy at the initial time. To produce the expected number in state j at time t after the start, the number of individuals under observation at time t (including those who have died, but not those lost to follow-up) is multiplied by the j-th element of the product of the vector of proportions of individuals in each state at the initial time and the transition probability matrix over the time interval t. The proportion of individuals in each state at the "initial" time is estimated, if necessary, in the same way as the observed prevalences.
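The calculation above can be sketched in base R. All numbers here are invented for illustration: `P1` is a hypothetical one-year transition probability matrix for a three-state model with an absorbing third state, `init_counts` are assumed initial state counts, and `n_at_risk` are assumed numbers under observation at years 1, 2 and 3. In a real analysis the transition probabilities would come from the fitted model rather than being typed in.

```r
# Hypothetical one-year transition probability matrix (rows sum to 1)
P1 <- matrix(c(0.8, 0.15, 0.05,
               0,   0.7,  0.3,
               0,   0,    1),
             nrow = 3, byrow = TRUE)

init_counts <- c(90, 10, 0)            # assumed state counts at time zero
p0 <- init_counts / sum(init_counts)   # initial state proportions
n_at_risk <- c(100, 95, 90)            # assumed numbers at risk at t = 1, 2, 3

expected <- matrix(NA_real_, nrow = 3, ncol = 3,
                   dimnames = list(paste0("t=", 1:3), paste("State", 1:3)))
Pt <- diag(3)
for (i in 1:3) {
  Pt <- Pt %*% P1                      # transition probabilities over i years
  expected[i, ] <- n_at_risk[i] * as.vector(p0 %*% Pt)
}
expected
```

Each row of `expected` sums to the number at risk at that time, since the initial proportions and each row of the transition matrix sum to one.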
For misclassification models (fitted using an ematrix), this aims to assess the fit of the full model for the observed states, that is, the combined Markov progression model for the true states and the misclassification model. Thus, expected prevalences of true states are estimated from the assumed proportion occupying each state at the initial time using the fitted transition probability matrix. The vector of expected prevalences of true states is then multiplied by the fitted misclassification probability matrix to obtain the expected prevalences of observed states.
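As a minimal sketch of this last step, with invented numbers: `expected_true` is an assumed vector of expected numbers in each true state at one time, and `E` is a hypothetical misclassification probability matrix whose entry in row r and column s is the probability of observing state s when the true state is r.

```r
# Assumed expected numbers in each true state at a given time
expected_true <- c(72, 20.5, 7.5)

# Hypothetical misclassification matrix: rows are true states,
# columns are observed states, and each row sums to 1
E <- matrix(c(0.95, 0.05, 0,
              0.10, 0.90, 0,
              0,    0,    1),
            nrow = 3, byrow = TRUE)

# Expected numbers in each observed state
expected_observed <- as.vector(expected_true %*% E)
expected_observed
```

The total is unchanged by this step because each row of `E` sums to one; only the allocation between observed states shifts.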
For general hidden Markov models, the observed state is taken to be the predicted underlying state from the Viterbi algorithm (viterbi.msm). The goodness of fit of these states to the underlying Markov model is tested.
For an example of this approach, see Gentleman et al. (1994).
A list of matrices, with components:
Observed 
Table of observed numbers of individuals in each state at each time 
Observed percentages 
Corresponding percentage of the individuals at risk at each time. 
Expected 
Table of corresponding expected numbers. 
Expected percentages 
Corresponding percentage of the individuals at risk at each time. 
Or if ci.boot = TRUE, the component Expected is a list with components estimates and ci. estimates is a matrix of the expected prevalences, and ci is a list of two matrices, containing the confidence limits. The component Expected percentages has a similar format.
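A usage sketch follows, assuming the msm package is installed and that this page documents its prevalence.msm function. The `cav` heart transplant dataset and the initial intensity matrix `twoway4.q` are taken from the msm package's own examples as best recalled, so treat them as assumptions. The call is guarded so that nothing runs when msm is unavailable.

```r
if (requireNamespace("msm", quietly = TRUE)) {
  # Initial values for a four-state model in which state 4 (death) is absorbing
  twoway4.q <- rbind(c(-0.5,   0.25,   0,     0.25),
                     c(0.166, -0.498,  0.166, 0.166),
                     c(0,      0.25,  -0.5,   0.25),
                     c(0,      0,      0,     0))

  # Fit the multistate model to the cav data shipped with msm
  cav.msm <- msm::msm(state ~ years, subject = PTNUM, data = msm::cav,
                      qmatrix = twoway4.q, deathexact = 4)

  # Observed and expected prevalences every two years up to year 20
  prev <- msm::prevalence.msm(cav.msm, times = seq(0, 20, 2))
  names(prev)
}
```

Comparing `prev$Observed` with `prev$Expected` row by row shows at which times, and in which states, the fitted model departs from the data.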
C. H. Jackson chris.jackson@mrc-bsu.cam.ac.uk
Gentleman, R.C., Lawless, J.F., Lindsey, J.C. and Yan, P. Multistate Markov models for analysing incomplete disease history data with illustrations for HIV disease. Statistics in Medicine (1994) 13(3): 805–821.
Titman, A.C. and Sharples, L.D. Model diagnostics for multistate models. Statistical Methods in Medical Research (2010) 19(6): 621–651.
msm, summary.msm