#| purl = FALSE, #| include = FALSE # read vignette source chunks from corresponding testthat script knitr::read_chunk(file.path("..", "tests", "testthat", "test-model-TKR.R")) # read vignette build utility functions knitr::read_chunk(file.path("vutils.R"))
#| purl = FALSE, #| include = FALSE knitr::opts_chunk$set( collapse = TRUE, echo = FALSE, comment = "#>" ) pander::panderOptions("table.split.table", 200L)
#| gbp
#| gv2png
#| purl = FALSE #nolint start
library(rdecision)
#| purl = FALSE #nolint end
The article by Dong and Buxton, 2006 [-@dong2006] describes a Markov model for
the assessment of a new Computer-Assisted Surgery (CAS) technology for Total
Knee Replacement (TKR). This vignette follows their model to calculate the
cost-effectiveness of conventional and computer-assisted TKR surgery using
rdecision
.
The authors present both a point estimate and a probabilistic interpretation.
Both approaches are shown to highlight the similarities and differences when
using rdecision
.
The patient journey is represented by 9 possible states, with each patient assigned to exactly one state at any time:
#| state-names, #| echo = TRUE
To assess cost effectiveness, the model must incorporate utilities and costs for each state.
Utilities apply in a continuous manner: the longer the time a patient
spends in a particular state, the longer a particular utility value applies.
This is represented in rdecision
by assigning the utility to a particular
MarkovState
. This ensures that they are correctly scaled to different cycle
lengths, and that the annual discount rate is applied correctly.
#| utils-point, #| echo = TRUE
Costs, in this case, represent one-time expenses associated with the
delivery of a surgical procedure, such as a TKR operation or a revision. As
such, they do not depend on the time the patient spends in a particular state,
and are represented by a cost assigned to a Transition
. Ongoing costs that
apply as long as the patient remains in a state, for example medication, would
instead be assigned to the relevant MarkovState
. The costs associated with
revisions/further treatments (states E, F and G) can be assigned to
the transitions from any other state into these states. For the cost of the
primary TKR operation (state A), the Markov model is designed with no
incoming transitions to the state, so the cost of this operation can be
included into the model by linking it to the transitions from A to other
states. When applying costs to transitions, it's important to consider the
possible transitions and ensure that no "double-counting" occurs (for example
by transitions that return to the same state).
#| costs-point, #| echo = TRUE
The initial deterministic analysis uses numerical best estimates for each parameter (cost, utility, transition probability etc.). This results in a model with an exact solution, but no estimate of the uncertainty, or of the impact of individual parameters on the uncertainty of the outcome.
The Markov state transition model specifies which transitions are permitted
between states. Each of the 9 states is represented in rdecision
by a
MarkovState
object, with a description, a state cost and a state utility.
Each allowed transition is represented by a Transition
object, defined by
the source state, the target state, and a transition cost. In all cases, cost
defaults to 0.0 if not set, and utility to 1.0.
In their model, Dong and Buxton incorporate the cost of the CAS technology as
an additional cost attached to interventions - in this case, this would affect
cost_A
, cost_E
and cost_F
(but not cost_G
which represents non-surgical
procedures). Utilities are unchanged and therefore the same Markov state
definitions can be used by both models.
#| SMM-point, #| echo = TRUE
To fully define a semi Markov model, we assign the States, Transitions, duration
of a cycle (as a difftime
object) and, optionally, annual discount.cost
and discount.utility
.
#| SMM-def-point, #| echo = TRUE
#| purl = FALSE, #| echo = TRUE pander::pander( SMM_base$tabulate_states()[, c("Name", "Utility")], justify = "ll" )
The SemiMarkovModel
object can be represented graphically in the DOT format
using the as_DOT()
method. The DiagrammeR
package can then be used to plot
DOT files with the grViz
function.
#| fig.cap = "Markov state transition model for total knee replacement (TKR)", #| echo = TRUE, #| purl = FALSE, #| eval = FALSE DiagrammeR::grViz(SMM_base$as_DOT(rankdir = "TB", width = 7.0, height = 7.0))
# images created with dot are more compact than DiagrammeR. pngfile <- gv2png( dot = SMM_base$as_DOT(rankdir = "TB", width = 7.0, height = 7.0) ) knitr::include_graphics(pngfile)
For each allowed transition between a pair of states, probabilities are
reported in Table 2. These are added to the model using the
set_probabilities
method, which takes for input a numerical matrix of size
N(states) x N(states).
For death outcomes, the probabilities are reported for primary TKR, TKR revision and all reasons: for each state, the relevant probability of death (transition to state I) will therefore be death(all reasons) + death(primary TKR or TKR revision). Note that Table 5 reports the death rates for TKR-specific death (primary or revision), which can be recovered by introducing separate Markov states in the model to represent the 3 possible causes of death, each with relevant transitions.
The total sum of all outgoing transitions for each state must be 1, so one
transition per state should be calculated as 1 - sum(all other transitions).
This is also verified by the set_probabilities
method that will return an
error if the probability matrix does not have row-wise sums of 1. Alternatively,
exactly one cell in each row can be left undefined as NA
and it will
automatically be populated to ensure a row-wise total of 1.
#| transitions-point, #| echo = TRUE
The CAS technology affects patient outcomes by reducing the probability of
serious complications by about 34%. This applies to all transition
probabilities into state B, which correspond to column 2 in the transition
matrix. Note that this also requires adjusting the row-wise sum to 1, by
increasing the probability of making a transition to another state. It is
assumed the states with increased transition probabilities are those defined as
NA
in matrix Pt
. Function txeffect
adjusts the transition probability
matrix, given a relative risk.
#| tx-effect, #| echo = TRUE
The effect is applied to the transition matrix for conventional surgery to get the transition matrix for computer-assisted surgery, as follows.
#| CAS-transitions-point, #| echo = TRUE
The simulation described in the paper starts with 1000 patients and progresses
for 10 years, or 120 one-month cycles. At the start of the model, all patients
are in state A. The initial state of the simulation can be entered into
the model by applying the reset
function with a named array describing the
number of simulated patients for each named state (in this case, 1000 in state
A and none in the other states).
The cycles
function simulates the progress of the initial population through
the specified number of cycles, and, optionally, can apply half-cycle
correction to population and cost (hcc.pop
and hcc.cost
). In this example,
half-cycle correction is not used.
#| cycle-point, #| echo = TRUE
Running the cycles
function returns a data frame with the population in each
state, the progress of the simulation (in cycles and years), the total cost
(including both transition and occupancy costs), and the calculated QALY. Costs
and QALYs are discounted by the rates specified in the model definition and are
normalised by the initial population.
The table below displays the first few cycles of the model, showing the transition of patients from state A at cycle 0 (starting state) through the 3 allowed transitions to states B, C and D, and subsequently to further states depending on their assigned transition probabilities. Each cycle accrues a cost (the cost of TKR surgery is attached here to cycle 1, as it was assigned to the transitions out of A) and this, with the utilities associated with each state, can be used to calculate the QALY.
#| purl = FALSE pander::pander(SMM_base_10years[0L:5L, ])
This progression can also be visualised using the base graphics function
barplot
:
#| fig.cap = "State occupancy progression for conventional surgery.", #| purl = FALSE plot_occupancy <- function(SMM) { withr::with_par( new = list(mar = c(5L, 4L, 1L, 14L) + 0.1), code = { states <- colnames(SMM)[ !colnames(SMM) %in% c("Cost", "QALY", "Cycle", "Years") ] pal <- sample(hcl.colors(length(states), palette = "Spectral")) barplot( t(as.matrix(SMM[, states])), xlab = "Cycle", ylab = "States", names.arg = SMM[, "Cycle"], col = pal, border = NA, space = 0L ) legend( "topright", legend = states, fill = pal, bty = "n", inset = c(-0.75, 0.0), xpd = TRUE ) } ) } plot_occupancy(SMM_base_10years)
Dong and Buxton [-@dong2006] report results yearly (corresponding to cycles 12,
24, 36 etc.) with some cumulative values. The cycles
function of
SemiMarkovModel
returns a Markov trace (state occupancies at the end of each
cycle, with per-cycle costs and QALYs) in the form of a data frame. The Markov
trace can be converted to their Table 4 form via the function as_table4
.
#| as-table4, #| echo = TRUE
Replication of their Table 4 is given in the next two sections. Note that these figures do not match exactly the published figures for costs and QALYs. This might be a different application of the discount rate, or rounding errors associated with the representation of currency in Excel. Small differences in the CAS calculations can also probably be traced to rounding of the 34% CAS effect figure.
#| cs-table-4, #| echo = TRUE
#| purl = FALSE pander::pander( t4_CS, round = 2L, row.names = FALSE, big.mark = ",", justify = "lrrrrrrr", keep.trailing.zeros = TRUE )
#| cas-table-4, #| echo = TRUE
#| purl = FALSE pander::pander( t4_CAS, round = 2L, row.names = FALSE, big.mark = ",", justify = "lrrrrrrr", keep.trailing.zeros = TRUE )
PSA adds a randomisation-based process to estimate not only the central value of the estimate, but also its uncertainty. This is done by representing individual parameters not as single numerical values but as distributions, from which a range of possible values can be extracted at random. The choice of distribution type and parametrisation depend on the type of variable and the values reported in the literature or observed in a relevant patient cohort.
These are constrained to a value between 0 and 1, and have been described with a Beta distribution:
A Beta function, with the mean equal to the point estimate and a high variance to reflect the uncertainty, was used to generate a random utility for each Markov state
The specific parameters for the distributions are not reported in the paper, where instead a point estimate and a range (95% CI) are reported for each utility distribution.
The Beta distribution can be parametrised using mean $\mu$ and sample size $\nu$ (which relate to the more common parameters as $\alpha = \mu * \nu$ and $\beta = (1 - \mu) * \nu)$, using the point estimate as mean and assessing numerically the sample size until the confidence intervals match the reported range. Not all the ranges provided map to a valid Beta distribution with the given mean, suggesting that the authors used a different method to derive their Beta distributions. Note also that the authors report that
To ensure a plausible relationship between the utilities of states “Normal health after primary TKR,” “TKR with minor complications,” and “TKR with serious complications,” the process was structured so that the hierarchical relationship was retained but the differences between the utilities were randomly drawn from the distribution.
but there is no description of the method used and/or how the resulting values are constrained to the [0, 1] interval.
#| utilities-var, #| echo = TRUE
These parameters can now be used to describe the distributions that each
utility value should be drawn from. This is represented in rdecision
by a
ModVar
variable, in this case specifically the BetaModVar
child class. This
type of ModVar
requires the alpha
and beta
parameters, which can be
calculated according to the equations above. Note that for state I (Death),
there is no need to define a ModVar
, as the utility of this state is not
associated with any uncertainty.
#| utilities-beta, #| echo = TRUE
These can take any non-negative value, and have been described with a Gamma distribution:
Variance for the Gamma distribution was estimated from the ranges. A Gamma function was used to generate a random cost for each Markov state.
These distributions are also reported as means and 95% CI. In this case, the means of the Gamma distributions are sufficiently far from 0 that the approximation to a Gaussian distribution is acceptable to estimate the value of the standard deviation.
The GammaModVar
class requires the parameters shape
and scale
, which
relate to mean $\mu$ and variance $\sigma^2$ as $shape = \mu^2 / \sigma^2$
and $scale = \sigma^2 / \mu$.
#| costs-gamma, #| echo = TRUE
The cost of CAS is also modelled with a Gamma distribution, but in this case the authors estimate the distribution parameters directly:
We assumed that the ratio of its standard deviation over mean was four times as large as that of conventional primary TKR
As a result, the total cost of a surgical procedure is the sum of the surgery
cost and the CAS cost, each of which is independently drawn from a defined
gamma distribution. This can be easily represented in rdecision
using an
ExprModVar
: a model variable that uses other model variables as operands.
The expression needs to be quoted using the rlang::quo
command in order to
be stored rather than executed straight away.
#| cas-cost-gamma, #| echo = TRUE
The CAS effect in this model is a reduction in the transition probabilities to state B, and was modelled using a lognormal distribution:
Lognormal function was used to generate a random “effect of CAS.” The variance of the “effect of CAS” was estimated from the clinical trials
Because a lognormally distributed value is always positive, the resulting transition probabilities will be reduced by a non-zero fraction (note however that values > 100% are possible, which would result an invalid negative transition probability).
While it is possible to reconstruct the exact parameters of a lognormal distribution given a mean and 95% CI, the article does not report a range for this variable. Because exact replication is therefore impossible, an arbitrary parametrisation is used here that is unlikely to yield an illegal value.
#| cas-lognorm, #| echo = TRUE
The same structure as the model used for the deterministic solution can be used again, but in this case the costs and utilities will be represented by distributions.
#| SMM-PSA, #| echo = TRUE
The current model includes a number of model variables, that can be displayed
using the modvar_table
method. This tabulation can also be used to compare
the 95% CI with the values reported in Table 3.
#| purl = FALSE, #| echo = TRUE pander::pander( SMM_base_PSA$modvar_table()[, c(-2L, -5L, -9L)], justify = "llrrrr", round = 5L )
Note that the list of ModVar
s associated with the CAS model includes not only
the variables that were explicitly assigned to States and Transitions, such as
Utility of state A, but also the implicit ModVar
s underlying ExprModVar
s,
such as Cost of CAS, which is used to calculate Cost of state A with CAS.
#| purl = FALSE, #| echo = TRUE pander::pander( SMM_CAS_PSA$modvar_table()[, c(-2L, -5L, -9L)], justify = "llrrrr", round = 5L )
To generate a logical multi-Markov state probabilistic transition matrix from the initial point estimates, the Dirichlet distribution was used (11). We estimated a count for each Markov state by the transition probabilities (we assumed that the total counts equalled 1,000). We used random number and Gamma distribution formulae to generate a one-parameter (standard) Gamma distribution for each cell of the transition matrix. The one-parameter Gamma distribution in Excel was obtained by setting the first (alpha) parameter equal to the estimated count and the second (beta) parameter equal to 1. The final step was to “normalize” the realizations from the Gamma distribution back to the 0–1 scale by dividing each realization through by the corresponding row total.
The complex description above is an approximation of a Dirichlet distribution
when constrained by the functionalities of Excel. In rdecision
, variables
drawn from a multinomial probabilistic distribution can be defined using
in-built functions:
DirichletDistribution
from the known counts or
proportions for each exiting transition. If using proportions, a population
should also specified to correctly assign variance to the probabilistic
distribution. In this example, the authors assumed a population of 1000.sample()
method) and populate the allowed
transitions with numerical values (r()
method). This is performed by
the dirichletify
function.#| fun-dirichlet, #| echo = TRUE
The following code executes 250 runs of the model for conventional surgery and for computer-assisted surgery, each of 120 cycles. For each run the set of model variables are sampled from their uncertainty distributions. The Markov traces for each run, in matrix form, are converted to the format used by Dong et al in their Table 4, and added to 3D arrays which store the traces from all the runs, separately for conventional and computer-assisted surgery.
The transition probability matrix is sampled from an assumed matrix of counts
using the dirichletify
function.
In the case of CAS, there is an additional step in the simulation process due
to the fact that the transition probabilities into state B (column 2 in the
transition matrix) are modified by a multiplier, Effect of CAS, which is
itself drawn from a distribution. In this case, the relevant ModVar
needs to be
randomised at every cycle, and applied to the transition matrix.
This a "paired" simulation; that is, we are interested in the difference between computer-assisted surgery and conventional surgery, taking account of the variation in inputs. In this model, many variables are common to both models, and we make sure to randomise all variables once, before running each model, every cycle. This is analogous, in statistics, to a paired t-test rather than a two-sample t-test. When there are many common variables, the results of the two models will be correlated.
#| cycle-PSA, #| echo = TRUE
The means of the cohort simulation results should be very similar, but not necessarily identical, to those in Table 4 (and replicated analytically above). Each run of this script is expected to yield a slightly different result. The addition of probabilistic elements can be used to describe not only the means (which should approximate the deterministic cohort model results), but also the distributions of the estimates, giving the uncertainties associated with the two models.
In the following two sections, the mean of all runs are presented as per Table 4 of Dong & Buxton [-@dong2006] and the uncertainties are presented as per their Table 5, with the addition of the upper and lower confidence limits of the mean.
#| purl = FALSE, #| echo = TRUE t4_CS_PSA_mean <- apply(t4_CS_PSA, MARGIN = c(1L, 2L), FUN = mean) pander::pander( t4_CS_PSA_mean, round = 2L, row.names = FALSE, big.mark = ",", justify = "lrrrrrrr", keep.trailing.zeros = TRUE )
#| t5-CS-PSA, #| echo = TRUE
#| purl = FALSE pander::pander(t5_CS, round = 2L, keep.trailing.zeros = TRUE, justify = "lrrrr")
#| purl = FALSE, #| echo = TRUE t4_CAS_PSA_mean <- apply(t4_CAS_PSA, MARGIN = c(1L, 2L), FUN = mean) pander::pander( t4_CAS_PSA_mean, round = 2L, row.names = FALSE, big.mark = ",", justify = "lrrrrrrr", keep.trailing.zeros = TRUE )
#| t5-CAS-PSA, #| echo = TRUE
#| purl = FALSE, #| echo = TRUE pander::pander( t5_CAS, round = 2L, justify = "lrrrr", keep.trailing.zeros = TRUE )
Note that the means correlate well with those reported in Table 5, but the assertion that "[CAS] had lower variances for each indicator" is not reliably replicated for every variable. This is likely to be due to the fact that the parameters for the distribution of the Effect of CAS could not be extracted, and the values chosen here may assign excessive uncertainty to this variable.
#| cea-psa, #| echo = TRUE
From PSA, after 10 years the mean (95% CI) cost change per patient was
r round(mean(dcost_psa), digits = 2L)
(r round(quantile(dcost_psa, probs = 0.025), digits = 2L)
to
r round(quantile(dcost_psa, probs = 0.975), digits = 2L)
) £,
and the mean change in utility was
r round(mean(dutil_psa), digits = 2L)
(r round(quantile(dutil_psa, probs = 0.025), digits = 2L)
to
r round(quantile(dutil_psa, probs = 0.975), digits = 2L)
) QALY, compared
with the deterministic values of -£583 and +0.0148 respectively, as reported in
the paper. The incremental cost effectiveness ratio was less than
£30,000 per QALY for
r round(100 * sum(icer_psa < 30000) / length(icer_psa), 2L)
% of runs. The
distribution of results on the cost effectiveness plane is shown in the
figure below (for comparison with figure 2 of the paper).
#| fig.cap = "Cost-effectiveness plane. QALY = Quality-adjusted life year", #| purl = FALSE withr::with_par( new = list(mar = c(1L, 2L, 2L, 1L)), code = { plot( dcost_psa ~ dutil_psa, pch = 20L, xlim = c(-0.1, 0.15), ylim = c(-5000.0, 1000.0), axes = FALSE, xlab = "", ylab = "" ) axis(side = 1L, pos = 0.0, las = 1L) axis(side = 2L, pos = 0.0, las = 1L) mtext( text = expression(paste(Delta, "QALYs")), side = 2L, adj = 5L / 6L ) mtext( text = expression(paste(Delta, "Cost (£)")), side = 3L, adj = 2L / 5L ) } )
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.