Markov model with probabilistic sensitivity analysis

#| purl = FALSE,
#| include = FALSE
# read vignette source chunks from corresponding testthat script
knitr::read_chunk(file.path("..", "tests", "testthat", "test-model-TKR.R"))
# read vignette build utility functions
knitr::read_chunk(file.path("vutils.R"))
#| purl = FALSE,
#| include = FALSE
knitr::opts_chunk$set(
  collapse = TRUE,
  echo = FALSE,
  comment = "#>"
)
pander::panderOptions("table.split.table", 200L)
#| gbp
#| gv2png
#| purl = FALSE
#nolint start
library(rdecision)
#| purl = FALSE
#nolint end

The article by Dong and Buxton, 2006 [-@dong2006] describes a Markov model for the assessment of a new Computer-Assisted Surgery (CAS) technology for Total Knee Replacement (TKR). This vignette follows their model to calculate the cost-effectiveness of conventional and computer-assisted TKR surgery using rdecision.

The authors present both a point estimate and a probabilistic interpretation. Both approaches are shown to highlight the similarities and differences when using rdecision.

Model description

States

The patient journey is represented by 9 possible states, with each patient assigned to exactly one state at any time:

#| state-names,
#| echo = TRUE

Variables

To assess cost effectiveness, the model must incorporate utilities and costs for each state.

Utilities apply in a continuous manner: the longer the time a patient spends in a particular state, the longer a particular utility value applies. This is represented in rdecision by assigning the utility to a particular MarkovState. This ensures that they are correctly scaled to different cycle lengths, and that the annual discount rate is applied correctly.

#| utils-point,
#| echo = TRUE

Costs, in this case, represent one-time expenses associated with the delivery of a surgical procedure, such as a TKR operation or a revision. As such, they do not depend on the time the patient spends in a particular state, and are represented by a cost assigned to a Transition. Ongoing costs that apply as long as the patient remains in a state, for example medication, would instead be assigned to the relevant MarkovState. The costs associated with revisions/further treatments (states E, F and G) can be assigned to the transitions from any other state into these states. For the cost of the primary TKR operation (state A), the Markov model is designed with no incoming transitions to the state, so the cost of this operation can be included into the model by linking it to the transitions from A to other states. When applying costs to transitions, it's important to consider the possible transitions and ensure that no "double-counting" occurs (for example by transitions that return to the same state).

#| costs-point,
#| echo = TRUE

Point estimate

The initial deterministic analysis uses numerical best estimates for each parameter (cost, utility, transition probability etc.). This results in a model with an exact solution, but no estimate of the uncertainty, or of the impact of individual parameters on the uncertainty of the outcome.

Markov model

The Markov state transition model specifies which transitions are permitted between states. Each of the 9 states is represented in rdecision by a MarkovState object, with a description, a state cost and a state utility. Each allowed transition is represented by a Transition object, defined by the source state, the target state, and a transition cost. In all cases, cost defaults to 0.0 if not set, and utility to 1.0.

In their model, Dong and Buxton incorporate the cost of the CAS technology as an additional cost attached to interventions - in this case, this would affect cost_A, cost_E and cost_F (but not cost_G which represents non-surgical procedures). Utilities are unchanged and therefore the same Markov state definitions can be used by both models.

#| SMM-point,
#| echo = TRUE

To fully define a semi Markov model, we assign the States, Transitions, duration of a cycle (as a difftime object) and, optionally, annual discount.cost and discount.utility.

#| SMM-def-point,
#| echo = TRUE
#| purl = FALSE,
#| echo = TRUE
pander::pander(
  SMM_base$tabulate_states()[, c("Name", "Utility")], justify = "ll"
)

The SemiMarkovModel object can be represented graphically in the DOT format using the as_DOT() method. The DiagrammeR package can then be used to plot DOT files with the grViz function.

#| fig.cap = "Markov state transition model for total knee replacement (TKR)",
#| echo = TRUE,
#| purl = FALSE,
#| eval = FALSE
DiagrammeR::grViz(SMM_base$as_DOT(rankdir = "TB", width = 7.0, height = 7.0))
# images created with dot are more compact than DiagrammeR.
pngfile <- gv2png(
  dot = SMM_base$as_DOT(rankdir = "TB", width = 7.0, height = 7.0)
)
knitr::include_graphics(pngfile)

Transition probabilities

For each allowed transition between a pair of states, probabilities are reported in Table 2. These are added to the model using the set_probabilities method, which takes for input a numerical matrix of size N(states) x N(states).

For death outcomes, the probabilities are reported for primary TKR, TKR revision and all reasons: for each state, the relevant probability of death (transition to state I) will therefore be death(all reasons) + death(primary TKR or TKR revision). Note that Table 5 reports the death rates for TKR-specific death (primary or revision), which can be recovered by introducing separate Markov states in the model to represent the 3 possible causes of death, each with relevant transitions.

The total sum of all outgoing transitions for each state must be 1, so one transition per state should be calculated as 1 - sum(all other transitions). This is also verified by the set_probabilities method that will return an error if the probability matrix does not have row-wise sums of 1. Alternatively, exactly one cell in each row can be left undefined as NA and it will automatically be populated to ensure a row-wise total of 1.

#| transitions-point,
#| echo = TRUE

The CAS technology affects patient outcomes by reducing the probability of serious complications by about 34%. This applies to all transition probabilities into state B, which correspond to column 2 in the transition matrix. Note that this also requires adjusting the row-wise sum to 1, by increasing the probability of making a transition to another state. It is assumed the states with increased transition probabilities are those defined as NA in matrix Pt. Function txeffect adjusts the transition probability matrix, given a relative risk.

#| tx-effect,
#| echo = TRUE

The effect is applied to the transition matrix for conventional surgery to get the transition matrix for computer-assisted surgery, as follows.

#| CAS-transitions-point,
#| echo = TRUE

Running the model

The simulation described in the paper starts with 1000 patients and progresses for 10 years, or 120 one-month cycles. At the start of the model, all patients are in state A. The initial state of the simulation can be entered into the model by applying the reset function with a named array describing the number of simulated patients for each named state (in this case, 1000 in state A and none in the other states).

The cycles function simulates the progress of the initial population through the specified number of cycles, and, optionally, can apply half-cycle correction to population and cost (hcc.pop and hcc.cost). In this example, half-cycle correction is not used.

#| cycle-point,
#| echo = TRUE

Running the cycles function returns a data frame with the population in each state, the progress of the simulation (in cycles and years), the total cost (including both transition and occupancy costs), and the calculated QALY. Costs and QALYs are discounted by the rates specified in the model definition and are normalised by the initial population.

The table below displays the first few cycles of the model, showing the transition of patients from state A at cycle 0 (starting state) through the 3 allowed transitions to states B, C and D, and subsequently to further states depending on their assigned transition probabilities. Each cycle accrues a cost (the cost of TKR surgery is attached here to cycle 1, as it was assigned to the transitions out of A) and this, with the utilities associated with each state, can be used to calculate the QALY.

#| purl = FALSE
pander::pander(SMM_base_10years[0L:5L, ])

This progression can also be visualised using the base graphics function barplot:

#| fig.cap = "State occupancy progression for conventional surgery.",
#| purl = FALSE
plot_occupancy <- function(SMM) {
  withr::with_par(
    new = list(mar = c(5L, 4L, 1L, 14L) + 0.1),
    code = {
      states <- colnames(SMM)[
        !colnames(SMM) %in% c("Cost", "QALY", "Cycle", "Years")
      ]
      pal <- sample(hcl.colors(length(states), palette = "Spectral"))
      barplot(
        t(as.matrix(SMM[, states])),
        xlab = "Cycle", ylab = "States", names.arg = SMM[, "Cycle"],
        col = pal, border = NA, space = 0L
      )
      legend(
        "topright", legend = states, fill = pal, bty = "n",
        inset = c(-0.75, 0.0), xpd = TRUE
      )
    }
  )
}
plot_occupancy(SMM_base_10years)

Dong and Buxton [-@dong2006] report results yearly (corresponding to cycles 12, 24, 36 etc.) with some cumulative values. The cycles function of SemiMarkovModel returns a Markov trace (state occupancies at the end of each cycle, with per-cycle costs and QALYs) in the form of a data frame. The Markov trace can be converted to their Table 4 form via the function as_table4.

#| as-table4,
#| echo = TRUE

Replication of their Table 4 is given in the next two sections. Note that these figures do not match exactly the published figures for costs and QALYs. This might be a different application of the discount rate, or rounding errors associated with the representation of currency in Excel. Small differences in the CAS calculations can also probably be traced to rounding of the 34% CAS effect figure.

Conventional surgery

#| cs-table-4,
#| echo = TRUE
#| purl = FALSE
pander::pander(
  t4_CS, round = 2L, row.names = FALSE, big.mark = ",",
  justify = "lrrrrrrr", keep.trailing.zeros = TRUE
)

Computer-assisted surgery

#| cas-table-4,
#| echo = TRUE
#| purl = FALSE
pander::pander(
  t4_CAS, round = 2L, row.names = FALSE, big.mark = ",",
  justify = "lrrrrrrr", keep.trailing.zeros = TRUE
)

Probabilistic Sensitivity Analysis

PSA adds a randomisation-based process to estimate not only the central value of the estimate, but also its uncertainty. This is done by representing individual parameters not as single numerical values but as distributions, from which a range of possible values can be extracted at random. The choice of distribution type and parametrisation depend on the type of variable and the values reported in the literature or observed in a relevant patient cohort.

Parameter distributions

Utilities

These are constrained to a value between 0 and 1, and have been described with a Beta distribution:

A Beta function, with the mean equal to the point estimate and a high variance to reflect the uncertainty, was used to generate a random utility for each Markov state

The specific parameters for the distributions are not reported in the paper, where instead a point estimate and a range (95% CI) are reported for each utility distribution.

The Beta distribution can be parametrised using mean $\mu$ and sample size $\nu$ (which relate to the more common parameters as $\alpha = \mu * \nu$ and $\beta = (1 - \mu) * \nu)$, using the point estimate as mean and assessing numerically the sample size until the confidence intervals match the reported range. Not all the ranges provided map to a valid Beta distribution with the given mean, suggesting that the authors used a different method to derive their Beta distributions. Note also that the authors report that

To ensure a plausible relationship between the utilities of states “Normal health after primary TKR,” “TKR with minor complications,” and “TKR with serious complications,” the process was structured so that the hierarchical relationship was retained but the differences between the utilities were randomly drawn from the distribution.

but there is no description of the method used and/or how the resulting values are constrained to the [0, 1] interval.

#| utilities-var,
#| echo = TRUE

These parameters can now be used to describe the distributions that each utility value should be drawn from. This is represented in rdecision by a ModVar variable, in this case specifically the BetaModVar child class. This type of ModVar requires the alpha and beta parameters, which can be calculated according to the equations above. Note that for state I (Death), there is no need to define a ModVar, as the utility of this state is not associated with any uncertainty.

#| utilities-beta,
#| echo = TRUE

Costs

These can take any non-negative value, and have been described with a Gamma distribution:

Variance for the Gamma distribution was estimated from the ranges. A Gamma function was used to generate a random cost for each Markov state.

These distributions are also reported as means and 95% CI. In this case, the means of the Gamma distributions are sufficiently far from 0 that the approximation to a Gaussian distribution is acceptable to estimate the value of the standard deviation.

The GammaModVar class requires the parameters shape and scale, which relate to mean $\mu$ and variance $\sigma^2$ as $shape = \mu^2 / \sigma^2$ and $scale = \sigma^2 / \mu$.

#| costs-gamma,
#| echo = TRUE

The cost of CAS is also modelled with a Gamma distribution, but in this case the authors estimate the distribution parameters directly:

We assumed that the ratio of its standard deviation over mean was four times as large as that of conventional primary TKR

As a result, the total cost of a surgical procedure is the sum of the surgery cost and the CAS cost, each of which is independently drawn from a defined gamma distribution. This can be easily represented in rdecision using an ExprModVar: a model variable that uses other model variables as operands. The expression needs to be quoted using the rlang::quo command in order to be stored rather than executed straight away.

#| cas-cost-gamma,
#| echo = TRUE

The CAS effect in this model is a reduction in the transition probabilities to state B, and was modelled using a lognormal distribution:

Lognormal function was used to generate a random “effect of CAS.” The variance of the “effect of CAS” was estimated from the clinical trials

Because a lognormally distributed value is always positive, the resulting transition probabilities will be reduced by a non-zero fraction (note however that values > 100% are possible, which would result an invalid negative transition probability).

While it is possible to reconstruct the exact parameters of a lognormal distribution given a mean and 95% CI, the article does not report a range for this variable. Because exact replication is therefore impossible, an arbitrary parametrisation is used here that is unlikely to yield an illegal value.

#| cas-lognorm,
#| echo = TRUE

Markov model

The same structure as the model used for the deterministic solution can be used again, but in this case the costs and utilities will be represented by distributions.

#| SMM-PSA,
#| echo = TRUE

The current model includes a number of model variables, that can be displayed using the modvar_table method. This tabulation can also be used to compare the 95% CI with the values reported in Table 3.

#| purl = FALSE,
#| echo = TRUE
pander::pander(
  SMM_base_PSA$modvar_table()[, c(-2L, -5L, -9L)], justify = "llrrrr",
  round = 5L
)

Note that the list of ModVars associated with the CAS model includes not only the variables that were explicitly assigned to States and Transitions, such as Utility of state A, but also the implicit ModVars underlying ExprModVars, such as Cost of CAS, which is used to calculate Cost of state A with CAS.

#| purl = FALSE,
#| echo = TRUE
pander::pander(
  SMM_CAS_PSA$modvar_table()[, c(-2L, -5L, -9L)], justify = "llrrrr", round = 5L
)

Transition probabilities

To generate a logical multi-Markov state probabilistic transition matrix from the initial point estimates, the Dirichlet distribution was used (11). We estimated a count for each Markov state by the transition probabilities (we assumed that the total counts equalled 1,000). We used random number and Gamma distribution formulae to generate a one-parameter (standard) Gamma distribution for each cell of the transition matrix. The one-parameter Gamma distribution in Excel was obtained by setting the first (alpha) parameter equal to the estimated count and the second (beta) parameter equal to 1. The final step was to “normalize” the realizations from the Gamma distribution back to the 0–1 scale by dividing each realization through by the corresponding row total.

The complex description above is an approximation of a Dirichlet distribution when constrained by the functionalities of Excel. In rdecision, variables drawn from a multinomial probabilistic distribution can be defined using in-built functions:

#| fun-dirichlet,
#| echo = TRUE

PSA runs

The following code executes 250 runs of the model for conventional surgery and for computer-assisted surgery, each of 120 cycles. For each run the set of model variables are sampled from their uncertainty distributions. The Markov traces for each run, in matrix form, are converted to the format used by Dong et al in their Table 4, and added to 3D arrays which store the traces from all the runs, separately for conventional and computer-assisted surgery.

The transition probability matrix is sampled from an assumed matrix of counts using the dirichletify function.

In the case of CAS, there is an additional step in the simulation process due to the fact that the transition probabilities into state B (column 2 in the transition matrix) are modified by a multiplier, Effect of CAS, which is itself drawn from a distribution. In this case, the relevant ModVar needs to be randomised at every cycle, and applied to the transition matrix.

This a "paired" simulation; that is, we are interested in the difference between computer-assisted surgery and conventional surgery, taking account of the variation in inputs. In this model, many variables are common to both models, and we make sure to randomise all variables once, before running each model, every cycle. This is analogous, in statistics, to a paired t-test rather than a two-sample t-test. When there are many common variables, the results of the two models will be correlated.

#| cycle-PSA,
#| echo = TRUE

Results

The means of the cohort simulation results should be very similar, but not necessarily identical, to those in Table 4 (and replicated analytically above). Each run of this script is expected to yield a slightly different result. The addition of probabilistic elements can be used to describe not only the means (which should approximate the deterministic cohort model results), but also the distributions of the estimates, giving the uncertainties associated with the two models.

In the following two sections, the mean of all runs are presented as per Table 4 of Dong & Buxton [-@dong2006] and the uncertainties are presented as per their Table 5, with the addition of the upper and lower confidence limits of the mean.

Conventional surgery

#| purl = FALSE,
#| echo = TRUE
t4_CS_PSA_mean <- apply(t4_CS_PSA, MARGIN = c(1L, 2L), FUN = mean)
pander::pander(
  t4_CS_PSA_mean, round = 2L, row.names = FALSE, big.mark = ",",
  justify = "lrrrrrrr", keep.trailing.zeros = TRUE
)
#| t5-CS-PSA,
#| echo = TRUE
#| purl = FALSE
pander::pander(t5_CS, round = 2L, keep.trailing.zeros = TRUE, justify = "lrrrr")

Computer-assisted surgery

#| purl = FALSE,
#| echo = TRUE
t4_CAS_PSA_mean <- apply(t4_CAS_PSA, MARGIN = c(1L, 2L), FUN = mean)
pander::pander(
  t4_CAS_PSA_mean, round = 2L, row.names = FALSE, big.mark = ",",
  justify = "lrrrrrrr", keep.trailing.zeros = TRUE
)
#| t5-CAS-PSA,
#| echo = TRUE
#| purl = FALSE,
#| echo = TRUE
pander::pander(
  t5_CAS, round = 2L, justify = "lrrrr", keep.trailing.zeros = TRUE
)

Note that the means correlate well with those reported in Table 5, but the assertion that "[CAS] had lower variances for each indicator" is not reliably replicated for every variable. This is likely to be due to the fact that the parameters for the distribution of the Effect of CAS could not be extracted, and the values chosen here may assign excessive uncertainty to this variable.

Cost effectiveness analysis

#| cea-psa,
#| echo = TRUE

From PSA, after 10 years the mean (95% CI) cost change per patient was r round(mean(dcost_psa), digits = 2L) (r round(quantile(dcost_psa, probs = 0.025), digits = 2L) to r round(quantile(dcost_psa, probs = 0.975), digits = 2L)) £, and the mean change in utility was r round(mean(dutil_psa), digits = 2L) (r round(quantile(dutil_psa, probs = 0.025), digits = 2L) to r round(quantile(dutil_psa, probs = 0.975), digits = 2L)) QALY, compared with the deterministic values of -£583 and +0.0148 respectively, as reported in the paper. The incremental cost effectiveness ratio was less than £30,000 per QALY for r round(100 * sum(icer_psa < 30000) / length(icer_psa), 2L)% of runs. The distribution of results on the cost effectiveness plane is shown in the figure below (for comparison with figure 2 of the paper).

#| fig.cap = "Cost-effectiveness plane. QALY = Quality-adjusted life year",
#| purl = FALSE
withr::with_par(
  new = list(mar = c(1L, 2L, 2L, 1L)),
  code = {
    plot(
      dcost_psa ~ dutil_psa,
      pch = 20L,
      xlim = c(-0.1, 0.15), ylim = c(-5000.0, 1000.0),
      axes = FALSE,
      xlab = "", ylab = ""
    )
    axis(side = 1L, pos = 0.0, las = 1L)
    axis(side = 2L, pos = 0.0, las = 1L)
    mtext(
      text = expression(paste(Delta, "QALYs")),
      side = 2L, adj = 5L / 6L
    )
    mtext(
      text = expression(paste(Delta, "Cost (£)")),
      side = 3L, adj = 2L / 5L
    )
  }
)


Try the rdecision package in your browser

Any scripts or data that you put into this service are public.

rdecision documentation built on June 22, 2024, 10:02 a.m.