vignettes/vignette_2_accuracy.md

Vignette: Diagnostic / Prognostic Accuracy

Load data

Firstly let’s re-load our example data (the survival::colon)

data <- tibble::as_tibble(survival::colon) %>%
  dplyr::filter(etype==2) %>% # Outcome of interest is death
  dplyr::filter(rx!="Obs") %>%  # rx will be our binary treatment variable
  dplyr::select(-etype,-study, -status) %>% # Remove superfluous variables

  # Convert into numeric and factor variables
  dplyr::mutate_at(vars(obstruct, perfor, adhere, node4), function(x){factor(x, levels=c(0,1), labels = c("No", "Yes"))}) %>%
  dplyr::mutate(rx = factor(rx),
                mort365 = cut(time, breaks = c(-Inf, 365, Inf), labels = c("Yes", "No")),
                mort365 = factor(mort365, levels = c("No", "Yes")),
                sex = factor(sex, levels=c(0,1), labels = c("Female", "Male")),
                differ = factor(differ, levels = c(1,2,3), labels = c("Well", "Moderate", "Poor")),
                extent = factor(extent, levels = c(1,2,3, 4), labels = c("Submucosa", "Muscle", "Serosa", "Contiguous Structures")),
                surg = factor(surg, levels = c(0,1), labels = c("Short", "Long")))

Now lets create our model fit (fit1) and derive our predictions based on our model (predict1).

We are starting with a very simple model trying to predict deatn at 1 year (“mort365”) using just 1 variable (“rx” aka treatment receieved).

fit1 <- finalfit::glmmulti(data, dependent = "mort365", explanatory = c("rx"))

predict1 <- predictr(data = data, fit = fit1)

Comparing accuracy across a predictive model

Receiver Operating Characteristic (ROC) curves

One of the most common ways to visualise the performance of the model is using Receiver Operating Characteristic (ROC) curves. This involves plotting the sensitivity and specificity of the model against one another.

There are 2 steps required to make these using predictr:

1). roc_plot_format()

This function takes the predictions from the predictr() function output, and provides a tibble of the sensitivity and specificity of the model required for plotting.

roc_format1 <- predict1 %>%
  roc_plot_format(event = "event", predict = "predict_prop",confint = T, smooth=T)

##   |                                                              |                                                      |   0%  |                                                              |                                                      |   1%  |                                                              |=                                                     |   1%  |                                                              |=                                                     |   2%  |                                                              |=                                                     |   3%  |                                                              |==                                                    |   3%  |                                                              |==                                                    |   4%  |                                                              |==                                                    |   5%  |                                                              |===                                                   |   5%  |                                                              |===                                                   |   6%  |                                                              |====                                                  |   6%  |                                                              |====                                                  |   7%  |                                                              |====                                                  |   8%  |                                                              |=====                                                 |   8%  |                                                              |=====                                                 |   9%  |                                                              |=====                                                 |  10%  |                                                              |======                                                |  10%  |                                                              |======                                                |  11%  |                                                              |======                                                |  12%  |                                                              |=======                                               |  12%  |                                                              |=======                                               |  13%  |                                                              |=======                                               |  14%  |                                                              |========                                              |  14%  |                                                              |========                                              |  15%  |                                                              |========                                              |  16%  |                                                              |=========                                             |  16%  |                                                              |=========                                             |  17%  |                                                              |=========                                             |  18%  |                                                              |==========                                            |  18%  |                                                              |==========                                            |  19%  |                                                              |===========                                           |  19%  |                                                              |===========                                           |  20%  |                                                              |===========                                           |  21%  |                                                              |============                                          |  21%  |                                                              |============                                          |  22%  |                                                              |============                                          |  23%  |                                                              |=============                                         |  23%  |                                                              |=============                                         |  24%  |                                                              |=============                                         |  25%  |                                                              |==============                                        |  25%  |                                                              |==============                                        |  26%  |                                                              |==============                                        |  27%  |                                                              |===============                                       |  27%  |                                                              |===============                                       |  28%  |                                                              |===============                                       |  29%  |                                                              |================                                      |  29%  |                                                              |================                                      |  30%  |                                                              |================                                      |  31%  |                                                              |=================                                     |  31%  |                                                              |=================                                     |  32%  |                                                              |==================                                    |  32%  |                                                              |==================                                    |  33%  |                                                              |==================                                    |  34%  |                                                              |===================                                   |  34%  |                                                              |===================                                   |  35%  |                                                              |===================                                   |  36%  |                                                              |====================                                  |  36%  |                                                              |====================                                  |  37%  |                                                              |====================                                  |  38%  |                                                              |=====================                                 |  38%  |                                                              |=====================                                 |  39%  |                                                              |=====================                                 |  40%  |                                                              |======================                                |  40%  |                                                              |======================                                |  41%  |                                                              |======================                                |  42%  |                                                              |=======================                               |  42%  |                                                              |=======================                               |  43%  |                                                              |=======================                               |  44%  |                                                              |========================                              |  44%  |                                                              |========================                              |  45%  |                                                              |=========================                             |  45%  |                                                              |=========================                             |  46%  |                                                              |=========================                             |  47%  |                                                              |==========================                            |  47%  |                                                              |==========================                            |  48%  |                                                              |==========================                            |  49%  |                                                              |===========================                           |  49%  |                                                              |===========================                           |  50%  |                                                              |===========================                           |  51%  |                                                              |============================                          |  51%  |                                                              |============================                          |  52%  |                                                              |============================                          |  53%  |                                                              |=============================                         |  53%  |                                                              |=============================                         |  54%  |                                                              |=============================                         |  55%  |                                                              |==============================                        |  55%  |                                                              |==============================                        |  56%  |                                                              |===============================                       |  56%  |                                                              |===============================                       |  57%  |                                                              |===============================                       |  58%  |                                                              |================================                      |  58%  |                                                              |================================                      |  59%  |                                                              |================================                      |  60%  |                                                              |=================================                     |  60%  |                                                              |=================================                     |  61%  |                                                              |=================================                     |  62%  |                                                              |==================================                    |  62%  |                                                              |==================================                    |  63%  |                                                              |==================================                    |  64%  |                                                              |===================================                   |  64%  |                                                              |===================================                   |  65%  |                                                              |===================================                   |  66%  |                                                              |====================================                  |  66%  |                                                              |====================================                  |  67%  |                                                              |====================================                  |  68%  |                                                              |=====================================                 |  68%  |                                                              |=====================================                 |  69%  |                                                              |======================================                |  69%  |                                                              |======================================                |  70%  |                                                              |======================================                |  71%  |                                                              |=======================================               |  71%  |                                                              |=======================================               |  72%  |                                                              |=======================================               |  73%  |                                                              |========================================              |  73%  |                                                              |========================================              |  74%  |                                                              |========================================              |  75%  |                                                              |=========================================             |  75%  |                                                              |=========================================             |  76%  |                                                              |=========================================             |  77%  |                                                              |==========================================            |  77%  |                                                              |==========================================            |  78%  |                                                              |==========================================            |  79%  |                                                              |===========================================           |  79%  |                                                              |===========================================           |  80%  |                                                              |===========================================           |  81%  |                                                              |============================================          |  81%  |                                                              |============================================          |  82%  |                                                              |=============================================         |  82%  |                                                              |=============================================         |  83%  |                                                              |=============================================         |  84%  |                                                              |==============================================        |  84%  |                                                              |==============================================        |  85%  |                                                              |==============================================        |  86%  |                                                              |===============================================       |  86%  |                                                              |===============================================       |  87%  |                                                              |===============================================       |  88%  |                                                              |================================================      |  88%  |                                                              |================================================      |  89%  |                                                              |================================================      |  90%  |                                                              |=================================================     |  90%  |                                                              |=================================================     |  91%  |                                                              |=================================================     |  92%  |                                                              |==================================================    |  92%  |                                                              |==================================================    |  93%  |                                                              |==================================================    |  94%  |                                                              |===================================================   |  94%  |                                                              |===================================================   |  95%  |                                                              |====================================================  |  95%  |                                                              |====================================================  |  96%  |                                                              |====================================================  |  97%  |                                                              |===================================================== |  97%  |                                                              |===================================================== |  98%  |                                                              |===================================================== |  99%  |                                                              |======================================================|  99%  |                                                              |======================================================| 100%

roc_format1 %>% head(10) %>%   knitr::kable()
spe sen\_est sen\_lci sen\_uci 0 1.0000000 1.0000000 1.0000000 0.01 0.9906903 0.9877996 0.9934219 0.02 0.9813805 0.9755991 0.9868438 0.03 0.9720708 0.9633987 0.9802657 0.04 0.9627611 0.9511983 0.9736877 0.05 0.9534514 0.9389978 0.9671096 0.06 0.9441416 0.9267974 0.9605315 0.07 0.9348319 0.9145969 0.9539534 0.08 0.9255222 0.9023965 0.9473753 0.09 0.9162125 0.8901961 0.9407972

There are two ways you can modify roc_plot_format()

2). roc_plot()

Now we have the outputted dataframe from roc_plot_format() (aka roc_format1), we can then plot this as a ROC curve in ggplot.

roc_format1 %>%
  roc_plot()

Accuracy Metrics

However, we can also quantify this assessment numerically Using the roc_metric() function you can get a variety of the most common metrics used to assess the model discrimination / accuracy.

predict1 %>%
  roc_metric(event  = "event", predict = "predict_prop") %>% knitr::kable()

## Setting levels: control = No, case = Yes
model name estimate lci uci metric 1 Cutoff 0.088 NA NA 0.088 1 AUC 0.518 0.447 0.588 0.518 (95% CI: 0.447 to 0.588) 1 Sensitivity 0.537 0.404 0.670 0.537 (95% CI: 0.404 to 0.670) 1 Specificity 0.498 0.457 0.540 0.498 (95% CI: 0.457 to 0.540) 1 Positive Likelihood Ratio 1.070 0.824 1.389 1.070 (95% CI: 0.824 to 1.389) 1 Negative Likelihood Ratio 0.929 0.689 1.253 0.929 (95% CI: 0.689 to 1.253) 1 Positive Predictive Value (PPV) 0.094 0.061 0.126 0.094 (95% CI: 0.061 to 0.126) 1 Negative Predictive Value (NPV) 0.918 0.887 0.949 0.918 (95% CI: 0.887 to 0.949)

However, if you already have a predefined cut-off for risk classification you may want to test the discrimination of that specific cutoff.

For example, you have predefined that patients with a predicted risk of death >=9% would be considered to be “high-risk” (while those with a predicted risk <9% would be considered “low risk”). We can supply a dichotomised variable instead to the roc_metric() function to get the metrics for this specifically.

predict1 %>%
  dplyr::mutate(binary = ifelse(predict_prop>0.09, "High risk", "Low Risk")) %>%
  roc_metric(event  = "event", predict = "binary") %>% knitr::kable()
name estimate lci uci metric Predicted Prevalence 0.0879479 0.0667588 0.1131952 0.088 (95% CI: 0.067 to 0.113) True Prevalence 0.4951140 0.4548580 0.5354172 0.495 (95% CI: 0.455 to 0.535) Diagnostic Accuracy 0.4983713 0.4580979 0.5386605 0.498 (95% CI: 0.458 to 0.539) Sensitivity 0.0822368 0.0539254 0.1190046 0.082 (95% CI: 0.054 to 0.119) Specificity 0.9064516 0.8684150 0.9364518 0.906 (95% CI: 0.868 to 0.936) Positive Likelihood Ratio 0.8790835 0.5273719 1.4653563 0.879 (95% CI: 0.527 to 1.465) Negative Likelihood Ratio 1.0124789 0.9639635 1.0634361 1.012 (95% CI: 0.964 to 1.063) Positive Predictive Value (PPV) 0.4629630 0.3262246 0.6039050 0.463 (95% CI: 0.326 to 0.604) Negative Predictive Value (NPV) 0.5017857 0.4595658 0.5439867 0.502 (95% CI: 0.460 to 0.544)

Comparing accuracy across multiple models

Now let’s say instead of just 1 model, you have multiple models you want to compare using the same data to determine which one is best in your context.

Let’s generate these models and then use predictr() to predict. These have been combined into one dataset (multiple), with the “model” column informing on which predictions belong to which model.

fit2 <- finalfit::glmmulti(data, dependent = "mort365", explanatory = c("rx", "sex"))
fit3 <- finalfit::glmmulti(data, dependent = "mort365", explanatory = c("rx", "sex","obstruct"))
fit4 <- finalfit::glmmulti(data, dependent = "mort365", explanatory = c("rx", "sex","obstruct", "differ"))

multiple <- list(fit1, fit2, fit3, fit4) %>%
  purrr::map_dfr(function(x){predictr(data = data, fit = x)}, .id="model") %>%
  dplyr::mutate(model = factor(model, levels = c(1:4),
                               labels = c("Rx", "Rx & Sex", "Rx, Sex, & Obstruct", "Rx, Sex, Obstruct, & Differ")))

multiple %>% head(10) %>% knitr::kable()
model rowid mort365 rx event predict\_raw predict\_prop sex obstruct differ Rx 1 No Lev+5FU No -2.412336 0.0822368 NA NA NA Rx 2 No Lev+5FU No -2.412336 0.0822368 NA NA NA Rx 3 Yes Lev+5FU Yes -2.412336 0.0822368 NA NA NA Rx 4 No Lev+5FU No -2.412336 0.0822368 NA NA NA Rx 5 No Lev No -2.271059 0.0935484 NA NA NA Rx 6 No Lev No -2.271059 0.0935484 NA NA NA Rx 7 No Lev+5FU No -2.412336 0.0822368 NA NA NA Rx 8 No Lev No -2.271059 0.0935484 NA NA NA Rx 9 No Lev+5FU No -2.412336 0.0822368 NA NA NA Rx 10 No Lev No -2.271059 0.0935484 NA NA NA

Now we’re set up for comparing the accuracy across all the models. This is handled automatically by predictr if we tell it which column (“model”) has the information on which model the prediction is from.

multiple %>%
  roc_metric(model = "model", event  = "event", predict = "predict_prop") %>%
  tidyr::pivot_wider(id_cols = c("name"), names_from = "model", values_from = "metric") %>%
  knitr::kable()

## Setting levels: control = No, case = Yes
## Setting levels: control = No, case = Yes
## Setting levels: control = No, case = Yes
## Setting levels: control = No, case = Yes
name Rx Rx & Sex Rx, Sex, & Obstruct Rx, Sex, Obstruct, & Differ Cutoff 0.088 0.082 0.081 0.090 AUC 0.518 (95% CI: 0.447 to 0.588) 0.518 (95% CI: 0.445 to 0.591) 0.583 (95% CI: 0.501 to 0.664) 0.669 (95% CI: 0.587 to 0.751) Sensitivity 0.537 (95% CI: 0.404 to 0.670) 0.833 (95% CI: 0.734 to 0.933) 0.574 (95% CI: 0.442 to 0.706) 0.574 (95% CI: 0.442 to 0.706) Specificity 0.498 (95% CI: 0.457 to 0.540) 0.236 (95% CI: 0.201 to 0.271) 0.591 (95% CI: 0.550 to 0.632) 0.728 (95% CI: 0.691 to 0.765) Positive Likelihood Ratio 1.070 (95% CI: 0.824 to 1.389) 1.090 (95% CI: 0.959 to 1.239) 1.404 (95% CI: 1.093 to 1.803) 2.110 (95% CI: 1.614 to 2.758) Negative Likelihood Ratio 0.929 (95% CI: 0.689 to 1.253) 0.707 (95% CI: 0.382 to 1.308) 0.721 (95% CI: 0.525 to 0.990) 0.585 (95% CI: 0.427 to 0.801) Positive Predictive Value (PPV) 0.094 (95% CI: 0.061 to 0.126) 0.095 (95% CI: 0.069 to 0.122) 0.119 (95% CI: 0.080 to 0.159) 0.173 (95% CI: 0.118 to 0.229) Negative Predictive Value (NPV) 0.918 (95% CI: 0.887 to 0.949) 0.936 (95% CI: 0.896 to 0.977) 0.935 (95% CI: 0.909 to 0.961) 0.945 (95% CI: 0.923 to 0.967)
 multiple %>%
  roc_plot_format(model = "model", event = "event", predict = "predict_prop",confint = F, smooth = F) %>%
  roc_plot()

From above, we can see that as the number of variables added increases, there is improved accuracy of the model for predicting death at 1 year.



kamclean/predictr documentation built on Aug. 14, 2022, 4:35 a.m.