In maxwestphal/DTAmc: Diagnostic Test Accuracy studies with multiple comparisons

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(DTAmc)

The goal of this vignette is to illustrate the R package 'DTAmc' with code examples that can be re-run and/or modified.

NOTE: THIS DOCUMENT IS IN AN EARLY STATE. EXPECT SOME BUGS!

Motivating example

# real data example from publication here
123

Important functions

categorize()

Often, binary predictions are not readily available but rather need to be derived from continuous (risk) scores. This can be done via the categorize function.

# simulate continuous scores: 10 observations of 3 markers
set.seed(123)
M <- as.data.frame(mvtnorm::rmvnorm(10, mean=rep(0, 3), sigma=2*diag(3)))
M

## categorize at 0 by default
yhat <- categorize(M)
yhat

## define multiple cutpoints to define multiple decision rules per marker
C <- c(0, 1, 0, 1, 0, 1)
a <- c(1, 1, 2, 2, 3, 3)
categorize(M, C, a)


## this can even be used to do multi-class classification, like this:
C <- matrix(rep(c(-1, 0, 1, -2, 0, 2), 3), ncol=3, byrow = TRUE)
C
categorize(M, C, a)
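
To make the idea of a single cutpoint concrete, here is a minimal base R sketch of thresholding continuous scores. It does not call categorize(); the data frame `scores`, the variable `cutpoint` and the choice of a strict inequality are assumptions made for illustration, and the package may treat values exactly at the cutpoint differently.

```r
# Hypothetical continuous scores for 2 markers (not the vignette's data)
scores <- data.frame(marker1 = c(-1.2, 0.3, 2.1),
                     marker2 = c(0.5, -0.4, 1.5))

# Assumption: a score strictly above the cutpoint becomes a positive prediction (1)
cutpoint <- 0
pred <- as.data.frame(lapply(scores, function(s) as.numeric(s > cutpoint)))
pred
```

Each column of `pred` is then one binary decision rule derived from the corresponding marker.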

compare()

In supervised classification, it is assumed that we have a true set of labels. In medical testing, this is usually called the reference standard provided by an established diagnostic/prognostic tool. We need to compare model predictions against these labels in order to compute model accuracy.

## consider binary prediction from 3 models from previous r chunk
names(yhat) <- paste0("rule", 1:ncol(yhat))
yhat

## assume true labels
y <- c(rep(1, 5), rep(0, 5))

## compare() then marks each prediction as correct (1) or false (0)
compare(yhat, y)
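
The comparison step can be sketched in base R as follows. This is only an illustration of the idea and not the implementation of compare(); the objects `pred`, `truth` and `compared`, as well as the split into a specificity and a sensitivity subgroup, are assumptions.

```r
# Hypothetical predictions from two rules and true labels (not the vignette's data)
pred  <- data.frame(rule1 = c(1, 1, 0, 0),
                    rule2 = c(1, 0, 0, 1))
truth <- c(1, 1, 0, 0)

# 1 = prediction agrees with the true label, 0 = it does not
correct <- as.data.frame(lapply(pred, function(p) as.numeric(p == truth)))

# Assumption: split by true label, since observations with label 1 inform
# sensitivity and observations with label 0 inform specificity
compared <- list(specificity = correct[truth == 0, , drop = FALSE],
                 sensitivity = correct[truth == 1, , drop = FALSE])
compared
```

Column means of the two data frames give the empirical specificity and sensitivity of each rule.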

study_dta()

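study_dta() is the main function of the package. It expects data that have already been compared against the reference standard.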

study_dta(compare(yhat, y))

More details on the study_dta function are provided in the last section.

generate_data()

DTAmc includes a few functions for synthetic data generation:

generate_data_lfc(n=20)

generate_data_roc(n=20)

Remark: Synthetic data are generated at the 'compared' level, meaning that the labels 1 and 0 indicate correct and false predictions, respectively. There is no need to call compare() in addition.

Common workflows

The pipe operator '%>%' allows us to chain together subsequent operations in R. This is useful, as the study_dta function expects preprocessed data indicating correct (1) and false (0) predictions.

M %>%
  categorize() %>%
  compare(y) %>%
  study_dta()

Multiple testing for co-primary endpoints

Specification of hypotheses

The R command

?study_dta

gives an overview of the arguments of the study_dta function.

comparator defines which of the classification rules under consideration serves as the primary comparator
benchmark is a pre-defined accuracy threshold for each subgroup

Together, these imply the system of hypotheses that is considered, namely

$H_0: \forall g \forall j: \theta_j^g \leq \theta_0^g$

In the application of primary interest, diagnostic accuracy studies, this simplifies to $G=2$ subgroups with $\theta^1 = Se$ and $\theta^2 = Sp$ indicating the sensitivity and specificity of a medical test or classification rule. In this case we aim to reject the global null hypothesis

$H_0: \forall j: Se_j \leq Se_0 \wedge Sp_j \leq Sp_0$
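
To relate the notation to data, the following base R sketch computes the empirical sensitivity and specificity of each rule and compares them descriptively with the benchmarks. It is not the inference performed by study_dta(): it ignores estimation uncertainty and applies no multiplicity adjustment. All objects (`sens_data`, `spec_data`, `se0`, `sp0`) and their values are hypothetical.

```r
# Hypothetical 'compared' data: 1 = correct, 0 = false prediction
sens_data <- data.frame(rule1 = c(1, 1, 1, 0),
                        rule2 = c(1, 0, 1, 0))  # observations with true label 1
spec_data <- data.frame(rule1 = c(1, 1, 0, 1),
                        rule2 = c(1, 1, 1, 1))  # observations with true label 0

# Hypothetical benchmarks Se_0 and Sp_0
se0 <- 0.7
sp0 <- 0.7

# Empirical sensitivity and specificity per rule
se_hat <- colMeans(sens_data)
sp_hat <- colMeans(spec_data)

# Co-primary view: a rule is only of interest if both estimates exceed their benchmark
summary_tab <- data.frame(se_hat, sp_hat,
                          both_exceed = se_hat > se0 & sp_hat > sp0)
summary_tab
```

In this toy example only rule1 exceeds both benchmarks; rule2 has perfect empirical specificity but falls short on sensitivity.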