# create_dataList_MRMC: Creates a _Single_ Dataset in Case of MRMC In BayesianFROC: FROC Analysis by Bayesian Approaches

## Description

From given model parameters, creates an FROC dataset for the case of multiple readers and multiple modalities (MRMC for short). The dataset consists of the numbers of hits and false alarms, together with ID vectors for readers, modalities, confidence levels, etc.

The created dataset is a list that can be passed to fit_Bayesian_FROC(). The model parameters are the thresholds and the mean and standard deviation of the signal Gaussian distribution.

## Usage

    create_dataList_MRMC(
      z.truth = BayesianFROC::z_truth,
      mu.truth = BayesianFROC::mu_truth,
      v.truth = BayesianFROC::v_truth,
      NI = 57,
      NL = 142,
      ModifiedPoisson = FALSE,
      seed = 123,
      summary = FALSE
    )

## Arguments

z.truth

Vector (of dimension C) representing the thresholds.

mu.truth

Array of dimension (M,Q). Mean of the signal distribution under the bi-normal assumption.

v.truth

Array of dimension (M,Q). Standard deviation of the signal distribution under the bi-normal assumption.

NI

The number of images.

NL

The number of lesions.

ModifiedPoisson

Logical, that is, TRUE or FALSE.

If ModifiedPoisson = TRUE, then the Poisson rate of false alarms is calculated per lesion, and the model is fitted so that the FROC curve is the expected curve of points consisting of pairs of TPF per lesion and FPF per lesion.

Similarly, if ModifiedPoisson = FALSE, then the Poisson rate of false alarms is calculated per image, and the model is fitted so that the FROC curve is the expected curve of points consisting of pairs of TPF per lesion and FPF per image.

For more details, see the author's paper, which explains the per-image and per-lesion formulations. (For details of the models, see the vignettes; they are currently omitted from this package because of their size.)

If ModifiedPoisson = TRUE, then the False Positive Fraction (FPF) is defined as follows (F_c denotes the number of false alarms with confidence level c):

\frac{F_1+F_2+F_3+F_4+F_5}{N_L}, \frac{F_2+F_3+F_4+F_5}{N_L}, \frac{F_3+F_4+F_5}{N_L}, \frac{F_4+F_5}{N_L}, \frac{F_5}{N_L},

where N_L is the number of lesions (signals). To emphasize its denominator N_L, we also call it the False Positive Fraction (FPF) per lesion.

On the other hand, if ModifiedPoisson = FALSE (the default), then the False Positive Fraction (FPF) is given by

\frac{F_1+F_2+F_3+F_4+F_5}{N_I}, \frac{F_2+F_3+F_4+F_5}{N_I}, \frac{F_3+F_4+F_5}{N_I}, \frac{F_4+F_5}{N_I}, \frac{F_5}{N_I},

where N_I is the number of images (trials). To emphasize its denominator N_I, we also call it the False Positive Fraction (FPF) per image.

The model is fitted so that the estimated FROC curve can be regarded as the expected pairs of FPF per image and TPF per lesion (ModifiedPoisson = FALSE) or as the expected pairs of FPF per lesion and TPF per lesion (ModifiedPoisson = TRUE). Thus the FPF and TPF data change, and hence the fitted model also changes, according to whether ModifiedPoisson is TRUE or FALSE.

Traditional FROC analysis uses only the per-image (per-trial) formulation. Since one image can be divided into two or more images, the number of trials is not important; what matters more is the number of signals. For this reason, the author also developed FROC theory under the per-signal formulation. One can see that the FROC curve is rigid with respect to a change in the number of images, so it does not matter whether ModifiedPoisson = TRUE or FALSE. This rigidity of the curves means that the number of images is a redundant parameter for the FROC trial, and thus the author tries to exclude it.

Revised 2019 Dec 8, 2019 Nov 25, 2019 August 28.

seed

The seed used for creating the hits, which are synthesized from binomial distributions.

summary

Logical: TRUE or FALSE. Whether to print the verbose summary. If TRUE, a verbose summary is printed in the R console. If FALSE, the output is minimal. Regrettably, this argument would have been better named verbose.
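The two FPF definitions differ only in the denominator. The following is a minimal base R sketch of that arithmetic; the false alarm counts are made-up illustrative values, not output of create_dataList_MRMC(), and NI and NL are the defaults shown in Usage.

```r
# Hypothetical false-alarm counts F_1..F_5 (confidence 1 = lowest, 5 = highest).
# These numbers are invented for illustration only.
false_alarms <- c(F1 = 10, F2 = 8, F3 = 5, F4 = 3, F5 = 1)
NI <- 57   # number of images (default in Usage)
NL <- 142  # number of lesions (default in Usage)

# Cumulative counts from each confidence level upward:
# F_1+...+F_5, F_2+...+F_5, ..., F_5
cumulative_fa <- rev(cumsum(rev(unname(false_alarms))))

fpf_per_image  <- cumulative_fa / NI  # ModifiedPoisson = FALSE
fpf_per_lesion <- cumulative_fa / NL  # ModifiedPoisson = TRUE

print(round(fpf_per_image, 3))
print(round(fpf_per_lesion, 3))
```

Both vectors are non-increasing in the confidence level, and they differ by the constant factor NI / NL.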

## Details

By specifying model parameters, we can replicate fake datasets. Different seeds give different fake data. The model parameters are the following.

z.truth

mu.truth

v.truth.

Probability law of hits

The random variables of hits are distributed as follows.

H_{5,m,r} \sim Binomial (p_{5,m,r}(θ), N_L ),

then H_{4,m,r} should be drawn from the binomial distribution on the remaining targets:

H_{4,m,r} \sim Binomial (\frac{p_{4,m,r}(θ)}{1-p_{5,m,r}(θ)}, N_L - H_{5,m,r}).

Similarly, because we have already found H_{4,m,r} + H_{5,m,r} targets, the number of remaining targets is N_L - H_{5,m,r} - H_{4,m,r}. Thus it is natural to assume the following. Note that the hit rate is defined so that the resulting model satisfies certain equations, which are not explained here.

H_{3,m,r} \sim Binomial (\frac{p_{3,m,r}(θ)}{1-p_{5,m,r}(θ)-p_{4,m,r}(θ)}, N_L - H_{5,m,r} -H_{4,m,r}).

H_{2,m,r} \sim Binomial (\frac{p_{2,m,r}(θ)}{1-p_{5,m,r}(θ)-p_{4,m,r}(θ)-p_{3,m,r}(θ)}, N_L - H_{5,m,r} -H_{4,m,r}-H_{3,m,r}).

H_{1,m,r} \sim Binomial (\frac{p_{1,m,r}(θ)}{1-p_{5,m,r}(θ)-p_{4,m,r}(θ)-p_{3,m,r}(θ)-p_{2,m,r}(θ)}, N_L - H_{5,m,r} -H_{4,m,r}-H_{3,m,r}-H_{2,m,r}).
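The sequential scheme above, for one fixed modality and reader, can be sketched in base R as follows. This is only an illustration under stated assumptions: simulate_hits() is a hypothetical helper (not a BayesianFROC function), and the hit rates p are made-up values rather than rates computed from z.truth, mu.truth and v.truth.

```r
# Hypothetical helper, not part of BayesianFROC: draws H_C, ..., H_1 in turn,
# each from a binomial distribution on the lesions not yet hit.
simulate_hits <- function(p, NL, seed = 123) {
  # p: assumed hit rates p_1..p_C with sum(p) <= 1; NL: number of lesions
  stopifnot(all(p >= 0), sum(p) <= 1)
  set.seed(seed)
  C <- length(p)
  hits <- integer(C)
  remaining_lesions <- NL  # N_L minus the hits already drawn
  remaining_prob <- 1      # 1 minus the rates already used
  for (cc in C:1) {
    # Conditional rate p_c / (1 - p_C - ... - p_{c+1}), capped at 1
    # to guard against floating-point round-off.
    cond_p <- if (remaining_prob > 0) min(1, p[cc] / remaining_prob) else 0
    hits[cc] <- rbinom(1, size = remaining_lesions, prob = cond_p)
    remaining_lesions <- remaining_lesions - hits[cc]
    remaining_prob <- remaining_prob - p[cc]
  }
  hits
}

p <- c(0.05, 0.10, 0.15, 0.20, 0.30)  # made-up rates for confidence 1..5
h <- simulate_hits(p, NL = 142)
print(h)
```

Because each draw uses only the lesions that remain, the total number of hits can never exceed N_L, and the same seed reproduces the same hits.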

Probability law of false alarms

F_{5,m,r} \sim Poisson(q_{5,m,r}(θ) N_X ),

F_{4,m,r} \sim Poisson( q_{4,m,r}(θ) N_X ),

F_{3,m,r} \sim Poisson( q_{3,m,r}(θ) N_X ),

F_{2,m,r} \sim Poisson( q_{2,m,r}(θ) N_X ),

F_{1,m,r} \sim Poisson( q_{1,m,r}(θ) N_X ),

where the subscripts m,r denote the m-th modality and the r-th reader, respectively. Note that N_X takes one of the following two values.

1) N_X = N_L (The number of lesions), if  ModifiedPoisson = TRUE.

2) N_X = N_I (The number of images), if  ModifiedPoisson = FALSE.

We fix either N_X = N_L or N_X = N_I throughout this paper.
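The Poisson law for false alarms and the choice of N_X can be sketched in base R as below, for one fixed modality and reader. simulate_false_alarms() is a hypothetical helper written for this illustration (not a BayesianFROC function), and the rates q are made-up values.

```r
# Hypothetical helper, not part of BayesianFROC: draws F_1..F_C independently
# from Poisson distributions with means q_c * N_X.
simulate_false_alarms <- function(q, NI, NL, ModifiedPoisson = FALSE, seed = 123) {
  stopifnot(all(q >= 0))
  set.seed(seed)
  # N_X = N_L (per lesion) if ModifiedPoisson = TRUE, else N_X = N_I (per image)
  NX <- if (ModifiedPoisson) NL else NI
  rpois(length(q), lambda = q * NX)
}

q <- c(0.20, 0.10, 0.05, 0.02, 0.01)  # made-up rates for confidence 1..5
f_image  <- simulate_false_alarms(q, NI = 57, NL = 142, ModifiedPoisson = FALSE)
f_lesion <- simulate_false_alarms(q, NI = 57, NL = 142, ModifiedPoisson = TRUE)
print(f_image)
print(f_lesion)
```

With the same rates q, switching ModifiedPoisson only rescales the Poisson means by N_L / N_I.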

The rates p_{c,m,r}(θ) and q_{c,m,r}(θ) are calculated from the model parameter θ.
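This page does not state the formulas for these rates. As a rough illustration only, the sketch below assumes the parameterisation commonly used in the classical single-reader FROC model: p_c is the mass of a Gaussian signal N(mu, sigma^2) between consecutive thresholds, and q_c is the difference of log Φ between consecutive thresholds, with the last upper threshold taken as infinity. The function rates_from_theta() is hypothetical, and the MRMC rates used inside the package may be defined differently.

```r
# Hypothetical helper, not part of BayesianFROC.
# ASSUMPTION: classical single-reader FROC parameterisation; the package's
# MRMC hit rate may differ from this.
rates_from_theta <- function(z, mu, sigma) {
  # z: increasing thresholds z_1 < ... < z_C; mu, sigma: signal mean and sd
  stopifnot(!is.unsorted(z), sigma > 0)
  upper <- c(z[-1], Inf)  # z_{c+1}, with z_{C+1} = Inf
  # Hit rate: probability that the signal falls in [z_c, z_{c+1})
  p <- pnorm((upper - mu) / sigma) - pnorm((z - mu) / sigma)
  # False alarm rate: log Phi(z_{c+1}) - log Phi(z_c), which is non-negative
  q <- pnorm(upper, log.p = TRUE) - pnorm(z, log.p = TRUE)
  list(p = p, q = q)
}

r <- rates_from_theta(z = c(-0.5, 0, 0.5, 1, 1.5), mu = 1, sigma = 1.2)
print(r$p)
print(r$q)
```

Under this assumption the hit rates sum to at most 1, which is what the conditional binomial probabilities above require.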

In the R code, the model parameter θ is denoted by

z.truth

mu.truth

v.truth.

By specifying these model parameters, we can make a fake dataset consisting of hit data H_{c,m,r} and false alarm data F_{c,m,r} for each c,m,r.

## See Also

chi_square_at_replicated_data_and_MCMC_samples_MRMC()

replicate_MRMC_dataList() (to make many MRMC datasets)

## Examples

    ## Not run:
    dataList <- create_dataList_MRMC()
    fit_Bayesian_FROC(dataList, summary = FALSE, ite = 1111)

    # The example above uses the default values for the true parameters of
    # the distributions. The defaults exist because it is difficult for a user
    # who is unfamiliar with FROC data to know the regions in which the
    # parameters of the FROC model move.
    # The Bayesian model is merely a model for FROC data: if the user inputs
    # abnormal data, the model neither fits nor converges in the
    # Hamiltonian Monte Carlo simulations.

    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC())

    #========================================================================
    # Plot various MRMC datasets with a fixed signal distribution
    # but different thresholds
    #========================================================================

    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(0.1, 0.2, 0.3, 0.4)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-0.1, 0.2, 0.3, 0.4)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, 0.2, 0.3, 0.4)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, -0.2, -0.3, 0.4)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, 0.2, 0.3)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, 1.2, 2.3)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, -0.5, 0, 1.2, 2.3, 3.3, 4)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, -0.5, 0, 1.2, 2.3, 3.3, 4, 5, 6)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, -0.5, 0, 1.2, 2.3, 3.3, 4, 5, 6, 7)))
    plot_FPF_and_TPF_from_a_dataset(create_dataList_MRMC(
      z.truth = c(-1, -0.5, 0, 1.2, 2.3, 3.3, 4, 5, 6, 7, 8, 9, 10)))

    #========================================================================
    # Smoothing of the scatter plot of FPF and TPF
    #========================================================================

    v <- v_truth_creator_for_many_readers_MRMC_data(M = 1, Q = 17)
    m <- mu_truth_creator_for_many_readers_MRMC_data(M = 1, Q = 17)
    d <- create_dataList_MRMC(mu.truth = m, v.truth = v)
    d <- metadata_to_fit_MRMC(d)

    df <- data.frame(FPF = d$ffN, TPF = d$hhN)

    # require(graphics)
    dark_theme()
    graphics::plot(df, main = "lowess(FPF, TPF)")
    graphics::lines(stats::lowess(df), col = 2)
    graphics::lines(stats::lowess(df, f = .2), col = 3)
    # Place the legend by keyword so that it stays inside the plot region
    graphics::legend("bottomright", c(paste("f = ", c("2/3", ".2"))),
                     lty = 1, col = 2:3)

    ## End(Not run)