# fit_a_model_to: Fit a model to data In BayesianFROC: FROC Analysis by Bayesian Approaches

## Description

Fit a model to data.

## Usage

    fit_a_model_to(
      dataList,
      number_of_parallel_chains_for_MCMC = 1,
      number_of_iterations_for_MCMC = 1111,
      seed_for_MCMC = 1234,
      ...
    )

## Arguments

dataList

A list specifying the FROC data to which a model is fitted. It consists of the numbers of TPs, FPs, lesions, and images. In the case of multiple readers or multiple modalities, it also includes the modality ID and reader ID.

The dataList is passed to the function rstan::sampling() of rstan, where it corresponds to the argument named data.

For single reader and single modality data, the dataList is built as follows:

    dataList.Example <- list(
      h  = c(41, 22, 14, 8, 1),   # number of hits for each confidence level
      f  = c(1, 2, 5, 11, 13),    # number of false alarms for each confidence level
      NL = 124,                   # number of lesions (signals)
      NI = 63,                    # number of images (trials)
      C  = 5                      # number of confidence levels (equal to the length of h and f)
    )

Using this object dataList.Example, we can apply fit_Bayesian_FROC() such as fit_Bayesian_FROC(dataList.Example).

To make this R object dataList representing FROC data, this package provides three functions:

 convertFromJafroc()

Use this if the data are in the JAFROC xlsx format.

 dataset_creator_new_version()

Enter TP and FP data in a table.

 create_dataset()

Enter TP and FP data interactively.

Before fitting a model, we can confirm that our dataset is correctly formulated by using the function viewdata().

A single reader and a single modality (SRSC) case

In a single reader and a single modality case (srsc), dataList is a list consisting of f, h, NL, NI, C where f, h are numeric vectors and NL, NI, C are positive integers.

f

Non-negative integer vector specifying the number of false alarms associated with each confidence level. The first component corresponds to the highest confidence level.

h

Non-negative integer vector specifying the number of hits associated with each confidence level. The first component corresponds to the highest confidence level.

NL

A positive integer, representing Number of Lesions.

NI

A positive integer, representing Number of Images.

C

A positive integer, representing the number of confidence levels.

For details of these datasets, see the example datasets included in this package. Note that the number of confidence levels, C, is included, but the confidence level vector c should not be specified. If specified, it is ignored, because it is created internally by c <- c(rep(C:1)) and is not read from the user's input, where C is the number of confidence levels (and hence the highest level). You should therefore write your hits and false alarms vectors so that they are compatible with this automatically created vector c.
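The constraints on f, h, NL, NI and C described above can be checked before fitting. The following is a minimal sketch in base R; check_srsc_dataList() is a hypothetical helper written for this illustration and is not part of BayesianFROC. The check sum(h) <= NL is an assumption that follows from the model treating NL minus the total number of hits as a count.

```r
# Hypothetical helper (not part of BayesianFROC): basic structural checks
# for a single reader, single modality dataList.
check_srsc_dataList <- function(dataList) {
  # All five components must be present.
  stopifnot(all(c("h", "f", "NL", "NI", "C") %in% names(dataList)))

  is_count <- function(x) is.numeric(x) && all(x >= 0) && all(x == round(x))
  h  <- dataList$h
  f  <- dataList$f
  NL <- dataList$NL
  NI <- dataList$NI
  C  <- dataList$C

  stopifnot(
    is_count(h), is_count(f),          # non-negative integer counts
    length(C) == 1, C >= 1,            # C is a positive scalar
    length(h) == C, length(f) == C,    # one entry per confidence level
    NL >= 1, NI >= 1,                  # positive numbers of lesions and images
    sum(h) <= NL                       # assumption: hits cannot exceed lesions
  )
  invisible(TRUE)
}

dataList.Example <- list(
  h  = c(41, 22, 14, 8, 1),
  f  = c(1, 2, 5, 11, 13),
  NL = 124,
  NI = 63,
  C  = 5
)

check_srsc_dataList(dataList.Example)  # passes silently
```

A dataList that fails any of these checks stops with an error naming the violated condition, which is usually quicker to diagnose than an error raised during sampling.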

Data format:

A single reader and a single modality case

    NI = 63, NL = 124

                          confidence level    No. of false alarms    No. of hits
    In R console ->       c                   f                      h
    -------------------   -----------------   --------------------   ----------------
    definitely present    c[1] = 5            f[1] = F_5 = 1         h[1] = H_5 = 41
    probably present      c[2] = 4            f[2] = F_4 = 2         h[2] = H_4 = 22
    equivocal             c[3] = 3            f[3] = F_3 = 5         h[3] = H_3 = 14
    subtle                c[4] = 2            f[4] = F_2 = 11        h[4] = H_2 = 8
    very subtle           c[5] = 1            f[5] = F_1 = 13        h[5] = H_1 = 1

* false alarms = False Positives = FP

* hits = True Positives = TP

Note that in FROC data every confidence level refers to the present (diseased, lesion) case only; there is no confidence level indicating absence. A reader marks a location only when he or she suspects that a lesion is there, and the marked positions generate the hits and false alarms, so each confidence level expresses a degree of belief that a lesion is present. When the reader thinks there is no lesion, no location is marked, and hence no confidence level is needed for that case.

Note that the confidence level vector c (the first column of the table) should not be specified. If specified, it is ignored, because it is created automatically and internally by c <- c(rep(C:1)) and is not read from the user's input, where C is the number of confidence levels. You should therefore check that your data are compatible with the confidence level vector c <- c(rep(C:1)), using the table displayed by the function viewdata().
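To see how the automatically created confidence vector lines up with the user-supplied counts, the alignment can be reproduced in base R. This is only an illustrative stand-in for the table printed by viewdata(), not its actual output; the variable names conf and tab are chosen for this sketch.

```r
# Counts from the example above, highest confidence level first.
h <- c(41, 22, 14, 8, 1)
f <- c(1, 2, 5, 11, 13)
C <- 5

# Same expression as used internally according to the text: rep(C:1) is C:1.
conf <- c(rep(C:1))

# Side-by-side view: row i pairs confidence level conf[i] with f[i] and h[i].
tab <- data.frame(c = conf, f = f, h = h)
print(tab)
```

If the hits and false alarms were entered lowest level first by mistake, rev(h) and rev(f) would restore the ordering expected by this confidence vector.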

Multiple readers and multiple modalities (MRMC) case

In the case of multiple readers and multiple modalities (the MRMC case), to apply the function fit_Bayesian_FROC(), the R list object representing the FROC data must contain the components m, q, c, h, f, NL, C, M, Q.

C

A positive integer (a scalar), representing the highest confidence level, that is, the number of confidence levels.

M

A positive integer, representing the number of modalities.

Q

A positive integer, representing the number of readers.

m

A vector of positive integers, representing the modality ID vector.

q

A vector of positive integers, representing the reader ID vector.

c

A vector of positive integers, representing the confidence level. This vector must be made by rep(rep(C:1), M*Q)

h

A vector of non-negative integers, representing the number of hits.

f

A vector of non-negative integers, representing the number of false alarms.

NL

A positive integer (a scalar), representing the total number of lesions over all images.

Note that the number of confidence levels (denoted by C) is included in the above R object. The confidence level vector itself is created automatically from C. To confirm that false positives and hits are correctly ordered with respect to the automatically generated confidence vector, use the function viewdata(), which shows the table.

Revised 2019 Nov 27. Revised 2019 Dec 5.

Example data.

Multiple readers and multiple modalities (i.e., MRMC)

    Modality ID    Reader ID    Confidence levels    No. of false alarms    No. of hits
    m              q            c                    f                      h
    -----------    ---------    -----------------    -------------------    -----------
    1              1            3                    20                     111
    1              1            2                    29                     55
    1              1            1                    21                     22
    1              2            3                    6                      100
    1              2            2                    15                     44
    1              2            1                    22                     11
    2              1            3                    6                      66
    2              1            2                    24                     55
    2              1            1                    23                     1
    2              2            3                    5                      66
    2              2            2                    30                     55
    2              2            1                    40                     44

* false alarms = False Positives = FP

* hits = True Positives = TP
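The ID vectors in the MRMC example table can be generated with rep() rather than typed by hand. The sketch below assumes the row ordering shown in that table (modality varies slowest, then reader, then confidence level from C down to 1); the names conf and mrmc are chosen for this illustration, and NL is left out because the table does not state it.

```r
C <- 3   # number of confidence levels
M <- 2   # number of modalities
Q <- 2   # number of readers

m    <- rep(1:M, each = C * Q)              # 1 1 1 1 1 1 2 2 2 2 2 2
q    <- rep(rep(1:Q, each = C), times = M)  # 1 1 1 2 2 2 1 1 1 2 2 2
conf <- rep(rep(C:1), M * Q)                # 3 2 1 3 2 1 3 2 1 3 2 1

# Counts copied from the example table, in the same row order.
f <- c(20, 29, 21, 6, 15, 22, 6, 24, 23, 5, 30, 40)
h <- c(111, 55, 22, 100, 44, 11, 66, 55, 1, 66, 55, 44)

# Side-by-side view to check the alignment of IDs and counts.
mrmc <- data.frame(m = m, q = q, c = conf, f = f, h = h)
print(mrmc)
```

Each vector has length C * M * Q, so a length mismatch in f or h shows up immediately as an error from data.frame().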

number_of_parallel_chains_for_MCMC

A positive integer, indicating the number of chains for MCMC. To be passed to the function rstan::sampling() of rstan.

number_of_iterations_for_MCMC

A positive integer, indicating the number of iterations for MCMC. To be passed to the function rstan::sampling() of rstan.

seed_for_MCMC

A positive integer, indicating the seed for MCMC. To be passed to the function rstan::sampling() of rstan.

...

## Details

FROC data to which a model is fitted

The following table is a dataset to which a model is fitted.

                          confidence level    No. of false alarms     No. of hits
                                              (FP: False Positive)    (TP: True Positive)
    -------------------   -----------------   ---------------------   -------------------
    definitely present    5                   F_5                     H_5
    probably present      4                   F_4                     H_4
    equivocal             3                   F_3                     H_3
    subtle                2                   F_2                     H_2
    very subtle           1                   F_1                     H_1

Define

$$p_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} \mathrm{Gaussian}(z \mid \mu, \sigma)\, dz,$$

$$q_c(\theta) := \int_{\theta_c}^{\theta_{c+1}} \frac{d}{dz} \log \Phi(z)\, dz.$$

Note that $\theta_0 := -\infty$.

We extend the vector from $(H_c)_{c=1,2,\dots,C}$ to $(H_c)_{c=0,1,2,\dots,C}$, where $H_0 := N_L - (H_1 + H_2 + \dots + H_C)$.

Then, we assume

$$(H_c)_{c=0,1,2,\dots,C} \sim \mathrm{Multinomial}\bigl((p_c)_{c=0,1,2,\dots,C}\bigr)$$

and

$$F_c \sim \mathrm{Poisson}\bigl(q_c(\theta) N_I\bigr).$$

Recall that $N_I$ denotes the number of images (radiographs, such as X-ray films) and $N_L$ the number of lesions (signals, nodules).
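The quantities in the model above can be evaluated numerically, which may help when checking intuition about the hit and false alarm rates. The sketch below is not the package's Stan implementation. The parameter values are hypothetical, σ is read as a standard deviation, the upper limit θ_{C+1} = +∞ is an assumption (the text only fixes θ_0), and the multinomial size is taken to be N_L because the extended hit vector sums to N_L by construction.

```r
# Hypothetical parameter values, chosen only for illustration.
theta <- c(-0.5, 0.2, 0.9)   # thresholds theta_1 < ... < theta_C (here C = 3)
mu    <- 1.0                 # mean of the Gaussian signal distribution
sigma <- 1.2                 # assumed to be a standard deviation
NL    <- 259                 # number of lesions
NI    <- 57                  # number of images
C     <- length(theta)

# Integration limits theta_0 = -Inf, theta_1, ..., theta_C, theta_{C+1} = +Inf.
# The +Inf upper limit is an assumption of this sketch.
lim <- c(-Inf, theta, Inf)

# p_c for c = 0, 1, ..., C: Gaussian mass between consecutive limits.
p <- diff(pnorm(lim, mean = mu, sd = sigma))

# q_c for c = 1, ..., C: the integral of d/dz log Phi(z) over [theta_c, theta_{c+1}]
# equals log Phi(theta_{c+1}) - log Phi(theta_c).
q <- diff(pnorm(lim[-1], log.p = TRUE))

# Draw one synthetic dataset from the assumed distributions.
set.seed(1234)
H  <- as.vector(rmultinom(1, size = NL, prob = p))  # H_0, H_1, ..., H_C
FP <- rpois(C, lambda = q * NI)                     # F_1, ..., F_C

# In a dataList, h and f are ordered from the highest confidence level down.
h <- rev(H[-1])
f <- rev(FP)
```

Because the limits cover the whole real line, the p_c sum to 1, and each q_c is positive because Φ is increasing.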

fit_Bayesian_FROC() has many redundant arguments. fit_a_model_to() was therefore made by simplifying fit_Bayesian_FROC() so that its set of arguments is minimal. For full details, see the help of fit_Bayesian_FROC().

This function aims to give a simple interface by omitting the unnecessary parameters of fit_Bayesian_FROC().

## Value

A fitted model object of the S4 class named stanfitExtended, which is a class inherited from stanfit.

## See Also

fit_Bayesian_FROC()
## Examples

    ## Not run:
    #============================================================
    #            1) Build a data-set
    #============================================================

    # For a single reader and a single modality case.

    data <- list(c  = c(3, 2, 1),     # Confidence level. Note that c is ignored.
                 h  = c(97, 32, 31),  # Number of hits for each confidence level
                 f  = c(1, 14, 74),   # Number of false alarms for each confidence level
                 NL = 259,            # Number of lesions
                 NI = 57,             # Number of images
                 C  = 3)              # Number of confidence levels

    viewdata(data)

    # where,
    #   c denotes confidence level, i.e., rating of reader.
    #       3 = definitely diseased,
    #       2 = subtle,
    #       1 = very subtle,
    #   h denotes number of hits (True Positives: TP) for each confidence level,
    #   f denotes number of false alarms (False Positives: FP) for each confidence level,
    #   NL denotes number of lesions,
    #   NI denotes number of images.

    # For example, in the above example data,
    #   the number of hits with confidence level 3 is 97,
    #   the number of hits with confidence level 2 is 32,
    #   the number of hits with confidence level 1 is 31,
    #   the number of false alarms with confidence level 3 is 1,
    #   the number of false alarms with confidence level 2 is 14,
    #   the number of false alarms with confidence level 1 is 74.

    #============================================================
    #            2) Fit an FROC model to the above dataset.
    #============================================================

    fit <- BayesianFROC::fit_a_model_to(
      # Dataset to be fitted
      dataList = data,

      # Chosen to run in under 5 seconds; this number of MCMC iterations
      # is too small to obtain reliable estimates
      number_of_iterations_for_MCMC = 1111,

      # The number of chains; larger is better.
      number_of_parallel_chains_for_MCMC = 1
    )

    #============================================================
    #            Fit an FROC model using the multinomial distribution
    #============================================================

    # Chakraborty's model is fitted to the data named "d"
    fit <- fit_Bayesian_FROC(
      multinomial = TRUE,   # <--- here, the multinomial model is declared
      ite = 1111,
      cha = 1,
      summary = TRUE,
      dataList = d          # Example data to which a model is fitted
    )

    ## End(Not run)