chi_square_goodness_of_fit_from_input_all_param: Calculates the Goodness of Fit (Chi Square)


View source: R/chi_square_goodness_of_fit.R

Description

The chi square goodness of fit is a function of a dataset y and a model parameter θ, written χ(y|θ). This function simply evaluates it by substituting the given dataset y and model parameter θ into χ(y|θ). Because the function takes many arguments, their roles are summarized here. The dataset y is decomposed into h, f, NI and NL, which denote the numbers of hits, false alarms, images and lesions. The parameter θ corresponds to p and lambda.
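As a rough illustration of the substitution described above, the following is a minimal sketch of a Pearson-type chi square summed over hits and false alarms. It is not the package's source code. The helper name chi_square_sketch, the assumption that the expected hits are NL * p[c] and the expected false alarms are NI * lambda[c] in each category, and the toy numbers are all assumptions made for illustration; the real function may order the confidence levels or parameterize lambda differently.

```r
# Hypothetical helper, NOT part of BayesianFROC.
# Assumes h, f, p and lambda are aligned vectors (one entry per confidence
# level) and that lambda[c] is the expected number of false alarms per image
# in category c. Both points are assumptions of this sketch.
chi_square_sketch <- function(h, f, p, lambda, NL, NI) {
  stopifnot(length(h) == length(f),
            length(h) == length(p),
            length(h) == length(lambda))
  expected_h <- NL * p        # expected hits per category
  expected_f <- NI * lambda   # expected false alarms per category
  sum((h - expected_h)^2 / expected_h) +
    sum((f - expected_f)^2 / expected_f)
}

# Toy input (made-up numbers, for illustration only)
h      <- c(12, 20, 30)
f      <- c(2, 4, 6)
p      <- c(0.1, 0.2, 0.3)
lambda <- c(0.02, 0.04, 0.06)

chi_sq <- chi_square_sketch(h, f, p, lambda, NL = 100, NI = 100)
chi_sq  # a single number
```

In this toy input only the first hit count departs from its expected value (12 observed against 10 expected), so that single term accounts for essentially the whole sum.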

Usage

chi_square_goodness_of_fit_from_input_all_param(
  h,
  f,
  p,
  lambda,
  NL,
  NI,
  ModifiedPoisson = FALSE,
  dig = 3,
  is_print_each_ratings_wise = FALSE
)

Arguments

h

A vector of non-negative integers, indicating the number of hits. This argument exists so that replicated hits drawn from the posterior predictive distribution can be substituted for the observed ones. Gelman's book explains how to construct test statistics in the Bayesian context, which requires samples from the posterior predictive distribution. Replicated data from that distribution are therefore passed through this argument.

f

A vector of non-negative integers, indicating the number of false alarms. This argument exists so that replicated false alarms drawn from the posterior predictive distribution can be substituted for the observed ones. Gelman's book explains how to construct test statistics in the Bayesian context, which requires samples from the posterior predictive distribution. Replicated data from that distribution are therefore passed through this argument.

p

A vector of non-negative real numbers, indicating the hit rates. Its length is the number of confidence levels.

lambda

A vector of non-negative real numbers, indicating the false alarm rates. Its length is the number of confidence levels.

NL

An integer, representing the number of lesions.

NI

An integer, representing the number of images.

ModifiedPoisson

Logical, that is, TRUE or FALSE.

If ModifiedPoisson = TRUE, then the Poisson rate of false alarms is calculated per lesion, and the model is fitted so that the FROC curve is the expected curve of points consisting of pairs of TPF per lesion and FPF per lesion.

Similarly,

if ModifiedPoisson = FALSE, then the Poisson rate of false alarms is calculated per image, and the model is fitted so that the FROC curve is the expected curve of points consisting of pairs of TPF per lesion and FPF per image.

For more details, see the author's paper, which explains the per-image and per-lesion conventions. (For details of the models, see the vignettes; they are currently omitted from this package because of their size.)

If ModifiedPoisson = TRUE, then the False Positive Fraction (FPF) is defined as follows (F_c denotes the number of false alarms with confidence level c )

\frac{F_1+F_2+F_3+F_4+F_5}{N_L},

\frac{F_2+F_3+F_4+F_5}{N_L},

\frac{F_3+F_4+F_5}{N_L},

\frac{F_4+F_5}{N_L},

\frac{F_5}{N_L},

where N_L is the number of lesions (signals). To emphasize its denominator N_L, we also call it the False Positive Fraction (FPF) per lesion.

On the other hand,

if ModifiedPoisson = FALSE (Default), then False Positive Fraction (FPF) is given by

\frac{F_1+F_2+F_3+F_4+F_5}{N_I},

\frac{F_2+F_3+F_4+F_5}{N_I},

\frac{F_3+F_4+F_5}{N_I},

\frac{F_4+F_5}{N_I},

\frac{F_5}{N_I},

where N_I is the number of images (trials). To emphasize its denominator N_I, we also call it the False Positive Fraction (FPF) per image.
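The two FPF definitions above differ only in their denominator, which the following short sketch makes concrete. The vector name Fc, the counts, and the values of NL and NI are made up for illustration, and Fc[c] is taken to be the number of false alarms with confidence level c exactly as in the formulas; the ordering used inside the package's own data objects may differ.

```r
# Fc[c] = number of false alarms with confidence level c
# (toy, made-up counts; the indexing follows the formulas in the text)
Fc <- c(5, 4, 3, 2, 1)
NL <- 50   # number of lesions (made up)
NI <- 25   # number of images  (made up)

# Reverse cumulative sums: F_1+...+F_5, F_2+...+F_5, ..., F_5
cum_F <- rev(cumsum(rev(Fc)))

FPF_per_lesion <- cum_F / NL   # convention used when ModifiedPoisson = TRUE
FPF_per_image  <- cum_F / NI   # convention used when ModifiedPoisson = FALSE
```

Both vectors share the same numerators, so switching ModifiedPoisson rescales the FPF axis by the ratio of NI to NL in this sketch.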

The model is fitted so that the estimated FROC curve can be regarded as the expected pairs of FPF per image and TPF per lesion (ModifiedPoisson = FALSE)

or as the expected pairs of FPF per lesion and TPF per lesion (ModifiedPoisson = TRUE).

If ModifiedPoisson = TRUE, then the FROC curve means the expected pairs of FPF per lesion and TPF.

On the other hand, if ModifiedPoisson = FALSE, then the FROC curve means the expected pairs of FPF per image and TPF.

So the FPF and TPF data change, and hence the fitted model also changes, depending on whether ModifiedPoisson = TRUE or FALSE. Traditional FROC analysis uses only the per-image (trial) convention. Since one image can be divided into two or more images, the number of trials is not essential; what matters more is the number of signals. The author therefore also developed the FROC theory on a per-signal basis. The FROC curve is rigid with respect to changes in the number of images, so it does not matter whether ModifiedPoisson = TRUE or FALSE. This rigidity means that the number of images is a redundant parameter for the FROC trial, and thus the author tries to exclude it.

Revised 2019 Dec 8 Revised 2019 Nov 25 Revised 2019 August 28

dig

A positive integer representing the number of significant digits. It is described as a variable passed to the function rstan::sampling() of rstan; the name it takes there is not stated. Default = 3, as shown in Usage.

is_print_each_ratings_wise

A logical, indicating whether the result is printed on the R (RStudio) console.

Details

The statistic is evaluated for each MCMC sample with the dataset held fixed.

Our data consist of 2C categories, that is,

the number of hits :h[1], h[2], h[3],...,h[C] and

the number of false alarms: f[1],f[2], f[3],...,f[C].

Our model has C+2 parameters, that is,

the thresholds of the binormal assumption z[1],z[2],z[3],...,z[C] and

the mean and standard deviation of the signal distribution.

So, the degrees of freedom of this statistic are calculated as

2C-(C+2)-1 = C-3.

This differs from Chakraborty's result, C-2. Why?
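The count above can be written as a one-line helper, shown here only to make the arithmetic explicit. The function name df_from_C is hypothetical and is not part of the package.

```r
# Degrees of freedom as counted in the text:
# 2C data categories, C + 2 model parameters, minus 1.
# Hypothetical helper, NOT part of BayesianFROC.
df_from_C <- function(C) 2 * C - (C + 2) - 1

df_value <- df_from_C(5)   # 2*5 - (5 + 2) - 1, i.e. C - 3 with C = 5
df_value
```

Note that under this count the result is not positive when C is 3 or smaller, so a chi square reference distribution with these degrees of freedom is only available for C of at least 4.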

Value

A single number (not a list, data frame, or vector) representing the chi square for your input data.

Examples

## Not run:

#  Makes a stanfit object (more precisely, an object of its inherited S4 class)

fit <- fit_Bayesian_FROC(BayesianFROC::dataList.Chakra.1,
                         ite     = 1111,
                         summary = FALSE,
                         cha     = 2)

#  Calculates the chi square discrepancy (Goodness of Fit)
#  with the posterior mean as a parameter.

NI          <- fit@dataList$NI
NL          <- fit@dataList$NL
f.observed  <- fit@dataList$f
h.observed  <- fit@dataList$h
C           <- fit@dataList$C

#      p <- rstan::get_posterior_mean(fit, par = c("p"))
# lambda <- rstan::get_posterior_mean(fit, par = c("l"))
# Note that get_posterior_mean returns not a number but a matrix
# when the number of chains is not 1.
# So, instead of it, we use extract_EAP_CI().

e      <- extract_EAP_CI(fit, "l", fit@dataList$C)
lambda <- e$l.EAP

e <- extract_EAP_CI(fit, "p", fit@dataList$C)
p <- e$p.EAP

Chi.Square <- chi_square_goodness_of_fit_from_input_all_param(
  h      = h.observed,
  f      = f.observed,
  p      = p,
  lambda = lambda,
  NL     = NL,
  NI     = NI
)

#  Print the chi square discrepancy evaluated at the posterior mean.

Chi.Square

#  Calculate the p-value (upper tail probability) for the chi square
#  discrepancy evaluated at the posterior mean.

stats::pchisq(Chi.Square, df = 1, lower.tail = FALSE)

## End(Not run)

BayesianFROC documentation built on Jan. 23, 2022, 9:06 a.m.