R/ppp.R In BayesianFROC: FROC Analysis by Bayesian Approaches

Documented in ppp, ppp_MRMC, ppp_srsc

#' @title MRMC or srsc: Posterior
#' Predictive P value (PPP) for MRMC or srsc.
#' @description PPP for chi square
#'  goodness of fit statistic.
#'
#' @details The author is skeptical of the notion of a p value, and this
#' skepticism was part of the motivation for developing a new FROC theory.
#' A well-known caveat is that a p value decreases monotonically as the
#' sample size grows, so a large dataset can yield a small p value even for
#' a practically adequate model.
#'    Some papers (the author forgets the exact references) point out that,
#'    under suitable conditions, a frequentist p value coincides precisely with
#'    the posterior probability of some event (for example, that one mean is greater than another).
#'
#'
#'  Under suitable conditions, the author
#'  conjectures that the Bayesian p value
#'    coincides with the frequentist p value
#'     in some sense, such as analytically,
#'     as a posterior expectation,
#'     or in the limit of many MCMC samples.
#' If so, the sample-size sensitivity of the frequentist p value carries over,
#'  so posterior predictive p values should also be interpreted with caution.
#' Intuitively the notion is appealing, but it does not, in theory,
#'  satisfy every desirable criterion.
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'@inheritParams fit_Bayesian_FROC
#'@inheritParams DrawCurves
#'@inheritParams draw_latent_signal_distribution
#'@param plot Logical, whether
#'replicated datasets are drawn.
#' @return A number between zero and one,
#' the Posterior
#' Predictive P value (PPP). In addition, if \code{plot} is \code{TRUE}, the replicated
#'  datasets used to calculate the PPP are plotted.
#' @export
#'
#' @examples
#'
#'
#'
#' \dontrun{
# ####1#### ####2#### ####3#### ####4#### ####5#### ####6#### ####7#### ####8#### ####9####
#'
#'
#'
#'#   The 1-st example: MRMC data
#'#========================================================================================
#'#                        1)  Fit a Model to MRMC Data
#'#========================================================================================
#'
#'
#'
#'
#'                fit <- fit_Bayesian_FROC( ite  = 33,  dataList = ddd )
#'
#'
#'
#'
#'
#'#========================================================================================
#'#  2)  Evaluate Posterior Predictive P value for the Goodness of Fit
#'#========================================================================================
#'
#'
#'
#'
#'
#'
#'                                 ppp(fit)
#'
#'
#'
#'
#'
#'#  If this quantity, namely the p value, is large,
#'#  then we may say that the goodness of fit is acceptable (we do not reject the null hypothesis).
#'#  In the traditional procedure, if the p value is less than 0.05 or 0.01, then we reject
#'#  the null hypothesis that our model fits the data well.
#'
#'
#'
#'
#'# Of course, even if the p value is small, we should not discard the result outright.
#'# What a p value measures is subtle, and in frequentist methods
#'# p values are known to be sensitive to sample size.
#'# A similar sensitivity may apply in the Bayesian context,
#'# so interpret the PPP with care.
#'
#'
#'
#'
#'
#'
#'#   The 2-nd example uses  data named d
#'#========================================================================================
#'#                  1)  Fit a Model to  Data
#'#========================================================================================
#'
#'
#'
#'
#'                        fitt <- fit_Bayesian_FROC( ite  = 33,  dataList = d )
#'
#'
#'
#'
#'#========================================================================================
#'#  2)  Evaluate Posterior Predictive P value for the Goodness of Fit
#'#========================================================================================
#'
#'
#'
#'                                ppp(fitt)
#'
#'
#'
#'#  If this p value is large, then we may say that the model fits the data well.
#'
#'#  This ppp function was written on 2019 August 25
#'#  and revisited on 15 August 2020.
#'
#'
#'#========================================================================================
#'#                             PPP is problematic
#'#========================================================================================
#'
#'# Consider the dataset:
#'
#'
#' dat <- list(c=c(4,3,2,1),    #     Confidence level. Note that c is ignored.
#'             h=c(77,97,32,31), #     Number of hits for each confidence level
#'             f=c(77,1,14,74),  #     Number of false alarms for each confidence level
#'
#'             NL=259,        #     Number of lesions
#'             NI=57,         #     Number of images
#'             C=4)           #     Number of confidence levels
#'
#'
#'# Fit a model to the data
#'
#'
#'              fit <- fit_Bayesian_FROC(dat, ite = 33)
#'
#'
#'# calculate p value
#'
#'
#'
#'              ppp(fit)
#'
#'
#'
#'# In our model, we expect the monotonicity condition, namely
#'#
#'#    h[1] > h[2] > h[3] > h[4]
#'#    f[1] < f[2] < f[3] < f[4]
#'#
#'#  However, the above dataset is far from satisfying this condition,
#'#  which results in the undesirable p value above.
#'#   Revised 2019 Sept 7; 2020 Aug
#'# Of course, the monotonicity need not hold exactly, but good data
#'# would approximately satisfy it,
#'# since a reader should produce more hits and fewer false positives
#'# at higher confidence levels.
#'
#'
#'
#'
#'      Close_all_graphic_devices() # 2020 August
#'
#'
#'
#'
#'}
#'
#'
#'
#'
ppp <- function(StanS4class, Colour = TRUE, dark_theme = TRUE, plot = TRUE, summary = TRUE) {
  fit <- StanS4class
  studyDesign <- fit@studyDesign

  if (studyDesign == "MRMC") ppp <- ppp_MRMC(fit, summary = summary)

  if (studyDesign == "srsc.per.image") {
    ppp <- ppp_srsc(fit, Colour = Colour, dark_theme = dark_theme, plot = plot, summary = summary)
    ppp <- ppp$p.value
  }

  if (studyDesign == "srsc.per.lesion") {
    ppp <- ppp_srsc(fit, Colour = Colour, dark_theme = dark_theme, plot = plot, summary = summary)
    ppp <- ppp$p.value
  }

  names(ppp) <- "Posterior Predictive P value for chi square goodness of fit"

  if (summary == TRUE) return(ppp) else invisible(ppp)
}

#' @title Calculates PPP for models of a single reader and a single modality
#' @description Calculates the Posterior Predictive P value for the chi square goodness of fit statistic.
#'
#'
#'
#'
# 2020 Sept 17 ------
#' \strong{Appendix: p value}
#'
#'
#'
#'
#' In order to evaluate the goodness of fit of our model to the data, we use the so-called posterior predictive p value.
#'
#' In the following, we use general conventional notations.
#' Let \eqn{y_{obs} } be an observed dataset and \eqn{f(y|\theta)} be a model (likelihood) for a future dataset \eqn{y}. We denote a prior and a posterior distribution by \eqn{\pi(\theta)} and \eqn{\pi(\theta|y) \propto f(y|\theta)\pi(\theta)}, respectively.
#'
#' In our case, the data \eqn{y} is a pair of hits and false alarms; that is, \eqn{y=(H_1,H_2, \dots H_C; F_1,F_2, \dots F_C)} and \eqn{\theta = (z_1,dz_1,dz_2,\dots,dz_{C-1},\mu, \sigma)  }. We define the \eqn{\chi^2} discrepancy (goodness of fit statistic) to validate that our model fits the data:
#' \deqn{ T(y,\theta) := \sum_{c=1}^{C} \biggl( \frac{\bigl(H_c-N_L\times p_c(\theta) \bigr)^2}{N_L\times p_c(\theta)}+  \frac{\bigl(F_c- q_{c}(\theta) \times N_{X}\bigr)^2}{ q_{c}(\theta) \times N_{X} }\biggr) }
#'
#'
#'
#' for a single reader and a single modality.
#'
#'
#' \deqn{   T(y,\theta) := \sum_{r=1}^R \sum_{m=1}^M \sum_{c=1}^C \biggl( \frac{(H_{c,m,r}-N_L\times p_{c,m,r}(\theta))^2}{N_L\times p_{c,m,r}(\theta)}+ \frac{\bigl(F_c- q_{c}(\theta) \times N_{X}\bigr)^2}{ q_{c}(\theta) \times N_{X} }\biggr)}
#'
#' for multiple readers and multiple modalities.
#'
#'
#'
#'
#' Note that \eqn{p_c} and \eqn{q_{c}} depend on \eqn{\theta}.
#'
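The discrepancy for a single reader and a single modality can be sketched numerically in a language-neutral way. Everything below is a made-up illustration (the hit rates `p`, false-alarm rates `q`, and counts are invented), not the package's fitted model:

```python
import numpy as np

def chi_square_discrepancy(h, f, p, q, NL, NX):
    """T(y, theta) for a single reader and a single modality: for each
    confidence level c, compare hits against NL*p_c and false alarms
    against NX*q_c with squared, scaled residuals, then sum over c."""
    h, f, p, q = map(np.asarray, (h, f, p, q))
    expected_hits = NL * p
    expected_false_alarms = NX * q
    return np.sum((h - expected_hits) ** 2 / expected_hits
                  + (f - expected_false_alarms) ** 2 / expected_false_alarms)

# Illustrative (made-up) data with C = 4 confidence levels.
h = [31, 22, 12, 5]           # hits per confidence level
f = [2, 5, 10, 20]            # false alarms per confidence level
p = [0.30, 0.20, 0.12, 0.06]  # hypothetical hit rates p_c(theta)
q = [0.02, 0.05, 0.11, 0.22]  # hypothetical false-alarm rates q_c(theta)
T = chi_square_discrepancy(h, f, p, q, NL=100, NX=90)
```

A small value of `T` means the counts are close to their model expectations at this particular parameter value.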
#'
#'
#'
#'
#' In classical frequentist methods, the parameter \eqn{\theta} is a fixed estimate, e.g., the maximum likelihood estimator. However, in a Bayesian context, the parameter is not deterministic. In the following, we define the p value in the Bayesian sense.
#'
#'
#' Let \eqn{y_{obs}} be an observed dataset (in an FROC context, it is hits and false alarms). Then, the so-called \emph{posterior predictive p value} is defined by
#'
#' \deqn{     p_value   = \int \int  \, dy\, d\theta\, I(  T(y,\theta) > T(y_{obs},\theta) )f(y|\theta)\pi(\theta|y_{obs})  }
#'
#'
#'
#' In order to calculate the above integral, let  \eqn{\theta_1,\theta _2, ......., \theta_i,.......,\theta_I} be samples drawn from the posterior distribution given \eqn{ y_{obs} }, namely,
#'
#' \deqn{  \theta_1  \sim \pi(....|y_{obs} ),}
#' \deqn{ .......,}
#' \deqn{ \theta_i  \sim \pi(....|y_{obs} ),}
#' \deqn{ .......,}
#' \deqn{  \theta_I \sim \pi(....|y_{obs}  ).}
#'
#'
#' Then we obtain a sequence of models (likelihoods), i.e.,  \eqn{f(....|\theta_1),f(....|\theta_2),......., f(....|\theta_I)}.
#' We then draw the samples \eqn{y^1_1,....,y^i_j,.......,y^I_J }, such that each \eqn{y^i_j} is a sample from the distribution whose density function is \eqn{f(....|\theta_i)}, namely,
#'
#' \deqn{ y^1_1,.......,y^1_j,.......,y^1_J  \sim f(....|\theta_1),}
#' \deqn{.......,}
#' \deqn{ y^i_1,.......,y^i_j,.......,y^i_J  \sim f(....|\theta_i),}
#' \deqn{.......,}
#' \deqn{ y^I_1,.......,y^I_j,.......,y^I_J   \sim f(....|\theta_I).}
#'
#'
#' Using the Monte Carlo integral twice, we calculate the  integral of any function \eqn{ \phi(y,\theta)}.
#'
#' \deqn{ \int \int  \, dy\, d\theta\, \phi(y,\theta)f(y|\theta)\pi(\theta|y_{obs}) }
#' \deqn{\approx  \int \,  \frac{1}{I}\sum_{i=1}^I \phi(y,\theta_i)f(y|\theta_i)\,dy}
#' \deqn{\approx    \frac{1}{IJ}\sum_{i=1}^I \sum_{j=1}^J \phi(y^i_j,\theta_i).}
#'
#'
#'
#'
#'
#'
#'
#'
#'
#' In particular, substituting \eqn{\phi(y,\theta):= I(  T(y,\theta) > T(y_{obs},\theta) ) } into the above equation,
#' we can approximate  the posterior predictive p value.
#'
#'
#' \deqn{    p_value   \approx  \frac{1}{IJ}\sum_i \sum_j  I(  T(y^i_j,\theta_i) > T(y_{obs},\theta_i) ) }
#'
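The two-stage Monte Carlo scheme above can be sketched with a deliberately simple toy model. The normal likelihood, the quadratic discrepancy `T`, and all numbers below are invented for illustration only; they are not the package's FROC likelihood or chi-square statistic:

```python
import numpy as np

rng = np.random.default_rng(1)

y_obs = 0.3                              # observed datum (made-up)
theta = rng.normal(0.0, 0.2, size=1000)  # stand-ins for posterior draws theta_i
J = 50                                   # replicates y_ij per posterior draw

def T(y, th):
    # toy discrepancy playing the role of the chi-square statistic
    return (y - th) ** 2

# Draw y_ij ~ f(.|theta_i); here f(.|theta) is Normal(theta, 1).
y_rep = rng.normal(loc=theta[:, None], scale=1.0, size=(theta.size, J))

# ppp ~ (1/IJ) * sum_ij I( T(y_ij, theta_i) > T(y_obs, theta_i) )
ppp = np.mean(T(y_rep, theta[:, None]) > T(y_obs, theta[:, None]))
```

The outer average over `i` approximates the posterior integral, and the inner average over `j` approximates the integral over replicated data.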
#'
#'
#'
# 2020 Sept 17 ------
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'@inheritParams DrawCurves
#'@inheritParams  draw_latent_signal_distribution
#'@inheritParams fit_Bayesian_FROC

#' @author Issei Tsunoda
#' @return A list, including p value and materials to calculate it.
#'
#'Contents of the returned list are the following:
#' \describe{
#' \item{ \code{FPF,TPF,..etc}      }{ Replicated data \eqn{y_{n,j} \sim likelihood ( . |\theta_n).}    }
#' \item{ \code{chisq_at_observed_data}      }{ \deqn{\chi (D|\theta_1), \chi (D|\theta_2), \chi (D|\theta_3),...,\chi (D|\theta_n).} }
#' \item{ \code{chisq_not_at_observed_data}  }{\deqn{\chi (y_1|\theta_1), \chi (y_2|\theta_2), \chi (y_3|\theta_3),...,\chi (y_n|\theta_n).  }}
#' \item{ \code{Logical}                     }{ A logical vector whose i-th component indicates whether  \deqn{\chi (y_i|\theta_i) > \chi (D|\theta_i)} is satisfied. If \code{TRUE}, then the inequality holds.}
#' \item{ \code{p.value}                     }{ From the component \code{Logical}, we calculate the so-called \emph{Posterior Predictive P value}. }
#' }

#'@param plot Logical, whether replicated data are drawn; in the following notation, replicated data are denoted by \eqn{y_1,y_2,...,y_N}.
#'@param replicate.number.from.model.for.each.MCMC.sample A positive integer, representing \eqn{J} in the following notation.
#' @param plot_data A logical, whether the observed data are overlaid on the plot of data synthesized from the posterior predictive distribution.
#'
#'Suppose that  \deqn{\theta_1, \theta_2, \theta_3,...,\theta_N} are \eqn{N} samples drawn from the posterior \eqn{\pi(\theta|D)} of given data \eqn{D}.
#'These \eqn{\theta_i;i=1,2,...} are contained in the stanfit object specified as the variable \code{StanS4class}.
#'
#'
#'Let \eqn{y_1,y_2,...,y_N} be samples drawn in the manner
#'
#'\deqn{y_1 \sim likelihood ( . |\theta_1), }
#'\deqn{y_2 \sim likelihood ( . |\theta_2),}
#'\deqn{y_3 \sim likelihood ( .|\theta_3),}
#'\deqn{...,}
#'\deqn{y_N \sim likelihood ( . |\theta_N).}
#'
#'We repeat this \eqn{J} times; namely,
#'we draw the samples \eqn{y_{n,j}, n=1,..,N; j=1,...,J} so that
#'
#'
#'
#'\deqn{y_{1,j} \sim likelihood ( . |\theta_1), }
#'\deqn{y_{2,j} \sim likelihood ( . |\theta_2),}
#'\deqn{y_{3,j} \sim likelihood ( .|\theta_3),}
#'\deqn{...,}
#'\deqn{y_{n,j} \sim likelihood ( . |\theta_n),}
#'\deqn{...,}
#'\deqn{y_{N,j} \sim likelihood ( . |\theta_N).}
#'
#'
#'The variable \code{replicate.number.from.model.for.each.MCMC.sample} is exactly this \eqn{J}.
#'We can write the scheme more explicitly, without abbreviation, as follows.
#'
#'
#'\deqn{y_{1,1},y_{1,2},...,y_{1,j},...,y_{1,J} \sim likelihood ( . |\theta_1), }
#'\deqn{y_{2,1},y_{2,2},...,y_{2,j},...,y_{2,J} \sim likelihood ( . |\theta_2), }
#'\deqn{y_{3,1},y_{3,2},...,y_{3,j},...,y_{3,J} \sim likelihood ( . |\theta_3), }
#'\deqn{...,}
#'\deqn{y_{n,1},y_{n,2},...,y_{n,j},...,y_{n,J} \sim likelihood ( . |\theta_n), }
#'\deqn{...,}
#'\deqn{y_{N,1},y_{N,2},...,y_{N,j},...,y_{N,J} \sim likelihood ( . |\theta_N). }
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'

#' @details
#' In addition to computing the p value, this function plots the replicated datasets drawn from the model at each MCMC sample generated by Hamiltonian Monte Carlo (HMC).
#' Using HMC,
#' we can draw the MCMC
#' samples of size  \eqn{N}, say  \deqn{\theta_1, \theta_2, \theta_3,...,\theta_N },
#' namely,
#' \deqn{\theta_1  \sim \pi(.|D), }
#' \deqn{\theta_2 \sim \pi(.|D), }
#' \deqn{\theta_3  \sim \pi(.|D),}
#' \deqn{...,}
#' \deqn{\theta_N  \sim \pi(.|D),}
#'    where \eqn{\pi(\theta|D)} is the posterior for given data \eqn{D}.

#'
#'We draw samples as follows.
#'
#'
#'\deqn{y_{1,1},y_{1,2},...,y_{1,j},...,y_{1,J} \sim likelihood ( . |\theta_1), }
#'\deqn{y_{2,1},y_{2,2},...,y_{2,j},...,y_{2,J} \sim likelihood ( . |\theta_2), }
#'\deqn{y_{3,1},y_{3,2},...,y_{3,j},...,y_{3,J} \sim likelihood ( . |\theta_3), }
#'\deqn{...,}
#'\deqn{y_{n,1},y_{n,2},...,y_{n,j},...,y_{n,J} \sim likelihood ( . |\theta_n), }
#'\deqn{...,}
#'\deqn{y_{N,1},y_{N,2},...,y_{N,j},...,y_{N,J} \sim likelihood ( . |\theta_N).}
#'
#' Then we calculate the chi square statistic for each replicated sample:
#'
#'\deqn{ \chi(y_{1,1}|\theta_1), \chi(y_{1,2}|\theta_1), \chi(y_{1,3}|\theta_1),..., \chi(y_{1,j}|\theta_1),...., \chi(y_{1,J}|\theta_1),}
#'\deqn{ \chi(y_{2,1}|\theta_2), \chi(y_{2,2}|\theta_2), \chi(y_{2,3}|\theta_2),..., \chi(y_{2,j}|\theta_2),...., \chi(y_{2,J}|\theta_2),}
#'\deqn{ \chi(y_{3,1}|\theta_3), \chi(y_{3,2}|\theta_3), \chi(y_{3,3}|\theta_3),..., \chi(y_{3,j}|\theta_3),...., \chi(y_{3,J}|\theta_3),}
#'\deqn{...,}
#'\deqn{ \chi(y_{i,1}|\theta_i), \chi(y_{i,2}|\theta_i), \chi(y_{i,3}|\theta_i),..., \chi(y_{i,j}|\theta_i),...., \chi(y_{i,J}|\theta_i),}
#'\deqn{...,}
#'\deqn{ \chi(y_{I,1}|\theta_I), \chi(y_{I,2}|\theta_I), \chi(y_{I,3}|\theta_I),..., \chi(y_{I,j}|\theta_I),...., \chi(y_{I,J}|\theta_I).}
#'
#'
#'where \eqn{f ( . |\theta_i)} is the likelihood at parameter \eqn{\theta_i}.
#'
#'
#'
#' Let \eqn{ \chi(y|\theta)  } be the chi square goodness of fit statistic of our hierarchical Bayesian model (the MRMC case):
#'
#'
#' \deqn{\chi(y|\theta) := \sum_{r=1}^R \sum_{m=1}^M \sum_{c=1}^C ( \frac { ( H_{c,m,r}-N_L\times p_{c,m,r})^2}{N_L\times p_{c,m,r}} + \frac{(F_{c,m,r}-(\lambda_{c} -\lambda_{c+1} )\times N_{L})^2}{(\lambda_{c} -\lambda_{c+1} )\times N_{L} }).}
#'
#'
#' and the chi square goodness of fit statistic of our non-hierarchical Bayesian model (a single reader and a single modality):
#'
#'\deqn{\chi(y|\theta) :=  \sum_{c=1}^C \biggl( \frac{( H_{c}-N_L\times p_{c})^2}{N_L\times p_{c}} + \frac{(F_{c}-(\lambda_{c} -\lambda_{c+1} )\times N_{L})^2}{(\lambda_{c} -\lambda_{c+1} )\times N_{L} }\biggr),}
#'
#'where the dataset \eqn{y} denotes \eqn{ (F_{c,m,r}, H_{c,m,r}) } in the MRMC case and  \eqn{ (F_{c}, H_{c}) } in the case of a single reader and a single modality,
#'and \eqn{\theta} is the model parameter.
#'
#'
#' Then we can calculate the \emph{posterior predictive p value} for a given dataset \eqn{y_0}.
#'
#' \deqn{ \int \int I( \chi(y|\theta) > \chi(y_0|\theta) )   f(y|\theta) \pi(\theta|y_0) d \theta d y  }
#' \deqn{ \approx \int \frac{1}{I}\sum_{i=1}^I I( \chi(y|\theta_i) > \chi(y_0|\theta_i) )   f(y|\theta_i) d y   }
#' \deqn{ \approx \frac{1}{IJ}\sum_{j=1}^J \sum_{i=1}^I I( \chi(y_{i,j}|\theta_i) > \chi(y_0|\theta_i) ).     }

#'
#'
#'
#'   When we plot these synthesized datasets \eqn{y_{i,j}}, we use \code{jitter()}, which adds a small amount of noise to \strong{avoid overlapping points}.
#' For example, \code{jitter(c(1,1,1,1))} returns values such as \code{1.0161940 1.0175678 0.9862400 0.9986126}, in which
#' each \code{1} is perturbed by a tiny error so that the plotted points do not overlap exactly.
#' This program was written to fix the calculation of a previous release; it now calculates the correct p value.
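The effect of `jitter()` can be mimicked in a language-neutral way. Note that R's `jitter()` chooses its noise amplitude from the data, whereas the fixed `amount` below is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def jitter(x, amount=0.02):
    """Add small uniform noise to each point, as R's jitter() does,
    so that identical values no longer plot on top of each other."""
    x = np.asarray(x, dtype=float)
    return x + rng.uniform(-amount, amount, size=x.shape)

pts = jitter([1.0, 1.0, 1.0, 1.0])  # four formerly identical points
```

After jittering, the four values are all distinct but remain within `amount` of the original value.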
#'
#' The author also computes the ppp in the Shiny-based graphical user interface for MRMC, which should accommodate
#' varying numbers of readers and modalities and generate the required ID vectors automatically.
#'
#'
#' The reason the author plots the data drawn from the \strong{posterior predictive likelihoods at each MCMC parameter} is
#' to check that the program is correct, that is, that the draws are well mixed.
#'
#'Using this function, the user can obtain \strong{reliable posterior predictive p values}.
#'
#'
#'
#' We note that the calculation of the posterior predictive p value (PPP) relies on the law of large numbers.
#' Thus, in order to obtain a reliable PPP, we need sufficiently many MCMC samples to approximate
#' the double integral defining the PPP.
#' For example, if the number of MCMC samples is small, then R hat is far from 1, and such a short chain can lead
#' to an incorrect p value that suggests the model is adequate even when the R hat criterion
#' rejects the MCMC result.
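The dependence on the number of samples can be seen with a generic Monte Carlo estimate of a tail probability; the target probability and the sample sizes below are arbitrary illustrations, not quantities from the package:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(42)

# True P(X > 1) for a standard normal variate X, about 0.1587.
p_true = 0.5 * (1 - erf(1 / sqrt(2)))

def mc_estimate(n):
    """Estimate P(X > 1) from n standard-normal draws; by the law of
    large numbers the error shrinks like 1/sqrt(n)."""
    return np.mean(rng.standard_normal(n) > 1.0)

small = mc_estimate(50)       # few samples: estimate can be far off
large = mc_estimate(200_000)  # many samples: estimate is close to p_true
```

The same reasoning applies to the PPP: its double Monte Carlo integral is trustworthy only when the number of MCMC samples and replicates is large.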
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#' @export
#'
#' @examples
#'
#'
#' \dontrun{
#'
#'#========================================================================================
#'#            1) Create a fitted model object with data named  "d"
#'#========================================================================================
#'
#'
#'
#' fit <- fit_Bayesian_FROC( dataList = d,
#'                               ite  = 222 # to restrict running time, but it is too small
#'                            )
#'
#'
#'
#'#========================================================================================
#'#            2) Calculate p value and meta data
#'#========================================================================================
#'
#'
#'
#'             ppp <- ppp_srsc(fit)
#'
#'
#'
#'#========================================================================================
#'#            3) Extract a p value
#'#========================================================================================
#'
#'
#'
#'
#'               ppp$p.value
#'
#'
#'# Revised 2019 August 19
#'# Revised 2019 Nov 27
#'
#' Close_all_graphic_devices() # 2020 August
#'}
#'
#'
#'
ppp_srsc <- function(
StanS4class,
Colour = TRUE,
dark_theme = TRUE,
plot = TRUE,
summary = FALSE,
plot_data = TRUE,
replicate.number.from.model.for.each.MCMC.sample = 100 # If this is not 100, then the code does not show the process indicator,,, why?? #2019 Sept 8
){

fit <- StanS4class

if (fit@studyDesign == "srsc.per.image")  ModifiedPoisson <- FALSE
if (fit@studyDesign == "srsc.per.lesion") ModifiedPoisson <- TRUE

C <- fit@dataList$C
N<-C
c<-C:1
NL <- fit@dataList$NL
NI <- fit@dataList$NI

if (ModifiedPoisson) NX<-NL#2020Feb
if (!ModifiedPoisson) NX<-NI#NX   2020Feb----

e <-rstan::extract(fit)

w <-vector()
z <- list()
dz <- list()
p <- list()
hitrate <- list()

l <- list()
dl <- list()
m <- vector()
v  <- vector()
a <- vector()
b  <- vector()
dataList <-list()
h <- list()
f <- list()
h.per <- list()
f.per <- list()
estimates <-list()
# chisq_Not_at_observed_data<- vector()
extract.TRUE.list<- list()#2019 Sept 8
Logical.list <- list() #2019 Sept 8
chisq_at_observed_data <- chi_square_goodness_of_fit(
  fit, dig = 4, h = fit@dataList$h, f = fit@dataList$f
)
MCMC <- length(e$w)

h.list <- list()#2019 Sept 8
f.list <- list()#2019 Sept 8
h.per.list <- list()#2019 Sept 8
f.per.list <- list()#2019 Sept 8
Logical.list <- list()#2019 Sept 8
# extract.TRUE.list <- list()#2019 Sept 8
FPF <- list()#2019 Sept 8
TPF <- list()#2019 Sept 8
chisq_Not_at_observed_data.list <- list()#2019 Sept 8

# process indicator ---------
message("\n* Process for calculation of the posterior predictive p-value. \n")
cat(0," -> ")

sss <- 0 # dummy counter for the process indicator
Divisor <- 100
if(replicate.number.from.model.for.each.MCMC.sample < 100){ Divisor <- 1 }
ttt <- 1
uuu <- c(" "," "," "," "," "," "," "," "," ","")

#seed ----
for (seed in 1:replicate.number.from.model.for.each.MCMC.sample) {#2019 Sept 8

# process indicator ---------
if(seed %% round(replicate.number.from.model.for.each.MCMC.sample/Divisor) == 0){
  sss <- sss + 1
  if(sss %% 10 == 0){
    message(paste(" [ ", sss, uuu[ttt], " % ] ", sep = ""))
    ttt <- ttt + 1
  }
  if(!sss == 100){ cat(sss %% 10, " -> ") }
}

set.seed(seed = seed)
chisq_Not_at_observed_data.list[[seed]] <- vector()#2019 Sept 8
Logical.list[[seed]] <- vector()#2019 Sept 8

for (mcmc in 1:MCMC) {

w[mcmc] <- e$w[mcmc]
z[[mcmc]] <- e$z[mcmc,]
p[[mcmc]] <- e$p[mcmc,]
hitrate[[mcmc]] <- t(apply(e$p, hit_rate_adjusted_from_the_vector_p, MARGIN = 1))[mcmc,]
dz[[mcmc]] <- e$dz[mcmc,]
l[[mcmc]] <- e$l[mcmc,]
dl[[mcmc]] <- e$dl[mcmc,]
m[mcmc] <- e$m[mcmc]
v[mcmc] <- e$v[mcmc]
a[mcmc] <- e$a[mcmc]
b[mcmc] <- e$b[mcmc]

h[[mcmc]] <-vector()
f[[mcmc]] <-vector()
h.per[[mcmc]] <-vector()
f.per[[mcmc]] <-vector()
s <-0# 2020 Feb This is a key idea

for (n in 1:N) {
if(!n==1)  s <- s + h[[mcmc]][n-1]# 2020 Feb This is a key idea

h[[mcmc]][n] <- stats::rbinom(1, NL-s, hitrate[[mcmc]][c[n]])
# f[[mcmc]][n] <- stats::rpois(1,dl[[mcmc]][c[n]]*NI)#2020Feb
f[[mcmc]][n] <- stats::rpois(1,dl[[mcmc]][c[n]]*NX)#2020Feb

}

chisq_Not_at_observed_data.list[[seed]][mcmc]<- chi_square_goodness_of_fit_from_input_all_param(
h= h[[mcmc]],
f= f[[mcmc]],
# p=p[[mcmc]],
p=hitrate[[mcmc]],
lambda=l[[mcmc]],
NI=NI,
NL=NL,
ModifiedPoisson=ModifiedPoisson,
dig = 4
)

# browser()

for (n in 1:N) {
h.per[[mcmc]][c[n]] <- h[[mcmc]][c[n]]/NL
# f.per[[mcmc]][c[n]] <- f[[mcmc]][c[n]]/NI#2020Feb
f.per[[mcmc]][c[n]] <- f[[mcmc]][c[n]]/NX #2020Feb

}

}#for (mcmc in 1:MCMC) {
# Inequality of comparison of chisq -----
Logical.list[[seed]] <- chisq_Not_at_observed_data.list[[seed]]>chisq_at_observed_data
# extract.TRUE.list[[seed]] <-the_row_number_of_logical_vector( Logical.list[[seed]])

h.list[[seed]] <-  h #2019 Sept 8
f.list[[seed]] <- f#2019 Sept 8
h.per.list[[seed]] <-h.per#2019 Sept 8
f.per.list[[seed]] <-f.per#2019 Sept 8

# Logical.list[[seed]] <-Logical#2019 Sept 8
# extract.TRUE.list[[seed]] <-extract.TRUE#2019 Sept 8

FPF[[seed]] <-  lapply(f.per, cumsum)
TPF[[seed]] <-  lapply(h.per, cumsum)

}#for  j in 1:replicate.number.from.model.for.each.MCMC.sample

# extract.TRUE  <-  unlist(extract.TRUE.list)
Logical<-  unlist(Logical.list)

# p.value <- length(extract.TRUE)/(MCMC*replicate.number.from.model.for.each.MCMC.sample)
# p.value -----
p.value <- mean(Logical) # This counts the rate of TRUE and it corresponds the double integral

main <- paste("P value = ", p.value)

# label the plot title by p value band; the original bands overlapped and one was unreachable
if(p.value < 0.01){  main <-  substitute(paste("P value = ", p.value, ", model is rejected  (';", omega,";)" ))} # 2019 Jun 22 demo(plotmath)
if(p.value >= 0.01 && p.value < 0.05){  main <-  substitute(paste("P value = ", p.value, ", model may be rejected  (';", omega,";)" ))}
if(p.value >= 0.05 && p.value < 0.5){  main <-  substitute(paste("P value = ", p.value, ", model is not rejected  ('",degree, omega,degree,")" ))}
if(p.value >= 0.5 && p.value < 0.7){  main <-  substitute(paste("P value = ", p.value, ", model is not rejected  (",minute,hat(""), omega,hat(""),")" ))}
if(p.value >= 0.7){  main <-  substitute(paste("P value = ", p.value, ", model is not rejected  (",minute, omega,")" ))}

if(plot==TRUE){

message(" Now, we plot the replicated datasets... wait...")

upper_lim_x <- max(unlist(FPF))
# upper_lim_y <- max(1,unlist(TPF))
upper_lim_y <- max(unlist(TPF))

if (dark_theme==TRUE)   dark_theme()

FPF <- jitter(unlist(FPF))
TPF <- jitter(unlist(TPF))
BayesianFROC::small_margin()

plot(FPF,
TPF,main =main,
cex=0.2,
xlim = c(0, upper_lim_x),
ylim = c(0, upper_lim_y )
)
graphics::abline(h=1)

if (fit@studyDesign=="srsc.per.image")    xlab = 'Replicated false positives per images from the posterior predictive distribution.'
if (fit@studyDesign=="srsc.per.lesion" )    xlab = 'Replicated false positives per nodule from the posterior predictive distribution.'
ylab<- "Replicated cumulative hits per lesion"
# main <-"Replicated FROC data from the posterior predictive distribuion."

# plot colored plot ----
if (Colour==TRUE) {

#Making a dataset to draw a counter plot
x <-unlist(FPF)
y<- unlist(TPF)
names(x)<-"CFPs from the posterior predictive distribution"
names(y)<-"CTPs from the posterior predictive distribution"
group <-  rep( 1:C, length(FPF) )

# plot(x,y, pch=20, cex=0.8, col=grDevices::rainbow(C+6)[group])

BayesianFROC::small_margin()
plot(x,y, pch=20, col=grDevices::rainbow(C+6, alpha=0.2)[group],
cex=0.8, xlab=xlab, ylab=ylab, main =main,
xlim = c(0, upper_lim_x),
ylim = c(0,upper_lim_y))
# car::dataEllipse(x,y,factor(group), levels=c(0.70,0.85,0.95),lwd=0.1,
#                  plot.points=FALSE, col=grDevices::rainbow(C+6), group.labels=NA, center.pch=FALSE)

}#Colour==TRUE

# from here2020 AUgust
if(plot_data){
suppressWarnings(graphics::par(new=TRUE));
plot(fit@metadata$ff, fit@metadata$hh,col="red",
cex=1,
xlim = c(0, upper_lim_x),
ylim = c(0, upper_lim_y )
)
suppressWarnings(graphics::par(new=TRUE));
plot(fit@metadata$ff, fit@metadata$hh,col="red",
cex=2,
xlim = c(0, upper_lim_x),
ylim = c(0, upper_lim_y )
)
suppressWarnings(graphics::par(new=TRUE));
plot(fit@metadata$ff, fit@metadata$hh,col="blue",
cex=2.5,
xlim = c(0, upper_lim_x),
ylim = c(0, upper_lim_y )
)
suppressWarnings(graphics::par(new=TRUE));
plot(fit@metadata$ff, fit@metadata$hh,col="green",
cex=3.5,
xlim = c(0, upper_lim_x),
ylim = c(0, upper_lim_y )
)

# from here2020 AUgust
}

#plot froc curve -----
DrawCurves(fit,upper_x = upper_lim_x, upper_y = upper_lim_y,new.imaging.device = FALSE,title= FALSE)

}#plot==TRUE

if (summary==TRUE){

return( list(
p.value=p.value,

Logical=Logical,
chisq_at_observed_data=chisq_at_observed_data,

chisq_Not_at_observed_data.list=chisq_Not_at_observed_data.list,
# q.value=q.value,

FPF=FPF,
TPF=TPF,
h.list=h.list, #2019 Sept 8
f.list=f.list,#2019 Sept 8
h.per.list=h.per.list,#2019 Sept 8
f.per.list=f.per.list,#2019 Sept 8

Logical.list=Logical.list,#2019 Sept 8
extract.TRUE.list=extract.TRUE.list#2019 Sept 8

)
)}

invisible( list(
p.value=p.value,

Logical=Logical,
chisq_at_observed_data=chisq_at_observed_data,

chisq_Not_at_observed_data.list=chisq_Not_at_observed_data.list,
# q.value=q.value,
FPF=FPF,
TPF=TPF,

h.list=h.list, #2019 Sept 8
f.list=f.list,#2019 Sept 8
h.per.list=h.per.list,#2019 Sept 8
f.per.list=f.per.list,#2019 Sept 8

Logical.list=Logical.list,#2019 Sept 8
extract.TRUE.list=extract.TRUE.list#2019 Sept 8

)
)

}#function

#' @title MRMC: Posterior Predictive P value (PPP) for MRMC
#' @description PPP for chi square goodness of fit statistic
#' @details The author is skeptical of the notion of a p value, and this skepticism motivated the development of a new theory without p values.
#' The p value decreases monotonically with respect to sample size, and what it measures is often unclear;
#' even when its intuition is good, it does not, in theory, ensure the desired criteria in the large-sample context.
#'
#' In practice, this PPP tends to be small for various datasets,
#' which may indicate that the model should be revised.
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'
#'@inheritParams fit_Bayesian_FROC
#'@inheritParams ppp_srsc

#'@inheritParams DrawCurves
#'
#' @return A number between zero and one, the Posterior Predictive P value (PPP).
#' @export
#'
#' @examples
#'
#'
#'
#' \dontrun{
#'#========================================================================================
#'#  1)  Fit a Model to MRMC Data
#'#========================================================================================
#'
#'   fit <- fit_Bayesian_FROC( ite  = 33,  dataList = ddd )
#'
#'#========================================================================================
#'#  2)  Evaluate Posterior Predictive P value for the Goodness of Fit
#'#========================================================================================
#'
#'      ppp_MRMC(fit)
#'
#'#  If this quantity is large, then we may say that the model fits the data well.
#'
#'#  This ppp function was written on 2019 August 25.
#'
#'      Close_all_graphic_devices() # 2020 August
#'}
#'
#'
#'
#'
ppp_MRMC <- function(StanS4class,
summary = TRUE,
replicate.number.from.model.for.each.MCMC.sample =2#2019 Sept 8

){

fit <-StanS4class

b <-Chi_square_goodness_of_fit_in_case_of_MRMC_Posterior_Mean(fit,summary = summary)
chi.square.observed.data.and.MCMC.samples <- b$chi.square

a.list <- list()
c.list <- list()
chi.square.replicated.data.and.MCMC.samples <- list()

for (seed in 1:replicate.number.from.model.for.each.MCMC.sample) {#2019 Sept 8

a.list[[seed]] <- vector()
c.list[[seed]] <- vector()
set.seed(seed = seed)
a.list[[seed]] <- chi_square_at_replicated_data_and_MCMC_samples_MRMC(fit, seed = seed, serial.number = replicate.number.from.model.for.each.MCMC.sample)
chi.square.replicated.data.and.MCMC.samples[[seed]] <- a.list[[seed]]$chi.square
c.list[[seed]] <- chi.square.replicated.data.and.MCMC.samples[[seed]] - chi.square.observed.data.and.MCMC.samples

}#2019 Sept 8

c<-unlist(c.list)

d <- c>0

ppp <- mean(d)
return(ppp)
}

BayesianFROC documentation built on Jan. 13, 2021, 5:22 a.m.