#' Bayesian Binomial Rate Parameter Inference
#'
#' Given binomial frequency data, provides a Bayesian analysis for the
#' population binomial rate parameter.
#' @param n1 Integer number of binomial observations for a category 1 response (\emph{e.g.}, the number of successes)
#' @param n2 Integer number of binomial observations for a category 2 response (\emph{e.g.}, the number of failures)
#' @param a0 The first shape parameter for the prior beta distribution that corresponds to the population binomial parameter (default is 1). Must be positive and finite.
#' @param b0 The second shape parameter for the prior beta distribution for the population binomial rate parameter (default is 1). Must be positive and finite.
#' @param prob_interval Probability within interval estimates for the population binomial rate parameter (default is .95)
#'
#' @return A list containing the following components:
#' @return \item{n1}{Observed number of category 1 responses}
#' @return \item{n2}{Observed number of category 2 responses}
#' @return \item{a0}{First shape parameter for the prior beta distribution of the binomial rate parameter}
#' @return \item{b0}{Second shape parameter for the prior beta distribution of the binomial rate parameter}
#' @return \item{prob_interval}{Probability within interval estimates for the population binomial rate parameter}
#' @return \item{a_post}{First shape parameter for the posterior beta distribution for the binomial rate parameter}
#' @return \item{b_post}{Second shape parameter for the posterior beta distribution for the binomial rate parameter}
#' @return \item{phimean}{Mean of the posterior beta distribution for the binomial rate parameter}
#' @return \item{phimedian}{Median of the posterior beta distribution for the binomial rate parameter}
#' @return \item{phimode}{Mode of the posterior beta distribution for the binomial rate parameter}
#' @return \item{eti_lower}{Lower limit for the posterior equal-tail interval that has the probability stipulated in the \code{prob_interval} argument}
#' @return \item{eti_upper}{Upper limit for the posterior equal-tail interval that has the probability stipulated in the \code{prob_interval} argument}
#' @return \item{hdi_lower}{Lower limit for the posterior highest-density interval that has the probability stipulated in the \code{prob_interval} argument}
#' @return \item{hdi_upper}{Upper limit for the posterior highest-density interval that has the probability stipulated in the \code{prob_interval} argument}
#'
#' @details
#'
#' The binomial distribution with size = \eqn{n} and probability = \eqn{\phi} has
#' discrete probabilities
#' \deqn{p(x) = \frac{n!}{x!(n - x)!}\phi^{x}(1-\phi)^{n-x}}
#' where \eqn{x} is an integer from 0 to \eqn{n} in steps of 1. The binomial model
#' assumes a Bernoulli process of independent trials where there are binary
#' outcomes that have the same probability (say, \eqn{\phi}) for a response in
#' one of the two categories and a probability of \eqn{1-\phi} for the other
#' category. Before any data are collected, there are \eqn{n + 1} possible
#' values for \eqn{x}, the number of outcomes in category 1, with the remaining
#' \eqn{n - x} outcomes in category 2. The binomial distribution is a likelihood
#' distribution. A likelihood is the probability of an outcome given a specific
#' value for the population rate parameter. Yet for real applications, the
#' population parameter is not known. All that is known are the outcomes
#' observed from a set of binomial trials. The binomial inference problem is to
#' estimate the population \eqn{\phi} parameter based on the sample data.
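#'
#' As a simple illustration of a likelihood value, the probability of a
#' particular outcome for an assumed \eqn{\phi} can be computed with the base R
#' \code{dbinom()} function. The sketch below uses the data from the examples
#' (16 category 1 responses out of 18 trials) with an assumed \eqn{\phi} of .8
#' chosen only for illustration:
#' \preformatted{
#' # likelihood of x = 16 category 1 responses in n = 18 trials
#' # given an assumed rate parameter of phi = .8
#' dbinom(x = 16, size = 18, prob = .8)
#' }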
#'
#' The frequentist approach to statistics is based on the relative frequency
#' method of assigning probability values (Ellis, 1842). From this framework,
#' there are no probabilities for anything that does not have a relative
#' frequency (von Mises, 1957). In frequency theory, the \eqn{\phi} parameter
#' does not have a relative frequency, so it cannot have a probability
#' distribution. From a frequentist framework, a value for the binomial rate
#' parameter is \emph{assumed}, and there is a discrete distribution for the \eqn{n + 1}
#' outcomes for \eqn{x} from 0 to \eqn{n}. The discrete likelihood distribution
#' has relative frequency over repeated experiments. Thus, for the frequentist
#' approach, \eqn{x} is a random variable, and \eqn{\phi} is an unknown fixed
#' constant. Frequency theory thus deliberately eschews the idea of the binomial
#' rate parameter having a probability distribution. Laplace (1774) had
#' previously employed a Bayesian approach of treating the \eqn{\phi} parameter
#' as a random variable. Yet Ellis and other researchers within the frequentist
#' tradition deliberately rejected the Bayes/Laplace approach. For tests of a
#' null hypothesis of an assumed \eqn{\phi} value, the frequentist approach either
#' continues to assume the null hypothesis or it rejects the null hypothesis
#' depending on the likelihood of the observed data plus the likelihood of more
#' extreme unobserved outcomes. The confidence interval is the range of \eqn{\phi}
#' values where the null hypothesis of specific \eqn{\phi} values would be
#' retained given the observed data (Clopper & Pearson, 1934). However, the
#' frequentist confidence interval is not a probability interval since
#' population parameters cannot have a probability distribution with frequentist
#' methods. Frequentist statisticians were well aware (\emph{e.g.}, Pearson, 1920)
#' that if the \eqn{\phi} parameter had a distribution, then the Bayes/Laplace
#' approach would be correct.
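#'
#' For comparison with the Bayesian interval estimates produced by this
#' function, the frequentist Clopper-Pearson interval can be obtained from the
#' base R \code{binom.test()} function, whose confidence interval uses the
#' Clopper and Pearson (1934) procedure. The sketch below uses the same data as
#' the examples (16 category 1 and 2 category 2 responses):
#' \preformatted{
#' # frequentist 95 percent Clopper-Pearson confidence interval
#' binom.test(x = 16, n = 18, conf.level = .95)$conf.int
#' }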
#'
#' Bayesian statistics rejects the frequentist theoretical decisions as to what
#' are the fixed constants and what is the random variable that can take on a
#' range of values. From a Bayesian framework, probability is anything that
#' satisfies the Kolmogorov (1933) axioms, so probabilities need not be limited
#' to processes that have a relative frequency. Importantly, probability can be
#' a measure of information or knowledge provided that the probability
#' representation meets the Kolmogorov axioms (De Finetti, 1974). Given binomial
#' data, the population binomial rate parameter \eqn{\phi} is unknown, so it is
#' represented with a probability distribution for its possible values. This
#' assumed distribution is the prior distribution. Furthermore, the quantity \eqn{x}
#' for the likelihood distribution above is not a random variable once the
#' experiment has been conducted. If there are \eqn{n_1} outcomes for category 1
#' and \eqn{n_2 = n-n_1} outcomes in category 2, then these are fixed values.
#' While frequentist methods compute both the likelihood of the observed
#' outcome \emph{and} the likelihood for unobserved outcomes that are more
#' extreme, in Bayesian inference it is \emph{only} the likelihood of the observed
#' outcome that is computed. From the Bayesian perspective, the inclusion of
#' unobserved outcomes in the analysis violates the likelihood principle (Berger
#' & Wolpert, 1988). A number of investigators have found paradoxes with
#' frequentist procedures when the likelihood principle is not used (\emph{e.g.},
#' Lindley & Phillips, 1976; Chechile, 2020). The Bayesian practice of strictly
#' computing only the likelihood of the observed data produces the result that
#' the likelihood for the binomial is proportional to \eqn{\phi^{n_1}(1 - \phi)^{n_2}}.
#' In Bayesian statistics, the proportionality constant is not needed because it
#' appears in both the numerator and the denominator of Bayes theorem and thus
#' cancels. See Chechile (2020) for more extensive comparisons between
#' frequentist and Bayesian approaches with a particular focus on the binomial
#' model.
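#'
#' The proportionality noted above is easy to check numerically: for fixed
#' data, the full binomial likelihood and the kernel
#' \eqn{\phi^{n_1}(1 - \phi)^{n_2}} differ only by a constant factor. The
#' values in the sketch below are arbitrary and serve only as an illustration:
#' \preformatted{
#' phi <- c(.5, .7, .9)
#' dbinom(16, size = 18, prob = phi) / (phi^16 * (1 - phi)^2)
#' # the ratio is the same constant, choose(18, 16) = 153, for every phi
#' }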
#'
#' Given a beta distribution prior for the binomial \eqn{\phi} parameter, it has
#' been shown that the resulting posterior distribution from Bayes theorem is
#' another member of the beta family of distributions (Lindley & Phillips, 1976).
#' This property of the prior and posterior being in the same distributional
#' family is called \emph{conjugacy}. The beta distribution is a natural Bayesian
#' conjugate function for all Bernoulli processes where the likelihood is
#' proportional to \eqn{\phi^{n_1}(1 - \phi)^{n_2}} (Chechile, 2020).
#' The density function for a beta variate is
#' \deqn{f(x) = \begin{cases} Kx^{a-1}(1-x)^{b-1} & \quad \textrm{if } 0 \le x \le 1, \\0 & \quad \textrm{otherwise} \end{cases}}
#' where \deqn{K = \frac{\Gamma(a + b)}{\Gamma(a)\Gamma(b)}}
#' (Johnson, Kotz, & Balakrishnan, 1995). The two shape parameters \eqn{a} and \eqn{b}
#' must be positive values. If the beta prior shape parameters are \eqn{a_0} and \eqn{b_0},
#' then the posterior beta shape parameters are \eqn{a_{post} = a_0 + n_1} and
#' \eqn{b_{post} = b_0 + n_2}. The default prior for the \code{dfba_binomial()}
#' function is \code{a0 = b0 = 1}, which corresponds to the uniform prior.
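#'
#' The conjugate updating rule is simple enough to verify directly. The sketch
#' below uses the data from the examples (\code{n1 = 16}, \code{n2 = 2}) with
#' the default uniform prior; the mean of a beta distribution with shape
#' parameters \eqn{a} and \eqn{b} is \eqn{a/(a + b)}:
#' \preformatted{
#' n1 <- 16; n2 <- 2            # observed frequencies
#' a0 <- 1;  b0 <- 1            # uniform beta prior
#' a_post <- a0 + n1            # posterior first shape parameter
#' b_post <- b0 + n2            # posterior second shape parameter
#' a_post / (a_post + b_post)   # posterior mean for phi (0.85)
#' }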
#'
#' Thus, the Bayesian inference for the unknown binomial rate parameter \eqn{\phi}
#' is the posterior beta distribution with shape parameters of \code{a_post} and
#' \code{b_post}. The \code{dfba_binomial()} function calls the
#' \code{dfba_beta_descriptive()} function to find the centrality point estimates
#' (\emph{i.e.}, the mean, median, and mode) and to find two interval estimates
#' that contain the probability specified in the \code{prob_interval} argument.
#' One interval has equal-tail probabilities and the other interval is the
#' highest-density interval. Users can use the \code{dfba_beta_bayes_factor()}
#' function to test hypotheses about the \eqn{\phi} parameter.
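#'
#' The equal-tail interval, for instance, can also be recovered directly from
#' the beta quantile function. The sketch below assumes the posterior shape
#' parameters from the previous illustration (a beta distribution with shape
#' parameters 17 and 3) and the default \code{prob_interval} of .95:
#' \preformatted{
#' # 95 percent equal-tail interval for the posterior beta(17, 3) distribution
#' qbeta(c(.025, .975), shape1 = 17, shape2 = 3)
#' }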
#' @references
#'
#' Berger, J. O., & Wolpert, R. L. (1988). The Likelihood Principle (2nd ed.)
#' Hayward, CA: Institute of Mathematical Statistics.
#'
#' Chechile, R. A. (2020). Bayesian Statistics for Experimental Scientists: A
#' General Introduction Using Distribution-Free Statistics. Cambridge: MIT Press.
#'
#' Clopper, C. J., & Pearson, E. S. (1934). The use of confidence or fiducial
#' limits illustrated in the case of the binomial. Biometrika, 26, 404-413.
#'
#' De Finetti, B. (1974). Bayesianism: Its unifying role for both the
#' foundations and applications of statistics. International Statistical Review/
#' Revue Internationale de Statistique, 117-130.
#'
#' Ellis, R. L. (1842). On the foundations of the theory of probability.
#' Transactions of the Cambridge Philosophical Society, 8, 1-6.
#'
#' Johnson, N. L., Kotz S., and Balakrishnan, N. (1995). Continuous Univariate
#' Distributions, Vol. 1, New York: Wiley.
#'
#' Kolmogorov, A. N. (1933/1959). Grundbegriffe der Wahrscheinlichkeitsrechnung.
#' Berlin: Springer. English translation in 1959 as Foundations of the Theory of
#' Probability. New York: Chelsea.
#'
#' Laplace, P. S. (1774). Memoire sur la probabilite des causes par les
#' evenements. Oeuvres completes, 8, 5-24.
#'
#' Lindley, D. V., & Phillips, L. D. (1976). Inference for a Bernoulli process
#' (a Bayesian view). The American Statistician, 30, 112-119.
#'
#' Pearson, K. (1920). The fundamental problem of practical statistics.
#' Biometrika, 13(1), 1-16.
#'
#' von Mises, R. (1957). Probability, Statistics, and Truth. New York: Dover.
#'
#' @seealso
#'
#' \code{\link[stats:Distributions]{Distributions}} for details on the
#' functions included in the \strong{stats} package regarding the beta and the binomial
#' distributions.
#'
#' \code{\link{dfba_beta_bayes_factor}} for further documentation about the
#' Bayes factor and its interpretation.
#'
#' \code{\link{dfba_beta_descriptive}} for advanced Bayesian descriptive methods
#' for beta distributions.
#'
#' @examples
#' # Example using defaults of a uniform prior and 95% interval estimates
#' dfba_binomial(n1 = 16,
#'               n2 = 2)
#'
#' # Example with the Jeffreys prior and 99% interval estimates
#' dfba_binomial(n1 = 16,
#'               n2 = 2,
#'               a0 = .5,
#'               b0 = .5,
#'               prob_interval = .99)
#'
#' @export
dfba_binomial <- function(n1,
                          n2,
                          a0 = 1,
                          b0 = 1,
                          prob_interval = .95){
  # a0 and b0 have to be positive and finite
  if (a0 <= 0|
      a0 == Inf|
      b0 <= 0|
      b0 == Inf|
      is.na(a0)|
      is.na(b0)){
    stop("Both a0 and b0 must be positive and finite")
  }
  # interval width has to be between 0 and 1
  if (prob_interval >= 1|
      prob_interval <= 0){
    stop("prob_interval must be greater than 0 and less than 1")
  }
  # n's can't be negative
  if (n1 < 0|
      n2 < 0|
      is.na(n1)|
      is.na(n2)){
    stop("Neither n1 nor n2 can be negative")
  }
  # n's must be integers
  if (n1 != round(n1)|
      n2 != round(n2)){
    stop("n1 and n2 must be integers")
  }
  # Conjugate updating: posterior beta shape parameters
  a_post <- n1 + a0
  b_post <- n2 + b0
  # Posterior point estimates and interval estimates from the beta posterior
  out_bin <- dfba_beta_descriptive(a_post,
                                   b_post,
                                   prob_interval = prob_interval)
  bin_list <- list(n1 = n1,
                   n2 = n2,
                   a0 = a0,
                   b0 = b0,
                   prob_interval = prob_interval,
                   a_post = a_post,
                   b_post = b_post,
                   phimean = out_bin$x_mean,
                   phimedian = out_bin$x_median,
                   phimode = out_bin$x_mode,
                   eti_lower = out_bin$eti_lower,
                   eti_upper = out_bin$eti_upper,
                   hdi_lower = out_bin$hdi_lower,
                   hdi_upper = out_bin$hdi_upper)
  new("dfba_binomial_out", bin_list)
}