
#' Example data set of binary observations and probability forecasts
#' 
#' The forecasts X01 to X10 are generated in such a way that their discrimination ability
#' decreases steadily from X01 to X10. In addition, X01 and X06 are "calibrated", X02 and X07 are "underconfident",
#' X03 and X08 are "overconfident", X04 and X09 exhibit "negative bias", and X05 and X10 exhibit "positive bias".
#' 
#' 
#' @details
#' The observations are generated from a Bernoulli distribution, where the
#' success probability is determined by ten sources of information. That is,
#' the probability is given by
#' \deqn{p = \Phi(\sum_{i = 1}^{10} Z_i),}
#' where \eqn{Z_i}, \eqn{i = 1, \ldots, 10}, are independent standard Gaussian random
#' variables, and \eqn{\Phi} denotes the cumulative distribution function of the
#' standard Gaussian distribution.
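#' 
#' The observation step can be sketched in R as follows. This is only an
#' illustration: the seed, the sample size, and the names `n`, `Z`, `p` and
#' `y_sim` are assumptions and are not taken from the code that produced
#' `ex_binary`.
#' 
#' ```r
#' set.seed(42)                             # assumed seed, for reproducibility only
#' n <- 1000                                # assumed sample size
#' Z <- matrix(rnorm(n * 10), nrow = n)     # ten independent N(0, 1) information sources
#' p <- pnorm(rowSums(Z))                   # event probability Phi(Z_1 + ... + Z_10)
#' y_sim <- rbinom(n, size = 1, prob = p)   # Bernoulli observations with success probability p
#' ```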
#' 
#' The corresponding forecasts are named in decreasing order of access to these
#' latent Gaussian variables (that is, information content). In a first step,
#' calibrated forecasts are generated by
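#' \eqn{p[j] = \Phi(\frac{1}{\sqrt{j}}\sum_{i = j}^{10} Z_i)}.
#' The factor \eqn{1/\sqrt{j}} accounts for the variance \eqn{j - 1} of the
#' unobserved sum \eqn{\sum_{i = 1}^{j - 1} Z_i}, so that \eqn{p[j]} equals the
#' conditional event probability given \eqn{Z_j, \ldots, Z_{10}}.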
#' Subsequently, these probabilities are perturbed to introduce miscalibration
#' using the cumulative distribution function \eqn{F} of the beta distribution, yielding
#' the final forecasts
#' \deqn{X[j] = F(p[j]; a, b),}
#' where \eqn{a} and \eqn{b} are the positive shape parameters (see [pbeta()]).
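#' 
#' The effect of the shape parameters can be checked directly with [pbeta()].
#' The sketch below applies the parameter pairs listed under 'Format' to a
#' small, hypothetical grid of calibrated probabilities (`p_grid` is an
#' illustrative name, not part of the data set):
#' 
#' ```r
#' p_grid <- c(0.1, 0.3, 0.5, 0.7, 0.9)   # hypothetical calibrated probabilities
#' pbeta(p_grid, 1, 1)       # a = b = 1: identity map, forecasts stay calibrated
#' pbeta(p_grid, 1/4, 1/4)   # underconfident: values are pulled toward 0.5
#' pbeta(p_grid, 4, 4)       # overconfident: values are pushed toward 0 and 1
#' pbeta(p_grid, 5/3, 3/5)   # negative bias: values fall below p_grid
#' pbeta(p_grid, 3/5, 5/3)   # positive bias: values lie above p_grid
#' ```
#' 
#' With equal shape parameters the map is symmetric around 0.5, so only the
#' spread of the forecasts changes. With unequal parameters the whole curve
#' lies on one side of the identity, which shifts every forecast in the same
#' direction.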
#' 
#' @format
#' A data frame with 1,000 rows and 11 columns, generated as described in 'Details':
#' \describe{
#'   \item{y}{observations}
#'   \item{X01}{forecasts, full information, calibrated: \eqn{a = 1}, \eqn{b = 1}}
#'   \item{X02}{forecasts, less information than X01, underconfident: \eqn{a = 1/4}, \eqn{b = 1/4}}
#'   \item{X03}{forecasts, less information than X02, overconfident: \eqn{a = 4}, \eqn{b = 4}}
#'   \item{X04}{forecasts, less information than X03, negative bias: \eqn{a = 5/3}, \eqn{b = 3/5}}
#'   \item{X05}{forecasts, less information than X04, positive bias: \eqn{a = 3/5}, \eqn{b = 5/3}}
#'   \item{X06}{forecasts, less information than X05, calibrated: \eqn{a = 1}, \eqn{b = 1}}
#'   \item{X07}{forecasts, less information than X06, underconfident: \eqn{a = 1/4}, \eqn{b = 1/4}}
#'   \item{X08}{forecasts, less information than X07, overconfident: \eqn{a = 4}, \eqn{b = 4}}
#'   \item{X09}{forecasts, less information than X08, negative bias: \eqn{a = 5/3}, \eqn{b = 3/5}}
#'   \item{X10}{forecasts, least information, positive bias: \eqn{a = 3/5}, \eqn{b = 5/3}}
#' }
"ex_binary"