R/rootogram.R
In topmodels: Infrastructure for Forecasting and Assessment of Probabilistic Models

Documented in autoplot.rootogram c.rootogram plot.rootogram rootogram rootogram.default

# -------------------------------------------------------------------
# Programming outline: Rootogram
# -------------------------------------------------------------------
# - Observed y in-sample or out-of-sample (n x 1)
# - Breaks for observations (m x 1)
# - Predicted probabilities at breaks F_y(br1), ..., F_y(brm) (n x m)
#
# - Cut observations at breaks -> observed frequencies for (m-1) groups
# - Aggregate probabilities -> expected frequencies for (m-1) groups
# - Can be drawn in different style (standing vs. hanging / raw vs. sqrt)
#
# Functions:
# - rootogram() generic plus default method
# - Return object of class "rootogram" that is plotted by default
# - But has plot=FALSE so that suitable methods can be added afterwards
# - Methods: plot(), autoplot(), c()/rbind(), +
# -------------------------------------------------------------------


#' Rootograms for Assessing Goodness of Fit of Probability Models
#'
#' Rootograms graphically compare (square roots) of empirical frequencies with
#' expected (fitted) frequencies from a probabilistic model. If \code{plot = TRUE}, the
#' resulting object of class \code{"rootogram"} is plotted by
#' \code{\link{plot.rootogram}} or \code{\link{autoplot.rootogram}} before it is
#' returned, depending on whether the package \code{ggplot2} is loaded.
#'
#' Rootograms graphically compare frequencies of empirical distributions and
#' expected (fitted) probability models. For the observed distribution the histogram is
#' drawn on a square root scale (hence the name) and superimposed with a line
#' for the expected frequencies. The histogram can be \code{"hanging"} from the
#' expected curve (default), \code{"standing"} on the (like bars in barplot),
#' or drawn as a \code{"suspended"} histogram of deviations.
#'
#' The function \code{\link{rootogram}} leverages the \code{\link{procast}}
#' generic in order to compute all necessary coordinates based on observed and
#' expected (fitted) frequencies.
#'
#' In addition to the \code{plot} and \code{\link[ggplot2]{autoplot}} method for
#' rootogram objects, it is also possible to combine two (or more) rootograms by
#' \code{c}/\code{rbind}, which creates a set of rootograms that can then be
#' plotted in one go.
#'
#' @aliases rootogram rootogram.default c.rootogram rbind.rootogram
#'
#' @param object an object from which an rootogram can be extracted with
#' \code{\link{procast}}.
#' @param newdata an optional data frame in which to look for variables with
#' which to predict. If omitted, the original observations are used.
#' @param plot logical or character. Should the \code{plot} or \code{autoplot} method be called to
#' draw the computed extended reliability diagram? Logical
#' \code{FALSE} will suppress plotting, \code{TRUE} (default) will choose the
#' type of plot conditional if the package \code{ggplot2} is loaded.
#' Alternatively \code{"base"} or \code{"ggplot2"} can be specified to
#' explicitly choose the type of plot.
#' @param class should the invisible return value be either a \code{data.frame}
#' or a \code{tbl_df}. Can be set to \code{"data.frame"} or \code{"tibble"} to
#' explicitly specify the return class, or to \code{NULL} (default) in which
#' case the return class is conditional on whether the package \code{"tibble"}
#' is loaded.
#' @param response_type character. To set the default values for \code{breaks} and
#' \code{widths}.  Currently different defaults are available for \code{"discrete"}
#' and \code{"continous"} response distribution, as well as for the special case of a
#' \code{"logseries"} response.
#' @param breaks \code{NULL} (default) or numeric to manually specify the breaks for
#' the rootogram intervals. A single numeric (larger \code{0}) specifies the number of breaks
#' to be automatically chosen, multiple numeric values are interpreted as manually specified breaks.
#' @param width \code{NULL} (default) or single positive numeric. Width of the histogram bars.
#' @param style character specifying the syle of rootogram (see 'Details').
#' @param scale character specifying whether \code{"raw"} frequencies or their square
#' roots (\code{"sqrt"}; default) should be drawn.
#' @param expected logical or character. Should the expected (fitted) frequencies be plotted?
#' Can be set to \code{"both"} (same as \code{TRUE}; default), \code{"line"}, \code{"point"},
#' or \code{FALSE} which will suppress plotting.
#' @param confint logical, defaults to \code{TRUE}. Should confident intervals be drawn?
#' @param ref logical, defaults to \code{TRUE}. Should a reference line be plotted?
#' @param xlab,ylab,main graphical parameters forwarded to
#' \code{\link{plot.rootogram}} or \code{\link{autoplot.rootogram}}.
#' @param \dots further graphical parameters passed to the plotting function.
#'
#' @return An object of class \code{"rootogram"} inheriting from
#' \code{"data.frame"} or \code{"tibble"} conditional on the argument \code{class}
#' with the following variables:
#' \item{observed}{observed frequencies,}
#' \item{expected}{expected (fitted) frequencies,}
#' \item{mid}{histogram interval midpoints on the x-axis,}
#' \item{width}{widths of the histogram bars,}
#' \item{confint_lwr, confint_upr}{lower and upper confidence interval bound.} 
#'
#' Additionally, \code{style}, \code{scale}, \code{expected}, \code{confint},
#' \code{ref}, \code{xlab}, \code{ylab}, amd \code{main} are stored as attributes.
#'
#' @note Note that there is also a \code{\link[vcd]{rootogram}} function in the
#' \pkg{vcd} package that is similar to the \code{numeric} method provided
#' here. However, it is much more limited in scope, hence a function has been
#' created here.
#'
#' @seealso \code{\link{plot.rootogram}}, \code{\link{procast}}
#' @references Friendly M (2000), \emph{Visualizing Categorical Data}. SAS
#' Institute, Cary, ISBN 1580256600.
#'
#' Kleiber C, Zeileis A (2016).  \dQuote{Visualizing Count Data Regressions
#' Using Rootograms.} \emph{The American Statistician}, \bold{70}(3), 296--303.
#' \doi{10.1080/00031305.2016.1173590}.
#'
#' Tukey JW (1977). \emph{Exploratory Data Analysis}. Addison-Wesley, Reading,
#' ISBN 0201076160.
#'
#' @keywords hplot
#' @examples
#' ## plots and output
#'
#' ## number of deaths by horsekicks in Prussian army (Von Bortkiewicz 1898)
#' deaths <- rep(0:4, c(109, 65, 22, 3, 1))
#'
#' ## fit glm model
#' m1_pois <- glm(deaths ~ 1, family = poisson)
#' rootogram(m1_pois)
#'
#' ## inspect output (without plotting)
#' r1 <- rootogram(m1_pois, plot = FALSE)
#' r1
#'
#' ## combine plots
#' plot(c(r1, r1), col = c(1, 2), expected_col = c(1, 2))
#'
#'
#' #-------------------------------------------------------------------------------
#' ## different styles
#'
#' ## artificial data from negative binomial (mu = 3, theta = 2)
#' ## and Poisson (mu = 3) distribution
#' set.seed(1090)
#' y <- rnbinom(100, mu = 3, size = 2)
#' x <- rpois(100, lambda = 3)
#'
#' ## glm method: fitted values via glm()
#' m2_pois <- glm(y ~ x, family = poisson)
#'
#' ## correctly specified Poisson model fit
#' par(mfrow = c(1, 3))
#' r1 <- rootogram(m2_pois, style = "standing", ylim = c(-2.2, 4.8), main = "Standing")
#' r2 <- rootogram(m2_pois, style = "hanging", ylim = c(-2.2, 4.8), main = "Hanging")
#' r3 <- rootogram(m2_pois, style = "suspended", ylim = c(-2.2, 4.8), main = "Suspended")
#' par(mfrow = c(1, 1))
#'
#' #-------------------------------------------------------------------------------
#' ## linear regression with normal/Gaussian response: anorexia data
#'
#' data("anorexia", package = "MASS")
#'
#' m3_gauss <- glm(Postwt ~ Prewt + Treat + offset(Prewt), family = gaussian, data = anorexia)
#'
#' ## plot rootogram as "ggplot2" graphic
#' rootogram(m3_gauss, plot = "ggplot2")
#' @export
rootogram <- function(object, ...) {
  UseMethod("rootogram")
}


#' @rdname rootogram
#' @method rootogram default
#' @export
rootogram.default <- function(
                              ## computation arguments
                              object,
                              newdata = NULL,
                              plot = TRUE,
                              class = NULL,
                              response_type = NULL,
                              breaks = NULL,
                              width = NULL,

                              ## plotting arguments
                              style = c("hanging", "standing", "suspended"),
                              scale = c("sqrt", "raw"),
                              expected = TRUE,
                              confint = TRUE,
                              ref = TRUE,
                              xlab = NULL,
                              ylab = NULL,
                              main = NULL,
                              ...) {
  # -------------------------------------------------------------------
  # SET UP PRELIMINARIES
  # -------------------------------------------------------------------
  ## sanity checks
  ## * `object`, `newdata` w/i `newresponse()`
  ## * `expected`, `ref`, `confint`...` in `plot()` and `autoplot()`
  stopifnot(is.null(breaks) || (is.numeric(breaks) && length(breaks) > 0 && is.null(dim(breaks))))
  if (length(breaks) == 1) stopifnot(breaks >= 1)
  stopifnot(is.null(width) || (is.numeric(width) && length(width) == 1 && width > 0))
  stopifnot(is.null(xlab) || (length(xlab) == 1 && is.character(xlab)))
  stopifnot(is.null(ylab) || (length(ylab) == 1 && is.character(ylab)))
  stopifnot(is.null(main) || (length(main) == 1 && is.character(main)))
  stopifnot(isTRUE(plot) || isFALSE(plot) || (is.character(plot) && length(plot) == 1L))
  stopifnot(is.null(class) || (is.character(class) && length(class) == 1L))
  stopifnot(isTRUE(expected) || isFALSE(expected) || (is.character(expected) && length(expected) == 1))
  if (is.character(expected)) expected <- match.arg(expected, c("line", "point", "both"))
  stopifnot(isTRUE(confint) || isFALSE(confint))
  stopifnot(isTRUE(ref) || isFALSE(ref))

  ## match arguments
  scale <- match.arg(scale)
  style <- match.arg(style)

  ## default annotation
  if (is.null(xlab)) {
    xlab <- as.character(attr(terms(object), "variables"))[2L]
  }
  if (is.null(ylab)) {
    ylab <- if (scale == "raw") "Frequency" else "sqrt(Frequency)"
  }
  if (is.null(main)) {
    main <- deparse(substitute(object))
  }

  ## guess plotting flavor
  if (is.logical(plot)) {
      plot <- ifelse(isFALSE(plot), "none", if ("ggplot2" %in% .packages()) "ggplot2" else "base")
  }
  plot <- try(match.arg(plot, c("none", "base", "ggplot2")), silent = TRUE)
  if (inherits(plot, "try-error"))
    stop("`plot` must be logical `TRUE`/`FALSE` or one of \"none\", \"base\", or \"ggplot2\"")

  ## guess output class
  if (is.null(class)) {
    class <- if ("tibble" %in% .packages()) "tibble" else "data.frame"
  }
  class <- try(match.arg(class, c("tibble", "data.frame")), silent = TRUE)
  if (inherits(class, "try-error"))
    stop("`class` must be `NULL` or one of \"tibble\", \"data.frame\"")

  # -------------------------------------------------------------------
  # PREPARE DATA
  # -------------------------------------------------------------------
  ## get data and weights
  y <- newresponse(object, newdata = newdata, na.action = na.pass)
  w <- attr(y, "weights")
  if (is.null(response_type)) response_type <- attr(y, "response_type")
  response_type <- match.arg(response_type, c("discrete", "logseries", "continuous"))

  ## set breaks and midpoints
  ## TODO: (ML) Extend breaks to the left, in case still expected frequency exists.
  ## TODO: (Z) Try to get rid of 'response_type'.
  ## TODO: (RS) Thought the very same while going trough args und sanity checks :).
  if (is.null(breaks) && response_type == "discrete") {
    breaks <- -1L:max(y[w > 0]) + 0.5
  } else if (is.null(breaks) && response_type == "logseries") {
    breaks <- 0L:max(y[w > 0]) + 0.5
  } else if (is.null(breaks)) {
    breaks <- "Sturges"
  }

  breaks <- hist(y[w > 0], plot = FALSE, breaks = breaks)$breaks
  mid <- (head(breaks, -1L) + tail(breaks, -1L)) / 2

  ## fix pointmasses
  ## TODO: (ML) Check if that always works or could be improved.
  breaks[1] <- breaks[1] - 1e-12
  breaks[length(breaks)] <- breaks[length(breaks)] + 1e-12

  ## set widths
  ## TODO: (RS2ML) Question: if NULL and discrete/logseries it is set to 0.9,
  ##       if NULL to 1. What else? There is no default fallback.
  ##       There should be a dedicated 'else'.
  if (is.null(width) && (response_type == "discrete" || response_type == "logseries")) {
    width <- 0.9
  } else if (is.null(width)) {
    width <- 1
  }

  # -------------------------------------------------------------------
  # COMPUTATION OF EXPECTED AND OBSERVED FREQUENCIES
  # -------------------------------------------------------------------
  ## expected frequencies (part1)
  p <- procast(object, newdata = newdata, na.action = na.pass, type = "probability",
    drop = FALSE, at = breaks, elementwise = FALSE)
  p <- p[, -1L, drop = FALSE] - p[, -ncol(p), drop = FALSE]

  ## TODO: (ML) Sometime neg. expected occurs (see example for underdispersed model fit).
  p[abs(p) < sqrt(.Machine$double.eps)] <- 0

  ## handle NAs
  ## TODO: (ML) Maybe allow arg `na.action` in the future.
  idx_not_na <- as.logical(complete.cases(y) * complete.cases(p))
  y <- y[idx_not_na]
  p <- p[idx_not_na, ]
  w <- w[idx_not_na]

  ## observed frequencies
  val_observed <- as.vector(xtabs(w ~ cut(y, breaks, include.lowest = TRUE)))

  ## expected frequencies (part2)
  val_expected <- colSums(p * w)

  # -------------------------------------------------------------------
  # OUTPUT AND OPTIONAL PLOTTING
  # -------------------------------------------------------------------
  ## collect everything as data.frame
  rval <- data.frame(
    observed = if (scale == "raw") as.vector(val_observed) else sqrt(as.vector(val_observed)),
    expected = if (scale == "raw") as.vector(val_expected) else sqrt(as.vector(val_expected)),
    mid = mid,
    width = diff(breaks) * width
  )

  ## add attributes
  attr(rval, "style") <- style
  attr(rval, "scale") <- scale
  attr(rval, "expected") <- expected
  attr(rval, "confint") <- confint
  attr(rval, "ref") <- ref
  attr(rval, "xlab") <- xlab
  attr(rval, "ylab") <- ylab
  attr(rval, "main") <- main

  ## set class to data.frame or tibble
  if (class == "data.frame") {
    class(rval) <- c("rootogram", "data.frame")
  } else {
    rval <- tibble::as_tibble(rval)
    class(rval) <- c("rootogram", class(rval))
  }

  ## plot by default
  if (plot == "ggplot2") {
    try(print(ggplot2::autoplot(rval, ...)))
  } else if (plot == "base") {
    try(plot(rval, ...))
  }

  ## return invisibly
  invisible(rval)
}


#' @export
c.rootogram <- function(...) {
  # -------------------------------------------------------------------
  # GET DATA
  # -------------------------------------------------------------------
  ## list of rootograms
  rval <- list(...)

  ## set class to tibble if any rval is a tibble
  if (any(do.call("c", lapply(rval, class)) %in% "tbl")) {
    class <- "tibble"
  } else {
    class <- "data.frame"
  }

  ## remove temporary the class (needed below for `c()`)
  ## TODO: (ML) Rewrite by, e.g., employing `lapply()`.
  for (i in 1:length(rval)) class(rval[[i]]) <- class(rval[[i]])[!class(rval[[i]]) %in% "rootogram"]

  ## convert always to data.frame
  if (class == "tibble") {
    rval <- lapply(rval, as.data.frame)
  }

  ## group sizes
  for (i in seq_along(rval)) {
    if (is.null(rval[[i]]$group)) rval[[i]]$group <- 1L
  }
  n <- lapply(rval, function(r) table(r$group))

  # -------------------------------------------------------------------
  # PREPARE DATA
  # -------------------------------------------------------------------
  ## labels
  xlab <- prepare_arg_for_attributes(rval, "xlab")
  ylab <- prepare_arg_for_attributes(rval, "ylab")
  nam <- names(rval)
  main <- if (is.null(nam)) {
    prepare_arg_for_attributes(rval, "main")
  } else {
    make.unique(rep.int(nam, sapply(n, length)))
  }

  ## parameters
  style <- prepare_arg_for_attributes(rval, "style", force_single = TRUE)
  scale <- prepare_arg_for_attributes(rval, "scale", force_single = TRUE)
  expected <- prepare_arg_for_attributes(rval, "expected")
  ref <- prepare_arg_for_attributes(rval, "ref")
  confint <- prepare_arg_for_attributes(rval, "confint")

  ## fix `ylabel` according to possible new `scale`
  if (scale == "sqrt") {
    ylab[grepl("^Frequency$", ylab)] <- "sqrt(Frequency)"
  } else if (scale == "raw") {
    ylab[grepl("^sqrt\\(Frequency\\)$", ylab)] <- "Frequency"
  }

  # -------------------------------------------------------------------
  # RETURN DATA
  # -------------------------------------------------------------------
  ## get all names needed if extended object should be computed and for combination
  all_names <- unique(unlist(lapply(rval, names)))

  ## get both objects on the same scale
  if (any(grepl("ymin|ymax|cofint_lwr|confint_upr", all_names))) {
    rval <- lapply(rval, summary.rootogram, style = style, scale = scale, extend = TRUE)
  } else {
    rval <- lapply(rval, summary.rootogram, style = style, scale = scale, extend = FALSE)
  }
  rval <- lapply(rval, as.data.frame) # remove inner class

  ## combine and return (fill up missing variables with NAs)
  rval <- do.call(
    "rbind.data.frame",
    c(lapply(
      rval,
      function(x) {
        data.frame(c(x, sapply(setdiff(all_names, names(x)), function(y) NA)))
      }
    ),
    make.row.names = FALSE
    )
  )

  ## add group
  n <- unlist(n)
  rval$group <- if (length(n) < 2L) NULL else rep.int(seq_along(n), n)

  ## add attributes
  attr(rval, "style") <- style
  attr(rval, "scale") <- scale
  attr(rval, "expected") <- expected
  attr(rval, "confint") <- confint
  attr(rval, "ref") <- ref
  attr(rval, "xlab") <- xlab
  attr(rval, "ylab") <- ylab
  attr(rval, "main") <- main

  ## set class to data.frame or tibble
  if (class == "data.frame") {
    class(rval) <- c("rootogram", "data.frame")
  } else {
    rval <- tibble::as_tibble(rval)
    class(rval) <- c("rootogram", class(rval))
  }

  ## return
  return(rval)
}


#' @export
rbind.rootogram <- c.rootogram


#' S3 Methods for Plotting Rootograms
#'
#' Generic plotting functions for rootograms of the class \code{"rootogram"}
#' computed by \code{link{rootogram}}.
#'
#' Rootograms graphically compare (square roots) of empirical frequencies with
#' expected (fitted) frequencies from a probability model.
#'
#' Rootograms graphically compare frequencies of empirical distributions and
#' expected (fitted) probability models. For the observed distribution the histogram is
#' drawn on a square root scale (hence the name) and superimposed with a line
#' for the expected frequencies. The histogram can be \code{"standing"} on the
#' x-axis (as usual), or \code{"hanging"} from the expected (fitted) curve, or a
#' \code{"suspended"} histogram of deviations can be drawn.
#'
#' @aliases plot.rootogram autoplot.rootogram
#'
#' @param x,object an object of class \code{\link{rootogram}}.
#' @param style character specifying the syle of rootogram.
#' @param scale character specifying whether raw frequencies or their square
#' roots (default) should be drawn.
#' @param expected Should the expected (fitted) frequencies be plotted?
#' @param ref logical. Should a reference line be plotted?
#' @param confint logical. Should confident intervals be drawn?
#' @param confint_level numeric. The confidence level required.
#' @param confint_type character. Should \code{"pointwise"} or \code{"simultaneous"} confidence intervals be visualized. 
#' @param confint_nrep numeric. The repetition number of simulation for computing the confidence intervals.
#' @param xlim,ylim,xlab,ylab,main,axes,box graphical parameters.
#' @param col,border,lwd,lty,alpha_min graphical parameters for the histogram style part of the base plot.
#' @param colour,fill,size,linetype,alpha graphical parameters for the histogram style part in the \code{autoplot}.
#' @param legend logical. Should a legend be added in the \code{ggplot2} style
#' graphic?
#' @param theme Which `ggplot2` theme should be used. If not set, \code{\link[ggplot2]{theme_bw}} is employed.
#' @param expected_col,expected_pch,expected_lty,expected_lwd,ref_col,ref_lty,ref_lwd,expected_colour,expected_size,expected_linetype,expected_alpha,expected_fill,expected_stroke,expected_shape,ref_colour,ref_size,ref_linetype,ref_alpha,confint_col,confint_lty,confint_lwd,confint_colour,confint_size,confint_linetype,confint_alpha Further graphical parameters for the `expected` and `ref` line using either \code{\link[ggplot2]{autoplot}} or \code{plot}.
#' @param \dots further graphical parameters passed to the plotting function.
#' @seealso \code{\link{rootogram}}, \code{\link{procast}}
#' @references Friendly M (2000), \emph{Visualizing Categorical Data}. SAS
#' Institute, Cary.
#'
#' Kleiber C, Zeileis A (2016).  \dQuote{Visualizing Count Data Regressions
#' Using Rootograms.} \emph{The American Statistician}, \bold{70}(3), 296--303.
#' c("\\Sexpr[results=rd,stage=build]{tools:::Rd_expr_doi(\"#1\")}",
#' "10.1080/00031305.2016.1173590")\Sexpr{tools:::Rd_expr_doi("10.1080/00031305.2016.1173590")}.
#'
#' Tukey JW (1977). \emph{Exploratory Data Analysis}. Addison-Wesley, Reading.
#' @keywords hplot
#' @examples
#'
#' ## speed and stopping distances of cars
#' m1_lm <- lm(dist ~ speed, data = cars)
#'
#' ## compute and plot rootogram
#' rootogram(m1_lm)
#'
#' ## customize colors
#' rootogram(m1_lm, ref_col = "blue", lty = 2, pch = 20)
#'
#' #-------------------------------------------------------------------------------
#' if (require("crch")) {
#'
#'   ## precipitation observations and forecasts for Innsbruck
#'   data("RainIbk", package = "crch")
#'   RainIbk <- sqrt(RainIbk)
#'   RainIbk$ensmean <- apply(RainIbk[, grep("^rainfc", names(RainIbk))], 1, mean)
#'   RainIbk$enssd <- apply(RainIbk[, grep("^rainfc", names(RainIbk))], 1, sd)
#'   RainIbk <- subset(RainIbk, enssd > 0)
#'
#'   ## linear model w/ constant variance estimation
#'   m2_lm <- lm(rain ~ ensmean, data = RainIbk)
#'
#'   ## logistic censored model
#'   m2_crch <- crch(rain ~ ensmean | log(enssd), data = RainIbk, left = 0, dist = "logistic")
#'
#'   ### compute rootograms FIXME
#'   #r2_lm <- rootogram(m2_lm, plot = FALSE)
#'   #r2_crch <- rootogram(m2_crch, plot = FALSE)
#'
#'   ### plot in single graph
#'   #plot(c(r2_lm, r2_crch), col = c(1, 2))
#' }
#'
#' #-------------------------------------------------------------------------------
#' ## determinants for male satellites to nesting horseshoe crabs
#' data("CrabSatellites", package = "countreg")
#'
#' ## linear poisson model
#' m3_pois <- glm(satellites ~ width + color, data = CrabSatellites, family = poisson)
#'
#' ## compute and plot rootogram as "ggplot2" graphic
#' rootogram(m3_pois, plot = "ggplot2")
#'
#' #-------------------------------------------------------------------------------
#' ## artificial data from negative binomial (mu = 3, theta = 2)
#' ## and Poisson (mu = 3) distribution
#' set.seed(1090)
#' y <- rnbinom(100, mu = 3, size = 2)
#' x <- rpois(100, lambda = 3)
#'
#' ## glm method: fitted values via glm()
#' m4_pois <- glm(y ~ x, family = poisson)
#'
#' ## correctly specified Poisson model fit
#' par(mfrow = c(1, 3))
#' r4a_pois <- rootogram(m4_pois, style = "standing", ylim = c(-2.2, 4.8), main = "Standing")
#' r4b_pois <- rootogram(m4_pois, style = "hanging", ylim = c(-2.2, 4.8), main = "Hanging")
#' r4c_pois <- rootogram(m4_pois, style = "suspended", ylim = c(-2.2, 4.8), main = "Suspended")
#' par(mfrow = c(1, 1))
#' @export
plot.rootogram <- function(x,
                           style = NULL,
                           scale = NULL,
                           expected = NULL,
                           ref = NULL,
                           confint = NULL,
                           confint_level = 0.95,
                           confint_type = c("pointwise", "simultaneous"),
                           confint_nrep = 1000,
                           xlim = c(NA, NA),
                           ylim = c(NA, NA),
                           xlab = NULL,
                           ylab = NULL,
                           main = NULL,
                           axes = TRUE,
                           box = FALSE,
                           col = "darkgray",
                           border = "black",
                           lwd = 1,
                           lty = 1,
                           alpha_min = 0.8,
                           expected_col = 2,
                           expected_pch = 19,
                           expected_lty = 1,
                           expected_lwd = 2,
                           confint_col = "black",
                           confint_lty = 2,
                           confint_lwd = 1.75,
                           ref_col = "black",
                           ref_lty = 1,
                           ref_lwd = 1.25,
                           ...) {
  # -------------------------------------------------------------------
  # SET UP PRELIMINARIES
  # -------------------------------------------------------------------
  ## check if ylab is defined
  ylab_missing <- missing(ylab)

  ## get default arguments
  style <- use_arg_from_attributes(x, "style", default = "hanging", force_single = TRUE)
  scale <- use_arg_from_attributes(x, "scale", default = "sqrt", force_single = TRUE)
  expected <- use_arg_from_attributes(x, "expected", default = TRUE, force_single = FALSE)
  ref <- use_arg_from_attributes(x, "ref", default = TRUE, force_single = FALSE)
  confint <- use_arg_from_attributes(x, "confint", default = TRUE, force_single = FALSE)

  ## sanity checks
  ## * lengths of most arguments are checked by recycling
  ## * graphical parameters are checked w/i function calls for plotting
  stopifnot(is.logical(ref))
  stopifnot(is.logical(axes))
  stopifnot(is.logical(box))
  stopifnot(all(sapply(xlim, function(x) is.numeric(x) || is.na(x))))
  stopifnot(all(sapply(ylim, function(x) is.numeric(x) || is.na(x))))
  stopifnot(is.logical(confint))
  stopifnot(
    is.numeric(confint_level),
    length(confint_level) == 1,
    confint_level >= 0,
    confint_level <= 1
  )
  stopifnot(
    is.numeric(confint_nrep),
    length(confint_nrep) == 1,
    confint_nrep >= 0
  )

  ## match arguments
  scale <- match.arg(scale, c("sqrt", "raw"))
  style <- match.arg(style, c("hanging", "standing", "suspended"))
  confint_type <- match.arg(confint_type)

  ## extend input object on correct scale (compute heights, ...)
  x <- summary(
    x, 
    scale = scale, 
    style = style,
    confint_level = confint_level,
    confint_type = confint_type,
    confint_nrep = confint_nrep
  )

  ## convert always to data.frame
  x <- as.data.frame(x)

  ## handling of groups
  if (is.null(x$group)) x$group <- 1L
  n <- max(x$group)

  # -------------------------------------------------------------------
  # PREPARE AND DEFINE ARGUMENTS FOR PLOTTING
  # -------------------------------------------------------------------
  ## determine in which style the `expected` should be plotted
  expected[expected == FALSE | expected == "FALSE"] <- "none"
  expected[expected == TRUE | expected == "TRUE"] <- "both"
  expected <- c("none", "b", "l", "p")[match(expected, c("none", "both", "line", "point"))]
  stopifnot(all(expected %in% c("none", "b", "l", "p")))

  ## prepare xlim and ylim
  if (is.list(xlim)) xlim <- as.data.frame(do.call("rbind", xlim))
  if (is.list(ylim)) ylim <- as.data.frame(do.call("rbind", ylim))

  ## recycle arguments for plotting to match the number of groups
  plot_arg <- data.frame(1:n, expected, ref, confint,
    xlim1 = xlim[[1]], xlim2 = xlim[[2]], ylim1 = ylim[[1]], ylim2 = ylim[[2]],
    axes, box,
    border, col, lwd, lty, alpha_min,
    expected_col, expected_pch, expected_lty, expected_lwd, 
    ref_col, ref_lty, ref_lwd,
    confint_col, confint_lty, confint_lwd
  )[, -1]

  ## prepare annotation
  xlab <- use_arg_from_attributes(x, "xlab", default = "Rootogram", force_single = FALSE)
  ylab <- use_arg_from_attributes(x, "ylab",
    default = if (scale == "raw") "Frequency" else "sqrt(Frequency)", force_single = FALSE
  )
  main <- use_arg_from_attributes(x, "main", default = "model", force_single = FALSE)

  ## fix `ylab` according to possible new `freq`
  if (ylab_missing && scale == "sqrt") {
    ylab[grepl("^Frequency$", ylab)] <- "sqrt(Frequency)"
  } else if (ylab_missing && scale == "raw") {
    ylab[grepl("^sqrt\\(Frequency\\)$", ylab)] <- "Frequency"
  }

  # -------------------------------------------------------------------
  # PREPARE DATA FOR PLOTTING
  # -------------------------------------------------------------------

  # -------------------------------------------------------------------
  # MAIN PLOTTING FUNCTION
  # -------------------------------------------------------------------
  rootogram_plot <- function(d, ...) {

    ## get group index
    j <- unique(d$group)

    ## rect elements
    ybottom <- d$ymin
    ytop <- d$ymax
    xleft <- d$mid - d$width / 2
    xright <- d$mid + d$width / 2

    ## step elements (only needed for expected, confint)
    z <- compute_breaks(d$mid, d$width, offset = TRUE)

    ## get xlim and ylim
    ylim_idx <- c(is.na(plot_arg$ylim1[j]), is.na(plot_arg$ylim2[j]))
    xlim_idx <- c(is.na(plot_arg$xlim1[j]), is.na(plot_arg$xlim2[j]))
    if (any(xlim_idx)) {
      plot_arg[j, c("xlim1", "xlim2")[xlim_idx]] <- range(c(xleft, xright))[xlim_idx]
    }
    if (any(ylim_idx)) {
      ylim_use <- rep(
        c(TRUE, TRUE, plot_arg$expected[j] != "none", plot_arg$confint[j], plot_arg$confint[j]),
          each = NROW(d)
      )
      plot_arg[j, c("ylim1", "ylim2")[ylim_idx]] <- range(
        c(d$ymin, d$ymax, d$expected, d$confint_lwr, d$confint_upr)[ylim_use],
        na.rm = TRUE
      )[ylim_idx]
    }

    ## trigger plot
    plot(0, 0,
      type = "n", xlim = c(plot_arg$xlim1[j], plot_arg$xlim2[j]),
      ylim = c(plot_arg$ylim1[j], plot_arg$ylim2[j]),
      xlab = xlab[j], ylab = ylab[j], main = main[j], axes = FALSE, ...
    )
    if (plot_arg$axes[j]) {
      axis(1)
      axis(2)
    }
    if (plot_arg$box[j]) {
      box()
    }

    ## plot rootogram
    rect(xleft, ybottom, xright, ytop,
      col = set_minimum_transparency(plot_arg$col[j], alpha_min = plot_arg$alpha_min[j]),
      border = plot_arg$border[j],
      lty = plot_arg$lty[j],
      lwd = plot_arg$lwd[j]
    )

    ## add ref line
    if (plot_arg$ref[j]) {
      abline(
        h = 0,
        col = plot_arg$ref_col[j],
        lty = plot_arg$ref_lty[j],
        lwd = plot_arg$ref_lwd[j]
      )
    }

    ## add expected line
    if (plot_arg$expected[j] != "none") {
      lines(
        d$mid,
        d$expected,
        col =  plot_arg$expected_col[j],
        pch = plot_arg$expected_pch[j],
        type = plot_arg$expected[j],
        lty = plot_arg$expected_lty[j],
        lwd = plot_arg$expected_lwd[j]
      )
    }

    if (plot_arg$confint[j]) {
      ## lower confint line
      confint_lwr_y <- duplicate_last_value(d$confint_lwr)
      lines(
        confint_lwr_y ~ z,
        type = "s",
        col = plot_arg$confint_col[j],
        lty = plot_arg$confint_lty[j],
        lwd = plot_arg$confint_lwd[j]
      )

      ## upper confint line
      confint_upr_y <- duplicate_last_value(d$confint_upr)
      lines(
        confint_upr_y ~ z,
        type = "s",
        col = plot_arg$confint_col[j],
        lty = plot_arg$confint_lty[j],
        lwd = plot_arg$confint_lwd[j]
      )
    }
  }

  # -------------------------------------------------------------------
  # DRAW PLOTS
  # -------------------------------------------------------------------
  ## set up necessary panels
  if (n > 1L) {
    old_pars <- par(mfrow = n2mfrow(n))
    on.exit(par(old_pars), add = TRUE)
  }

  ## draw rootograms
  for (i in 1L:n) rootogram_plot(x[x$group == i, ], ...)
}


#' @rdname plot.rootogram
#' @method autoplot rootogram
#' @exportS3Method ggplot2::autoplot rootogram
autoplot.rootogram <- function(object,
                               style = NULL,
                               scale = NULL,
                               expected = NULL,
                               ref = NULL,
                               confint = NULL,
                               confint_level = 0.95,
                               confint_type = c("pointwise", "simultaneous"),
                               confint_nrep = 1000,
                               xlim = c(NA, NA),
                               ylim = c(NA, NA),
                               xlab = NULL,
                               ylab = NULL,
                               main = NULL,
                               legend = FALSE,
                               theme = NULL,
                               colour = "black",
                               fill = "darkgray",
                               size = 0.5,
                               linetype = 1,
                               alpha = NA,
                               expected_colour = 2,
                               expected_size = 1,
                               expected_linetype = 1,
                               expected_alpha = 1,
                               expected_fill = NA,
                               expected_stroke = 0.5,
                               expected_shape = 19,
                               confint_colour = "black",
                               confint_size = 0.5,
                               confint_linetype = 2,
                               confint_alpha = NA,
                               ref_colour = "black",
                               ref_size = 0.5,
                               ref_linetype = 1,
                               ref_alpha = NA,
                               ...) {
  # -------------------------------------------------------------------
  # SET UP PRELIMINARIES
  # -------------------------------------------------------------------
  ## check if ylab is defined
  ylab_missing <- missing(ylab)

  ## get default arguments
  style <- use_arg_from_attributes(object, "style", default = "hanging", force_single = TRUE)
  scale <- use_arg_from_attributes(object, "scale", default = "sqrt", force_single = TRUE)
  expected <- use_arg_from_attributes(object, "expected", default = TRUE, force_single = TRUE)
  ref <- use_arg_from_attributes(object, "ref", default = TRUE, force_single = TRUE)
  confint <- use_arg_from_attributes(object, "confint", default = TRUE, force_single = TRUE)
  xlab <- use_arg_from_attributes(object, "xlab", default = "Rootogram", force_single = TRUE)
  ylab <- use_arg_from_attributes(object, "ylab",
    default = if (scale == "raw") "Frequency" else "sqrt(Frequency)", force_single = TRUE
  )

  ## fix `ylabel` according to possible new `scale`
  if (ylab_missing) {
    if (scale == "sqrt" && grepl("^Frequency$", ylab)) ylab <- "sqrt(Frequency)"
    if (scale == "raw" && grepl("^sqrt\\(Frequency\\)$", ylab)) ylab <- "Frequency"
  }

  ## get base style arguments
  add_arg <- list(...)
  if (!is.null(add_arg$lwd)) size <- add_arg$lwd
  if (!is.null(add_arg$lty)) linetype <- add_arg$lty

  ## sanity checks
  stopifnot(is.logical(ref))
  stopifnot(all(sapply(xlim, function(x) is.numeric(x) || is.na(x))))
  stopifnot(all(sapply(ylim, function(x) is.numeric(x) || is.na(x))))
  stopifnot(is.logical(confint))
  stopifnot(
    is.numeric(confint_level),
    length(confint_level) == 1,
    confint_level >= 0,
    confint_level <= 1
  )
  stopifnot(
    is.numeric(confint_nrep),
    length(confint_nrep) == 1,
    confint_nrep >= 0
  )

  ## match arguments
  confint_type <- match.arg(confint_type)

  ## get line style for expected
  if (isFALSE(expected)) {
    expected <- "none"
  } else if (isTRUE(expected)) {
    expected <- "both"
  }
  expected <- match.arg(expected, c("none", "both", "line", "point"))

  ## transform to correct scale
  object <- summary.rootogram(object, scale = scale, style = style, extend = FALSE)

  ## convert data always to data.frame
  object <- as.data.frame(object)

  ## determine grouping
  if (is.null(object$group)) object$group <- 1L
  n <- max(object$group)

  ## get title (must be done before handling of `main`)
  if (!is.null(main)) {
    title <- main[1]
    object$title <- factor(title)
  }

  ## get main and transform to the right length (must be done after handling of `title`)
  main <- use_arg_from_attributes(object, "main", default = "model", force_single = FALSE)
  stopifnot(is.character(main))
  main <- make.names(rep_len(main, n), unique = TRUE)

  ## prepare grouping
  object$group <- factor(object$group, levels = 1L:n, labels = main)

  # -------------------------------------------------------------------
  # PREPARE AND DEFINE ARGUMENTS FOR PLOTTING
  # -------------------------------------------------------------------
  ## recycle arguments for plotting to match the number of groups (for `scale_<...>_manual()`)
  plot_arg <- data.frame(
    1:n,
    colour, fill, size, linetype, alpha
  )[, -1]

  # -------------------------------------------------------------------
  # MAIN PLOTTING
  # -------------------------------------------------------------------
  ## actual plotting
  rval <- ggplot2::ggplot(
    object,
    ggplot2::aes_string(
      observed = "observed",
      expected = "expected",
      mid = "mid",
      width = "width",
      group = "group"
    )
  )

  ## add rootogram
  rval <- rval +
    geom_rootogram(
      ggplot2::aes_string(
        colour = "group",
        fill = "group",
        size = "group",
        linetype = "group",
        alpha = "group"
      ),
      scale = "raw",
      style = style
    )

  ## add expected line
  if (expected != "none") {
    rval <- rval +
      geom_rootogram_expected(
        scale = "raw",
        linestyle = expected,
        colour = expected_colour,
        size = expected_size,
        linetype = expected_linetype,
        alpha = expected_alpha,
        fill = expected_fill,
        stroke = expected_stroke,
        shape = expected_shape,
      )
  }

  ## add ref
  if (ref) {
    rval <- rval +
      geom_rootogram_ref(
        colour = ref_colour,
        size = ref_size,
        linetype = ref_linetype,
        alpha = ref_alpha,
      )
  }

  ## add confint
  if (confint) { 
    rval <- rval +
      geom_rootogram_confint(
        level = confint_level,
        type = confint_type,
        scale = scale,
        rootogram_style = style,
        colour = confint_colour,
        size = confint_size,
        linetype = confint_linetype,
        alpha = confint_alpha
      )
  }

  ## set the colors, shapes, etc.
  rval <- rval +
    ggplot2::scale_colour_manual(values = plot_arg$colour) +
    ggplot2::scale_fill_manual(values = plot_arg$fill) +
    ggplot2::scale_size_manual(values = plot_arg$size) +
    ggplot2::scale_linetype_manual(values = plot_arg$linetype) +
    ggplot2::scale_alpha_manual(values = plot_arg$alpha)

  ## annotation
  rval <- rval + ggplot2::xlab(xlab) + ggplot2::ylab(ylab)

  ## add legend
  if (legend) {
    rval <- rval +
      ggplot2::labs(
        colour = "Model",
        fill = "Model",
        size = "Model",
        linetype = "Model",
        alpha = "Model"
      ) +
      ggplot2::guides(
        colour = "legend",
        fill = "legend",
        size = "legend",
        linetype = "legend",
        alpha = "legend"
      )
  } else {
    rval <- rval +
      ggplot2::guides(
        colour = "none",
        fill = "none",
        size = "none",
        linetype = "none",
        alpha = "none"
      )
  }

  ## set x and y limits
  rval <- rval +
    ggplot2::coord_cartesian(xlim = xlim, ylim = ylim, expand = TRUE) +
    ggplot2::scale_x_continuous(expand = c(0.01, 0.01)) +
    ggplot2::scale_y_continuous(expand = c(0.01, 0.01))

  ## set ggplot2 theme
  if (is.character(theme)) {
    theme_tmp <- try(eval(parse(text = theme)), silent = TRUE)
    if (inherits(theme_tmp, "try-error") && !grepl("^ggplot2::", theme)) {
      theme_tmp <- try(eval(parse(text = paste0("ggplot2::", theme))), silent = TRUE)
    }
    theme <- theme_tmp
    if (!is.function(theme)) {
        warning("The argument `theme` must be a `ggplot2` theme, a theme-generating function or a valid 'character string'.", call. = FALSE)
        theme <- ggplot2::theme_bw()
    }
  }

  if (is.function(theme)) {
    theme <- try(theme(), silent = TRUE)
    if (inherits(theme, "try-error") || !inherits(theme, "theme")) {
        warning("The argument `theme` must be a `ggplot2` theme, a theme-generating function or a valid 'character string'.", call. = FALSE)
        theme <- ggplot2::theme_bw()
    }
  }

  if (inherits(theme, "theme")) {
    rval <- rval + theme
  } else if (isTRUE(all.equal(ggplot2::theme_get(), ggplot2::theme_gray()))) {
    rval <- rval + ggplot2::theme_bw()
  }

  # -------------------------------------------------------------------
  # GROUPING (IF ANY) AND RETURN PLOT
  # -------------------------------------------------------------------
  ## grouping
  if (n > 1L) {
    rval <- rval + ggplot2::facet_grid(group ~ .)
  } else if (!is.null(object$title)) {
    rval <- rval + ggplot2::facet_wrap(title ~ .)
  }

  ## return ggplot object
  rval
}


# -------------------------------------------------------------------
# GGPLOT2 IMPLEMENTATIONS FOR `geom_rootogram()`
# -------------------------------------------------------------------

#' @rdname geom_rootogram
#' @export
stat_rootogram <- function(mapping = NULL,
                           data = NULL,
                           geom = "rootogram",
                           position = "identity",
                           na.rm = FALSE,
                           show.legend = NA,
                           inherit.aes = TRUE,
                           scale = c("sqrt", "raw"),
                           style = c("hanging", "standing", "suspended"),
                           ...) {
  scale <- match.arg(scale)
  style <- match.arg(style)

  ggplot2::layer(
    stat = StatRootogram,
    data = data,
    mapping = mapping,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      scale = scale,
      style = style,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
StatRootogram <- ggplot2::ggproto("StatRootogram", ggplot2::Stat,
  required_aes = c("observed", "expected", "mid", "width"),
  compute_group = function(data,
                           scales,
                           scale = c("sqrt", "raw"),
                           style = c("hanging", "standing", "suspended")) {
    scale <- match.arg(scale)
    style <- match.arg(style)

    tmp_heights <- compute_rootogram_heights(data$expected, data$observed, scale = scale, style = style)

    data <- transform(data,
      xmin = mid - width / 2,
      xmax = mid + width / 2,
      ymin = tmp_heights$ymin,
      ymax = tmp_heights$ymax,
      observed = NULL,
      expected = NULL,
      mid = NULL,
      width = NULL
    )
    data
  }
)


#' \code{geom_*} and \code{stat_*} for Producing PIT Histograms with `ggplot2`
#'
#' Various \code{geom_*} and \code{stat_*} used within
#' \code{\link[ggplot2]{autoplot}} for producing PIT histograms.
#'
#' @inheritParams ggplot2::layer
#' @inheritParams ggplot2::geom_point
#' @param style character specifying the syle of rootogram (see below).
#' @param scale character specifying whether values should be transformed to the square root scale (not checking for original scale, so maybe applied again).
#' @param linestyle Character string defining one of `"both"`, `"line"` or `"point"`.
#' @param level numeric. The confidence level required.
#' @param type character. Should \code{"pointwise"} or \code{"simultaneous"} confidence intervals be visualized. 
#' @param nrep numeric. The repetition number of simulation for computing the confidence intervals.
#' @param rootogram_style character specifying the syle of rootogram.
#' @examples
#' if (require("ggplot2")) {
#'   ## Fit model
#'   data("CrabSatellites", package = "countreg")
#'   m1_pois <- glm(satellites ~ width + color, data = CrabSatellites, family = poisson)
#'   m2_pois <- glm(satellites ~ color, data = CrabSatellites, family = poisson)
#'
#'   ## Compute rootogram (on raw scale)
#'   p1 <- rootogram(m1_pois, scale = "raw", plot = FALSE)
#'   p2 <- rootogram(m2_pois, scale = "raw", plot = FALSE)
#'
#'   d <- c(p1, p2)
#'
#'   ## Get label names
#'   main <- attr(d, "main")
#'   main <- make.names(main, unique = TRUE)
#'   d$group <- factor(d$group, labels = main)
#'
#'   ## Plot rootograms w/ on default "sqrt" scale
#'   gg1 <- ggplot(data = d) +
#'     geom_rootogram(aes(
#'       observed = observed, expected = expected, mid = mid,
#'       width = width, group = group
#'     )) +
#'     geom_rootogram_expected(aes(expected = expected, mid = mid)) +
#'     geom_rootogram_ref() +
#'     facet_grid(group ~ .) + 
#'     xlab("satellites") +
#'     ylab("sqrt(Frequency)")
#'   gg1
#' }
#' @export
geom_rootogram <- function(mapping = NULL,
                           data = NULL,
                           stat = "rootogram",
                           position = "identity",
                           na.rm = FALSE,
                           show.legend = NA,
                           inherit.aes = TRUE,
                           scale = c("sqrt", "raw"),
                           style = c("hanging", "standing", "suspended"),
                           ...) {
  scale <- match.arg(scale)
  style <- match.arg(style)

  ggplot2::layer(
    geom = GeomRootogram,
    mapping = mapping,
    data = data,
    stat = stat,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      scale = scale,
      style = style,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
GeomRootogram <- ggplot2::ggproto("GeomRootogram", ggplot2::GeomRect,
  default_aes = ggplot2::aes(
    colour = "black", fill = "darkgray", size = 0.5, linetype = 1,
    alpha = NA
  ),
  required_aes = c("xmin", "xmax", "ymin", "ymax")
)


# -------------------------------------------------------------------
# GGPLOT2 IMPLEMENTATIONS FOR `geom_rootogram_expected()`
# -------------------------------------------------------------------

#' @rdname geom_rootogram
#' @export
stat_rootogram_expected <- function(mapping = NULL,
                                  data = NULL,
                                  geom = "rootogram_expected",
                                  position = "identity",
                                  na.rm = FALSE,
                                  show.legend = NA,
                                  inherit.aes = TRUE,
                                  scale = c("sqrt", "raw"),
                                  ...) {
  scale <- match.arg(scale)
  style <- match.arg(style)

  ggplot2::layer(
    stat = StatRootogramExpected,
    data = data,
    mapping = mapping,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      scale = scale,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
StatRootogramExpected <- ggplot2::ggproto("StatRootogramExpected", ggplot2::Stat,
  required_aes = c("expected", "mid"),
  compute_group = function(data,
                           scales,
                           scale = c("sqrt", "raw")) {
    scale <- match.arg(scale)

    ## raw vs. sqrt scale
    if (scale == "sqrt") {
      data <- transform(data,
        x = mid,
        y = sqrt(expected),
        expected = NULL,
        mid = NULL
      )
    } else {
      data <- transform(data,
        x = mid,
        y = expected,
        expected = NULL,
        mid = NULL
      )
    }
    data
  }
)

#' @rdname geom_rootogram
#' @export
geom_rootogram_expected <- function(mapping = NULL,
                                  data = NULL,
                                  stat = "rootogram_expected",
                                  position = "identity",
                                  na.rm = FALSE,
                                  show.legend = NA,
                                  inherit.aes = TRUE,
                                  scale = c("sqrt", "raw"),
                                  linestyle = c("both", "line", "point"),
                                  ...) {
  scale <- match.arg(scale)
  linestyle <- match.arg(linestyle)

  ggplot2::layer(
    geom = GeomRootogramExpected,
    mapping = mapping,
    data = data,
    stat = stat,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      scale = scale,
      linestyle = linestyle,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @export
GeomRootogramExpected <- ggplot2::ggproto("GeomRootogramExpected", ggplot2::GeomPath,
  default_aes = ggplot2::aes(
    colour = 2, size = 1, linetype = 1,
    alpha = 1, fill = NA, stroke = 0.5, shape = 19
  ),
  draw_panel = function(data, panel_params, coord, arrow = NULL,
                        lineend = "butt", linejoin = "round", linemitre = 10,
                        na.rm = FALSE,
                        linestyle = c("both", "line", "point")) {
    linestyle <- match.arg(linestyle)

    if (linestyle == "both") {
      ## TODO: (ML) Do not copy data.
      data2 <- transform(data, size = size * 2)

      grid::grobTree(
        ggplot2::GeomPath$draw_panel(data, panel_params, coord, arrow = NULL,
          lineend = "butt", linejoin = "round", linemitre = 10,
          na.rm = FALSE
        ),
        ggplot2::GeomPoint$draw_panel(data2, panel_params, coord, na.rm = FALSE)
      )
    } else if (linestyle == "line") {
      ggplot2::GeomPath$draw_panel(data, panel_params, coord, arrow = NULL,
        lineend = "butt", linejoin = "round", linemitre = 10,
        na.rm = FALSE
      )
    } else {
      data <- transform(data, size = size * 2)
      ggplot2::GeomPoint$draw_panel(data, panel_params, coord, na.rm = FALSE)
    }
  }
)


# -------------------------------------------------------------------
# GGPLOT2 IMPLEMENTATIONS FOR `geom_rootogram_ref()`
# -------------------------------------------------------------------

#' @rdname geom_rootogram
#' @export
geom_rootogram_ref <- function(mapping = NULL,
                               data = NULL,
                               stat = "identity",
                               position = "identity",
                               na.rm = FALSE,
                               show.legend = NA,
                               inherit.aes = TRUE,
                               ...) {
  ggplot2::layer(
    geom = GeomRootogramRef,
    mapping = mapping,
    data = data,
    stat = stat,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
GeomRootogramRef <- ggplot2::ggproto("GeomRootogramRef", ggplot2::GeomHline,
  default_aes = ggplot2::aes(
    colour = "black",
    size = 0.5,
    linetype = 1,
    alpha = NA,
    yintercept = 0
  ),
  required_aes = NULL
)

# -------------------------------------------------------------------
# GGPLOT2 IMPLEMENTATIONS FOR `geom_rootogram_confint()`
# -------------------------------------------------------------------

#' @rdname geom_rootogram
#' @export
stat_rootogram_confint <- function(mapping = NULL,
                                   data = NULL,
                                   geom = "rootogram_confint",
                                   position = "identity",
                                   na.rm = FALSE,
                                   show.legend = NA,
                                   inherit.aes = TRUE,
                                   level = 0.95,
                                   nrep = 1000,
                                   type = c("pointwise", "simultaneous"),
                                   scale = c("sqrt", "raw"),
                                   rootogram_style =  c("hanging", "standing", "suspended"),
                                   ...) {
  type <- match.arg(type)
  scale <- match.arg(scale)
  rootogram_style <- match.arg(rootogram_style)

  ggplot2::layer(
    stat = StatRootogramConfint,
    mapping = mapping,
    data = data,
    geom = geom,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      level = level,
      nrep = nrep,
      type = type,
      scale = scale,
      rootogram_style = rootogram_style,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
StatRootogramConfint <- ggplot2::ggproto("StatRootogramConfint", ggplot2::Stat,
  required_aes = c("observed", "expected", "mid", "width"),
  compute_group = function(data,
                           scales,
                           level = 0.95,
                           nrep = 1000,
                           type = "pointwise",
                           scale = "sqrt",
                           rootogram_style = "hanging") {

    ## compute ci interval
    ci <- compute_rootogram_confint(
      observed = data$observed,
      expected = data$expected,
      mid = data$mid,
      width = data$width,
      level = level,
      nrep = nrep,
      type = type,
      scale = scale,
      style = rootogram_style
    )

    ## return new data.frame condition on plotting `style`
    nd <- data.frame(
      x = compute_breaks(data$mid, data$width, offset = TRUE),
      ymin = duplicate_last_value(ci[[1]]),
      ymax = duplicate_last_value(ci[[2]])
    )
    nd
  }
)


#' @rdname geom_rootogram
#' @export
geom_rootogram_confint <- function(mapping = NULL,
                                   data = NULL,
                                   stat = "rootogram_confint",
                                   position = "identity",
                                   na.rm = FALSE,
                                   show.legend = NA,
                                   inherit.aes = TRUE,
                                   level = 0.95,
                                   nrep = 1000,
                                   type = c("pointwise", "simultaneous"),
                                   scale = c("sqrt", "raw"),
                                   rootogram_style =  c("hanging", "standing", "suspended"),
                                   ...) {
  type <- match.arg(type)
  scale <- match.arg(scale)
  rootogram_style <- match.arg(rootogram_style)

  ggplot2::layer(
    geom = GeomRootogramConfint,
    mapping = mapping,
    data = data,
    stat = stat,
    position = position,
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      na.rm = na.rm,
      level = level,
      nrep = nrep,
      type = type,
      scale = scale,
      rootogram_style = rootogram_style,
      ...
    )
  )
}


#' @rdname geom_rootogram
#' @format NULL
#' @usage NULL
#' @export
GeomRootogramConfint <- ggplot2::ggproto("GeomRootogramConfint", ggplot2::Geom,
  required_aes = c("x", "ymin", "ymax"),

  extra_params = c("na.rm"),

  # TODO: (ML) Does not vary for style; this is a copy of `GeomPolygon$handle_na()`
  handle_na = function(data, params) {
    data
  },

  ## Setting up all defaults needed for `GeomPolygon` and `GeomStep`
  default_aes = ggplot2::aes(
    colour = "black",
    size = 0.5,
    linetype = 2,
    alpha = NA,
    yintercept = 0
  ),

  draw_panel = function(data, panel_params, coord,
                        linejoin = "mitre", direction = "hv") {

    ## Join two Grobs
    data1 <- transform(data,
      y = ymin
    )
    data2 <- transform(data,
      y = ymax
    )
    grid::grobTree(
      ggplot2::GeomStep$draw_panel(data1, panel_params, coord, direction),
      ggplot2::GeomStep$draw_panel(data2, panel_params, coord, direction)
    )
  },
  draw_key = function(data, params, size) {
    draw_key_path(data, params, size)
  }
)


# -------------------------------------------------------------------
# HELPER FUNCTIONS FOR GETTING AN EXTENDED ROOTOGRAM OBJECT
# -------------------------------------------------------------------
compute_rootogram_confint <- function(object,
                                      observed,
                                      expected,
                                      mid,
                                      width,
                                      level = 0.95,
                                      nrep = 1000,
                                      type = c("pointwise", "simultaneous"),
                                      scale = c("sqrt", "raw"),
                                      style = c("hanging", "standing", "suspended")) {

  ## checks
  scale <- match.arg(scale)
  type <- match.arg(type)
  style <- match.arg(style)
  stopifnot(is.numeric(level), length(level) == 1, level >= 0, level <= 1)
  stopifnot(is.numeric(nrep), length(nrep) == 1, nrep >= 0)

  ## transform back to "raw"
  if (!missing(object)) {
    object <- summary(object, scale = "raw", style = style, extend = FALSE)
  } else if (!missing(observed) & !missing(expected) & !missing(mid) & !missing(width)) {
    object <- if (scale == "raw") {
      data.frame(observed, expected, mid, width) 
    } else {
      data.frame(observed^2, expected^2, mid, width)
    }
  } else {
    stop("No appropriate input data is given.")
  }

  ## helper function to compute one observed table (on input scale)
  ytab <- function(rgram) {
    y <- sample(rgram$mid, sum(rgram$observed), prob = rgram$expected,
      replace = TRUE)
    table(factor(y, levels = rgram$mid))
  }
  ## repeat nrep times
  ytab <- replicate(nrep, ytab(object))

  ## compute CIs
  if (scale == "sqrt") {
    if (type == "pointwise") {
      yq <- apply(sqrt(object$expected) - sqrt(ytab), 1, quantile, c((1 - level) / 2, 1 - (1 - level) / 2))
    } else {
      yq <- rbind(
        rep.int(quantile(apply(sqrt(object$expected) - sqrt(ytab), 2, min), (1 - level) / 2), nrow(object)),
        rep.int(quantile(apply(sqrt(object$expected) - sqrt(ytab), 2, max), 1 - (1 - level) / 2), nrow(object))
      )
    }

    if (style == "standing") yq <- rep(sqrt(object$expected), each = 2) + yq
  } else {
    if (type == "pointwise") {
      yq <- apply(object$expected - ytab, 1, quantile, c((1 - level) / 2, 1 - (1 - level) / 2))
    } else {
      yq <- rbind(
        rep.int(quantile(apply(object$expected - ytab, 2, min), (1 - level) / 2), nrow(object)),
        rep.int(quantile(apply(object$expected - ytab, 2, max), 1 - (1 - level) / 2), nrow(object))
      )
    }
    if (style == "standing") yq <- rep(object$expected, each = 2) + yq
  }

  ## return
  df <- data.frame(
    confint_lwr = yq[1, ],
    confint_upr = yq[2, ]
  )

  df
  
}

compute_rootogram_heights <- function(expected,
                                      observed,
                                      scale = c("sqrt", "raw"),
                                      style = c("hanging", "standing", "suspended")) {
  scale <- match.arg(scale)
  style <- match.arg(style)

  ## raw vs. sqrt scale
  if (scale == "sqrt") {
    y <- if (style == "hanging") sqrt(expected) - sqrt(observed) else 0
    height <- if (style == "suspended") sqrt(expected) - sqrt(observed) else sqrt(observed)
  } else {
    y <- if (style == "hanging") expected - observed else 0
    height <- if (style == "suspended") expected - observed else observed
  }

  data.frame(
    ymin = y,
    ymax = y + height
  )
}


#' @export
summary.rootogram <- function(object,
                              scale = NULL,
                              style = NULL,
                              confint_level = 0.95,
                              confint_type = c("pointwise", "simultaneous"),
                              confint_nrep = 1000,
                              extend = TRUE,
                              ...) {


  stopifnot(is.logical(extend))

  ## get arg `style` and `scale`
  scale_object <- attr(object, "scale")
  scale <- use_arg_from_attributes(object, "scale", default = "sqrt", force_single = TRUE)
  style <- use_arg_from_attributes(object, "style", default = "hanging", force_single = TRUE)

  scale <- match.arg(scale, c("sqrt", "raw"))
  style <- match.arg(style, c("hanging", "standing", "suspended"))

  if (scale != scale_object && scale_object == "raw") {
    object <- transform(object,
      observed = sqrt(object$observed),
      expected = sqrt(object$expected)
    )
  } else if (scale != scale_object && scale_object == "sqrt") {
    object <- transform(object,
      observed = object$observed^2,
      expected = object$expected^2
    )
  } else if (scale != scale_object) {
    stop('attribute `scale` must be unique and must match one of c("sqrt", "raw")`')
  }

  if (extend) {
    ## compute heights (must always be `scale = "raw"` as already transformed)
    tmp <- compute_rootogram_heights(object$expected, object$observed, scale = "raw", style = style)

    ## compute confidence intervals (must be done per each group)
    if (!any(grepl("group", names(object)))) {
      tmp2 <- compute_rootogram_confint(
        object, 
        level = confint_level, 
        type = confint_type,
        nrep = confint_nrep, 
        scale = scale, 
        style = style
      )
    } else {
      tmp2 <- list()
      for (i in unique(object$group)) {
        tmp2[[i]] <- compute_rootogram_confint(
          object[object$group == i, ], 
          level = confint_level, 
          type = confint_type,
          nrep = confint_nrep, 
          scale = scale, 
          style = style
        )
      }
      tmp2 <- do.call("rbind", tmp2)
    }

    rval <- transform(object,
      ymin = tmp$ymin,
      ymax = tmp$ymax,
      confint_lwr = tmp2$confint_lwr,
      confint_upr = tmp2$confint_upr
    )

  } else {
    rval <- object
  }

  ## set attributes
  attr(rval, "style") <- style
  attr(rval, "scale") <- scale
  attr(rval, "xlab") <- attr(object, "xlab")
  attr(rval, "ylab") <- attr(object, "ylab")
  attr(rval, "main") <- attr(object, "main")

  ## return as `data.frame` or `tibble`
  if ("data.frame" %in% class(object)) {
    class(rval) <- c("rootogram", "data.frame")
  } else {
    rval <- tibble::as_tibble(rval)
    class(rval) <- c("rootogram", class(rval))
  }

  rval
}

#' @export
print.rootogram <- function(x, ...) {

  ## get arg `style` and `scale`
  scale <- style <- NULL # needed for `use_arg_from_attributes()` # FIXME: (ML) Still needed?!
  style <- use_arg_from_attributes(x, "style", default = NULL, force_single = TRUE)
  scale <- use_arg_from_attributes(x, "scale", default = NULL, force_single = TRUE)

  ## return custom print statement
  if (is.null(scale) || is.null(style)) {
    cat("A `rootogram` object without mandatory attributes `scale` and `style`\n\n")
  } else if (all(c("ymin", "ymax") %in% names(x))) {
    cat(
      paste0(
        sprintf(
          "A `rootogram` object with `scale = \"%s\"` and `style = \"%s\"`",
          scale,
          style
        ),
        " with columns: `ymin` and `ymax`\n\n"
      )
    )
  } else {
    cat(
      sprintf(
        "A `rootogram` object with `scale = '%s'` and `style = '%s'`\n\n",
        scale,
        style
      )
    )
  }

  ## call next print method
  NextMethod()
}
Any scripts or data that you put into this service are public.
topmodels documentation built on Sept. 10, 2022, 3 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
topmodels
Infrastructure for Forecasting and Assessment of Probabilistic Models

R/rootogram.R
In topmodels: Infrastructure for Forecasting and Assessment of Probabilistic Models

Defines functions autoplot.rootogram plot.rootogram c.rootogram rootogram.default rootogram

Documented in autoplot.rootogram c.rootogram plot.rootogram rootogram rootogram.default

Try the topmodels package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

topmodels Infrastructure for Forecasting and Assessment of Probabilistic Models

R/rootogram.R In topmodels: Infrastructure for Forecasting and Assessment of Probabilistic Models

Defines functions autoplot.rootogram plot.rootogram c.rootogram rootogram.default rootogram

Documented in autoplot.rootogram c.rootogram plot.rootogram rootogram rootogram.default

Try the topmodels package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

topmodels
Infrastructure for Forecasting and Assessment of Probabilistic Models

R/rootogram.R
In topmodels: Infrastructure for Forecasting and Assessment of Probabilistic Models