R/dss.R

Defines functions dss

Documented in dss

#' Dawid-Sebastiani Score
#'
#' This function calculates the Dawid-Sebastiani Score (DSS) given observations of an univariate variable and samples or parameters of a predictive distribution.
#'
#' @param y vector of observations
#' @param x matrix of samples of a predictive distribution or vector of variances of a predictive distribution (depending on \code{y}; see details)
#' @param mu if \code{NULL}, \code{mu} is calculated by the row-wise mean of matrix \code{x}; otherwise \code{mu} must provide the means of a predictive distribution; default: \code{NULL} (depending on \code{x}; see details)
#' @param na.action function to handle the NA's. Default: \code{na.omit}.
#' @param aggregate logical or function for aggregating the single scores, e.g. \code{sum}, \code{mean}, \code{weighted.mean}, ....
#' Default: \code{FALSE}, i.e. no aggregation function.
#' @param ... further arguments passed to the \code{aggregate} function.
#'
#' @details
#' For a vector \code{y} of length n, \code{x} can be given as matrix of samples of a predictive distribution
#' with n rows, where the i-th entry of \code{y} belongs to the i-th row
#' of \code{x}. The columns of \code{x} represent the samples of a predictive distribution.
#' Consequently \code{mu} must be \code{NULL}. The row-wise
#' means and variances are determined by its sample versions.
#'
#' If the variances and means of a predictive distribution are directly
#' available, \code{x} can be given as vector of variances and \code{mu} can be given as vector of means, where
#' the i-th entry of \code{y} belongs to the i-th entry of \code{x} and \code{mu}.
#'
#' A lower DSS indicates a better forecast.
#'
#' @return
#' Vector of score value(s).
#'
#' @examples
#' # simulated data
#' n <- 30
#' m <- 50
#' y <- rnorm(n)
#' x1 <- matrix(rnorm(n*m), ncol = m)
#' x2 <- rep(1, n)
#' mu <- rep(0, n)
#'
#' # dss calculation
#' dss(y = y, x = x1, mu = NULL)
#' dss(y = y, x = x1, mu = NULL, aggregate = mean)
#'
#' dss(y = y, x = x2, mu = mu)
#' dss(y = y, x = x2, mu = mu, aggregate = mean)
#'
#' @references
#' Dawid, A. and Sebastiani, P. (1999). Coherent dispersion criteria for optimal experimental design. Annals of Statistics, 27, 65-81.
#'
#' @author David Jobst
#'
#' @rdname dss
#'
#' @importFrom stats na.omit
#' @importFrom Rfast rowmeans rowVars
#'
#' @export
dss <- function(y, x, mu = NULL, na.action = na.omit, aggregate = FALSE, ...) {
  if (!is.vector(y)) {
    stop("'y' should be a vector!")
  }
  if (is.null(mu)) {
    if (!is.matrix(x)) {
      stop("'x' should be a matrix!")
    }
    if (length(y) != nrow(x)) {
      stop("Length of 'y' is not equal to the number of rows of 'x'!")
    }

    means <- rowmeans(x)
    vars <- rowVars(x, parallel = TRUE, na.rm = FALSE)
    dss.value <- log(vars) + (y-means)^2/vars
  } else {
    if (!is.vector(x)) {
      stop("'x' should be a vector!")
    }
    if (!is.vector(mu)) {
      stop("'mu' should be a vector!")
    }
    if (length(y) != length(x) || length(y) != length(mu)) {
      stop("Lengths of 'y', 'x' and 'mu' are not equal!")
    }

    if (any(x <= 0)) {
      stop("'x' should contain values > 0!")
    }

    dss.value <- log(x) + (y-mu)^2/x

  }

  dss.value <- as.vector(na.action(dss.value))
  if (!isFALSE(aggregate)) {
    dss.value <- do.call(aggregate, list(dss.value, ...))
  }

  return(as.numeric(dss.value))

}
jobstdavid/eppverification documentation built on May 13, 2024, 5:20 p.m.