procast: Procast: Probabilistic Forecasting
In topmodels: Infrastructure for Forecasting and Assessment of Probabilistic Models

Description

Generic function and methods for computing various kinds of probabilistic forecasts from (regression) models.

Usage

procast(
  object,
  newdata = NULL,
  na.action = na.pass,
  type = "quantile",
  at = 0.5,
  drop = FALSE,
  ...
)

## Default S3 method:
procast(
  object,
  newdata = NULL,
  na.action = na.pass,
  type = c("quantile", "mean", "variance", "probability", "density", "loglikelihood",
    "distribution", "parameters", "kurtosis", "skewness"),
  at = 0.5,
  drop = FALSE,
  ...
)

procast_setup(
  pars,
  FUN,
  at = NULL,
  drop = FALSE,
  type = "procast",
  elementwise = NULL,
  ...
)

## S3 method for class 'disttree'
procast(
  object,
  newdata = NULL,
  na.action = na.pass,
  type = c("quantile", "location", "scale", "parameter", "density", "probability"),
  at = 0.5,
  drop = FALSE,
  use_distfamily = TRUE,
  ...
)

## S3 method for class 'distforest'
procast(
  object,
  newdata = NULL,
  na.action = na.pass,
  type = c("quantile", "location", "scale", "parameter", "density", "probability"),
  at = 0.5,
  drop = FALSE,
  use_distfamily = TRUE,
  ...
)

Arguments

`object`	a fitted model object. For the `default` method this needs to have a `prodist` method (or `object` can inherit from `distribution` directly).
`newdata`	optionally, a data frame in which to look for variables with which to predict. If omitted, the original observations are used.
`na.action`	function determining what should be done with missing values in `newdata`. The default, `na.pass`, keeps missing values so that the corresponding forecasts are `NA`.
`type`	character specifying the type of probabilistic forecast to compute. In `procast_setup` the `type` is only used for nice labels of the returned data frame.
`at`	specification of values at which the forecasts should be evaluated, typically a numeric vector but possibly also a matrix or data frame. Additionally, `at` can be the character string `"function"` or `"list"`, see details below.
`drop`	logical. Should the result be simplified to a vector if possible (by dropping the dimension attribute)? If `FALSE` a matrix is always returned.
`...`	further parameters passed to methods.
`pars`	a data frame of predicted distribution parameters.
`FUN`	function to be used for forecasts. Either of type `FUN(pars, ...)` or `FUN(at, pars, ...)`, see details below.
`elementwise`	logical. Should each element of the distribution only be evaluated at the corresponding element of `at` (`elementwise = TRUE`) or at all elements in `at` (`elementwise = FALSE`)? Elementwise evaluation is only possible if the number of observations and the length of `at` are the same, and in that case a vector of the same length is returned. Otherwise a matrix is returned. The default is to use `elementwise = TRUE` if possible, and otherwise `elementwise = FALSE`.
`use_distfamily`	For internal use only, will not be supported in the future.

Details

The function procast provides a unified framework for probabilistic forecasting (or procasting, for short) based on probabilistic (regression) models, also known as distributional regression approaches. Typical types of predictions include quantiles, probabilities, (conditional) expectations, variances, and (log-)densities. Internally, procast methods typically compute the predicted parameters for each observation and then compute the desired outcome for the distributions with the respective parameters.
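
As a rough illustration of this two-step logic, the following base R sketch mimics what happens for a homoscedastic Gaussian linear model. It does not call procast itself. The names pars, mu, and sigma are chosen for illustration only, and using the maximum likelihood estimate of the standard deviation is an assumption of this sketch.

```r
## Step 1: predicted parameters for each (new) observation
m <- lm(dist ~ speed, data = cars)
nd <- data.frame(speed = c(10, 15, 20))
pars <- data.frame(
  mu = predict(m, newdata = nd),
  ## assumption: ML estimate of the standard deviation (RSS / n)
  sigma = sqrt(mean(residuals(m)^2))
)

## Step 2: desired quantity from the distributions with these parameters
## (here: medians, i.e., quantiles at probability 0.5)
med <- qnorm(0.5, mean = pars$mu, sd = pars$sigma)
med
```

For a Gaussian distribution the median coincides with the mean, so med simply reproduces the predicted means here. Other quantities only change the second step.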

Some quantities, e.g., the moments of the distribution (like mean or variance), can be computed directly from the predicted parameters of the distribution, while others require an additional argument at which the distribution is evaluated (e.g., the probability of a quantile or an observation of the response).
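
The distinction can be sketched in base R for Gaussian distributions. The parameter values and the object names (pars, y, and so on) below are hypothetical and serve only to show which computations need an additional argument.

```r
## hypothetical predicted parameters for three observations
pars <- data.frame(mu = c(10, 20, 30), sigma = c(2, 2, 5))

## moments follow directly from the parameters, no 'at' needed
mean_hat <- pars$mu
var_hat  <- pars$sigma^2

## probabilities need response values at which the CDF is evaluated
y <- c(12, 20, 25)
p <- pnorm(y, mean = pars$mu, sd = pars$sigma)

## quantiles need probabilities at which the quantile function is evaluated
q <- qnorm(0.9, mean = pars$mu, sd = pars$sigma)
```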

The default procast method leverages the S3 classes and methods for probability distributions from the distributions3 package. It proceeds in two steps: First, prodist is used to obtain the predicted probability distribution object. Second, the extractor methods such as quantile, cdf, etc. are used to compute quantiles, probabilities, etc. from the distribution objects.
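
The two steps can be carried out by hand, which may help to see what the default method relies on. This is a sketch, assuming a version of the distributions3 package that provides prodist along with a method for lm objects. It is not the internal code of procast.

```r
library("distributions3")

m <- lm(dist ~ speed, data = cars)

## Step 1: predicted probability distribution for each observation
d <- prodist(m)

## Step 2: extractor methods applied to the distribution object
q50 <- quantile(d, 0.5)    # medians
pit <- cdf(d, cars$dist)   # probabilities at the observed responses
mu  <- mean(d)             # expectations, no additional argument needed
```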

Therefore, to enable procast for a certain type of model object, the recommended approach is to implement a prodist method which can then be leveraged. However, if the distributions3 package does not support the necessary probability distribution, then it may also be necessary to implement new distribution objects, see apply_dpqr.
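
A minimal sketch of such a prodist method is given below. The class name mymodel and its components mu and sigma are hypothetical and only stand in for a real fitted model. A real method would typically also handle newdata, which is ignored here.

```r
library("distributions3")

## hypothetical fitted model object storing Gaussian parameters
## (class name and components are assumptions for illustration)
fit <- structure(
  list(mu = c(10, 20, 30), sigma = 2),
  class = "mymodel"
)

## prodist method: return a distributions3 object with one
## distribution per observation
prodist.mymodel <- function(object, ...) {
  Normal(mu = object$mu, sigma = object$sigma)
}

d <- prodist(fit)
q <- quantile(d, 0.5)
q
```

Once prodist returns a supported distribution object, the extractor methods from distributions3 (quantile, cdf, etc.) apply without further model-specific code.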

Before adopting the distributions3 framework as the recommended workflow for procasting, the package took a different approach, which is described below. Note, however, that this approach will be discontinued once all procasting methods have been converted to the new workflow.

Old workflow: The function procast_setup is a convenience wrapper that makes setting up procast methods easier for package developers. It takes a data frame of predicted parameters pars and a function FUN which is to be evaluated at the parameters. FUN can either have the interface FUN(pars, ...), when the desired quantity can be computed directly from the predicted parameters, or the interface FUN(at, pars, ...), if an additional argument at is needed. procast_setup takes care of suitably expanding at to the dimensions of pars.
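
The two interfaces for FUN, and what expanding at amounts to, can be illustrated in base R. The sketch below does not call procast_setup and is not its implementation. The functions FUN_mean and FUN_quantile and the parameter values are hypothetical.

```r
## hypothetical predicted parameters for three observations
pars <- data.frame(mu = c(10, 20, 30), sigma = c(2, 2, 5))

## interface FUN(pars, ...): no additional argument needed
FUN_mean <- function(pars, ...) pars$mu

## interface FUN(at, pars, ...): evaluated at 'at'
FUN_quantile <- function(at, pars, ...) {
  qnorm(at, mean = pars$mu, sd = pars$sigma)
}

at <- c(0.25, 0.5, 0.75)

## elementwise: i-th distribution evaluated at the i-th element of 'at'
q_elem <- FUN_quantile(at, pars)

## all combinations: every distribution at every element of 'at'
## (one row per observation, one column per element of 'at')
q_all <- sapply(at, function(a) FUN_quantile(a, pars))
colnames(q_all) <- paste0("q_", at)
```

The elementwise result corresponds to the diagonal of the matrix with all combinations, which is only defined because the number of observations and the length of at agree in this example.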

Value

Either a data.frame of predictions (in case of multivariate forecasts, or if drop = FALSE, the default) or a vector (in case of a univariate forecast and additionally drop = TRUE). If at is the character string "function" or "list", a (list of) function(s) is returned instead.

Examples

## linear regression models (homoscedastic Gaussian response)
m <- lm(dist ~ speed, data = cars)

## medians on observed data
procast(m)
procast(m, drop = TRUE)

## probability integral transform (PIT) on observed data
procast(m, type = "probability", at = cars$dist)

## log-likelihood contributions
procast(m, type = "density", at = cars$dist, log = TRUE)

## log-likelihood sum
sum(procast(m, type = "density", at = cars$dist, log = TRUE))
logLik(m)


## medians on new data
nd <- data.frame(speed = c(10, 15, 20))
procast(m, newdata = nd)

## different quantile for each observation
procast(m, newdata = nd, at = c(0.25, 0.5, 0.75), elementwise = TRUE)

## all combinations of quantiles and observations
procast(m, newdata = nd, at = c(0.25, 0.5, 0.75), elementwise = FALSE)

topmodels documentation built on Sept. 10, 2022, 3 p.m.