R/cptForecast.R

Defines functions cptForecastErrorCheck cptForecast

Documented in cptForecast cptForecastErrorCheck

#' Sequential Changepoint Analysis of Forecast Errors
#'
#' Performs sequential changepoint analysis on supplied forecast errors
#' to detect when forecasts become inaccurate and need modifying.
#'
#' This function is designed to work alongside a forecasting model to automatically detect if/when
#' the forecasting model becomes inaccurate. By monitoring the forecast errors, this function
#' can automatically detect when a changepoint has occurred in the forecast errors indicating that the
#' forecasting model being used needs updating. This function can detect first-order changes (e.g.
#' mean changes) and second-order changes (e.g. variance changes) using the different `forecastErrorTypes`.
#' See below for more details. Once analysis has been performed using this method then
#' the \code{\link{updateForecast}} function can be used to analyse new forecast errors from the same
#' forecasting model with repeating the analysis that has already been performed.
#'
#' The function takes a numeric vector of forecast errors of size `n`. The first `m` time points (as
#' chosen by the user) determines the training period where no changepoints are deemed to have occurred.
#' Monitoring of the forecast errors begins from time point `m+1` and the returned
#' \code{\linkS4class{cptFor}} class contains a wide range of information including the time point
#' a changepoint was detected. This is the time points after monitoring has begun not from the start of
#' numeric vector of errors.
#'
#' There is a choice of four different detectors to use. In general, we suggest using `detector="PageCUSUM"`
#' unless you are specifically looking for a mean/variance increase in the forecast errors in which case
#' `detector="PageCUSUM1"`is recommended. We only recommend the `CUSUM` detectors if you are confident
#' the changepoint will happen shortly after monitoring has begun. For more information on these detectors
#' and the hyperparameter `gamma` see \insertCite{Fremdt2014;textual}{changepoint.forecast}.
#'
#' The choice of `forecastErrorType` depends on the type of change you expect to see in the data. This
#' defaults to `forecastErrorType="Both"` which performs analysis on the raw forecast errors (designed for
#' detecting mean changes) and the squared forecast errors (designed for detecting variance changes). If
#' you are only interested in mean (first-order) changes then you can chose to only use the raw forecast
#' errors with `forecastErrorType="Raw"` or similarly for detecting mean and/or variance changes you can
#' just use the squared forecast errors with `forecastErrorType="Squared"`. For more information, see
#' \insertCite{Grundy2021;textual}{changepoint.forecast}.
#'
#'  The crucial parameter for obtaining fast detection of changes depends upon the `critValue` which
#'  affects the threshold the CUSUM statistic must exceed to detect a change. A number of theoretical
#'  critical values are stored in look-up tables which can be accessed using `crtiVale="Lookup"`. This
#'  depends on the choice of `detector`, `gamma` and `alpha`, where `alpha` is the probability of a
#'  false alarm as `m` gets large. If the critical value for your chosen parameters is not
#'  available then this can be simulated using `critValue="Simulate"`. Here you can choose some additional
#'  hyperparameters (`samples` and `npts`) which alter how the critical value is simulated. Note for
#'  larger values of `samples` and `npts` dependent on the `detector` this can take a substantial
#'  amount of time to run. For more information on `samples` and `npts` see the main vignette using
#'  `vignette('changepoint.forecast', package='changepoint.forecast')`. Finally, a numeric can be
#'  entered for `critValue` i.e. from a previous call to \code{\link{simCritVal}}. Note this numeric is not the threshold used but a part of it. For more
#'  details on the full threshold used see \insertCite{Grundy2021;textual}{changepoint.forecast}.
#'
#'  Finally, see \code{\linkS4class{cptFor}} for more information on the resulting S4 class object and
#'  how to use this to get relevant information from the analysis. Note the use of `@` rather than `$`
#'  to access slots.
#'
#'
#'
#' @param errors numeric vector. Forecast errors to perform changepoint analysis upon.
#' @param m numeric. Length of training period where forecast errors are assumed stable.
#' @param detector character. Type of changepoint detector to use. Choice of
#' \itemize{
#'   \item{"PageCUSUM": }{Page's CUSUM detector for 2-sided alternative hypothesis.}
#'   \item{"PageCUSUM1": }{Page's CUSUM detector for 1-sided alternative hypothesis.}
#'   \item{"CUSUM": }{Original CUSUM detector for 2-sided alternative hypothesis.}
#'   \item{"CUSUM1": }{Original CUSUM detector for 1-sided alternative hypothesis.}
#' }
#' See details below.
#' @param forecastErrorType character. Type of changes to look for. Choice of
#' \itemize{
#'   \item{"Both": }{Analysis is performed on both the raw and squared forecast errors,}
#'   \item{"Raw": }{Analysis is only performed on the raw forecast errors. Only first
#'   order changes are reliably detected.}
#'   \item{"Squared:" }{Analysis is only performed on the centred squared forecast errors.}
#' }
#' See details below.
#' @param gamma numeric. Tuning parameter used in detector. See details below.
#' @param critValue character or numeric. Critical value of normalized asymptotic distribution
#' of chosen detector under no change. Choice of
#' \itemize{
#'   \item{"Lookup": }{Pre-simulated critical value is used if available}
#'   \item{"Simulate": }{Critical value is simulated. Note this can be time consuming
#'   in certain scenarios}
#'   \item{numeric: }{Specified critical value to be used}
#' }
#' See details below
#' @param alpha Type-1 error
#' @param samples Only used if `critValue='Simulate'`. Number of sample of Weiner processes
#' used to calculate critical value.
#' @param npts  Only used if `critValue='Simulate'`. Number of points in each sample of a
#' Weiner process used to calculate critical value.
#'
#' @return an object of class `cptFor`
#'
#' @importFrom Rdpack reprompt
#' @export
#'
#' @references
#' \insertRef{Fremdt2014}{changepoint.forecast}
#'
#' \insertRef{Grundy2021}{changepoint.forecast}
#'
#' @examples
#' # Mean change in forecast errors
#' forecastErrors = c(stats::rnorm(400), stats::rnorm(100,2))
#' ans = cptForecast(forecastErrors, m=300)
#' summary(ans)
#' plot(ans)
#'
#' # Variance change in forecast errors
#' forecastErrors2 = c(stats::rnorm(400), stats::rnorm(100,0,3))
#' ans2 = cptForecast(forecastErrors, m=300, forecastErrorType="Squared")
#' show(ans)
#' plot(ans)
cptForecast = function(errors,
                       m=ceiling(length(errors)/2),
                       detector='PageCUSUM',
                       forecastErrorType='Both',
                       gamma=0,
                       critValue='Lookup',
                       alpha=0.05,
                       samples=1000,
                       npts=500){
  cptForecastErrorCheck(errors=errors, m=m, detector=detector,
                        forecastErrorType=forecastErrorType, gamma=gamma,
                        critValue=critValue, alpha=alpha, samples=samples, npts=npts)
  if(forecastErrorType=='Both'){
    ans = cptSeqCUSUM(errors, m=m, detector=detector, gamma=gamma, sigma2=NULL,
                      critValue=critValue, alpha=alpha, samples=samples,
                      npts=npts, Class=FALSE)
    ans2 = cptSeqCUSUM((errors-mean(errors[1:m]))^2, m=m, detector=detector, gamma=gamma,
                       sigma2=NULL, critValue=critValue, alpha=alpha, samples=samples,
                       npts=npts, Class=FALSE)
    return(cptFor(errors=errors, m=m, gamma=gamma, detector=detector,
                  forecastErrorType=forecastErrorType,
                  alpha=alpha, critValue=ans$critValue,
                  errorsVar=ans$sigma2, errors2Var=ans2$sigma2,
                  cusum=ans$cusum, cusum2=ans2$cusum,
                  threshold=ans$threshold, threshold2=ans2$threshold,
                  tau=ans$tau, tau2=ans2$tau,
                  updateStats=ans$updateStats, updateStats2=ans2$updateStats))
  }else if(forecastErrorType=='Raw'){
    ans = cptSeqCUSUM(errors, m=m, detector=detector, gamma=gamma, sigma2=NULL,
                      critValue=critValue, alpha=alpha, samples=samples,
                      npts=npts, Class=FALSE)
    return(cptFor(errors=errors, m=m, gamma=gamma, detector=detector, alpha=alpha,
                  forecastErrorType=forecastErrorType,
                  critValue = ans$critValue,
                  errorsVar=ans$sigma2, cusum=ans$cusum,
                  threshold=ans$threshold, tau=ans$tau, updateStats=ans$updateStats))
  }else if(forecastErrorType=='Squared'){
    ans2 = cptSeqCUSUM((errors-mean(errors[1:m]))^2, m=m, detector=detector, gamma=gamma,
                       sigma2=NULL, critValue=critValue, alpha=alpha, samples=samples,
                       npts=npts, Class=FALSE)
    return(cptFor(errors=errors, m=m, gamma=gamma, detector=detector, alpha=alpha,
                  forecastErrorType=forecastErrorType,
                  critValue=ans2$critValue,
                  errors2Var=ans2$sigma2, cusum2=ans2$cusum,
                  threshold2=ans2$threshold, tau2=ans2$tau,
                  updateStats=c(mean(errors[1:m]), 0, 0), updateStats2=ans2$updateStats))
  }else{
    stop('forecastErrorType not supported. Please use one of "Both", "Raw" or "Squared"')
  }
}


#' Error checking - cptForecast
#'
#' Performs error checking on the arguments passed to cptForecast
#'
#' @inheritParams cptForecast
#'
#' @return NULL
cptForecastErrorCheck = function(errors=errors, m=m, detector=detector,
                                 forecastErrorType=forecastErrorType, gamma=gamma,
                                 critValue=critValue, alpha=alpha, samples=samples, npts=npts){
  ## Errors
  if(any(!is.numeric(errors))){
    stop("Errors should be a vector of numeric values with no NA values")
  }else if(any(is.na(errors))||any(errors==Inf)){
    stop("Errors should be a vector of numeric values with no NA values")
  }

  ## m
  if(length(m)!=1 || is.na(m)){
    stop("m should be a single positive integer such that 1<m<n, where n is the length of
         the vector of errors")
  }else if(m != as.integer(m)){
    stop("m should be a single positive integer such that 1<m<n, where n is the length of
         the vector of errors")
  }else if((m<2)||(m>=length(errors))){
    stop("m should be a single positive integer such that 1<m<n, where n is the length of
         the vector of errors")
  }

  ## detector
  if(!(detector %in% c("CUSUM", "CUSUM1", "PageCUSUM", "PageCUSUM1"))){
    stop("Detector must be one of 'CUSUM', 'CUSUM1', 'PageCUSUM', 'PageCUSUM1'. Note these
    are case-sensitive")
  }

  ## forecastErrorType
  if(!(forecastErrorType %in% c("Both", "Raw", "Squared"))){
    stop("forecastErrorType must be one of 'Both', 'Raw', 'Squared'. Note these are case-sensitive")
  }

  ## gamma
  if(any(!is.numeric(gamma))||any(is.na(gamma))){
    stop("gamma should be a single positive number such that 0<=gamma<0.5")
  }else if(length(gamma)!=1){
    stop("gamma should be a single positive number such that 0<=gamma<0.5")
  }else if(!((gamma>=0)&&(gamma<0.5))){
    stop("gamma should be a single positive number such that 0<=gamma<0.5")
  }

  ## critValue
  if(any(is.numeric(critValue))){
    if(length(critValue)!=1){
      stop("critValue should either be 'Lookup', 'Simulate' or a single positive numeric")
    }else if((critValue<0)||(critValue==Inf)||is.na(critValue)){
      stop("critValue should either be 'Lookup', 'Simulate' or a single positive numeric")
    }
  }else if(!(critValue %in% c("Lookup", "Simulate"))){
    stop("critValue should either be 'Lookup', 'Simulate' or a single positive numeric")
  }

  ## alpha
  if(any(!is.numeric(alpha))||any(is.na(alpha))){
    stop("alpha should be a single positive number such that 0<=alpha<=1")
  }else if(length(alpha)!=1){
    stop("alpha should be a single positive number such that 0<=alpha<=1")
  }else if((alpha<0)||(alpha>1)){
    stop("alpha should be a single positive number such that 0<=alpha<=1")
  }

  ## samples
  if(any(!is.numeric(samples))||any(is.na(samples))){
    stop("samples should be a single positive integer greater than 20")
  }else if(length(samples)!=1){
    stop("samples should be a single positive integer greater than 20")
  }else if(samples==Inf || samples != as.integer(samples)){
    stop("samples should be a single positive integer greater than 20")
  }else if(samples<20){
    stop("samples should be a single positive integer greater than 20")
  }

  ## npts
  if(any(!is.numeric(npts))||any(is.na(npts))){
    stop("npts should be a single positive integer greater than 20")
  }else if(length(npts)!=1){
    stop("npts should be a single positive integer greater than 20")
  }else if(npts==Inf || npts != as.integer(npts)){
    stop("npts should be a single positive integer greater than 20")
  }else if(npts<20){
    stop("npts should be a single positive integer greater than 20")
  }
}
grundy95/changepoint.forecast documentation built on Dec. 20, 2021, 1:45 p.m.