getHisEnsem: Get ensemble forecast from historical data.

getHisEnsem R Documentation

Get ensemble forecast from historical data.

Description

getHisEnsem uses historical data as the input time series for forecasting.

Usage

getHisEnsem(
  TS,
  example,
  interval = 365,
  buffer = 0,
  plot = "norm",
  output = "data",
  name = NULL,
  mv = 0,
  ...
)

Arguments

TS

A time series dataframe, with first column Date, and second column value.

example

A vector containing two strings giving the start and end dates of the forecasting period. The program extracts every matching period from the TS you provide to generate the ensemble. Check details for more information.

interval

A number representing the interval between ensemble members. NOTE: "interval" takes 365 as a year and 30 as a month, regardless of leap years and months with 31 days. So if you want the interval to be 2 years, set interval = 730, which equals 2 * 365; for two months, set interval = 60; for 2 days, set interval = 2. Any other number that cannot be divided by 365 or 30 without remainder is treated as a number of days. By default, interval is set to 365, i.e. one year.

buffer

A number showing how many days are used as the buffer period for models. Check details for more information.

plot

A string showing which plot will be shown, e.g., 'norm' means a normal plot (without any processing) and 'cum' means a cumulative plot. Default is 'norm'. For any other value, no plot is produced.

output

A string showing which type of output you want. Default is "data". If "ggplot", the returned data can be plotted directly by ggplot2, which makes it easier to produce series plots afterwards. NOTE: If output = 'ggplot', missing values in the data will be replaced by mv, if assigned; the default mv is 0.

name

If output = 'ggplot', name has to be assigned to your output, in order to differentiate different outputs in the later multiplot using getEnsem_comb.

mv

A number representing the missing value. When calculating the cumulative value, missing values will be replaced by mv. Default is 0.
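The replacement described for mv can be illustrated in base R. This is only a sketch of the idea, not the code used inside getHisEnsem; the vector values is made-up sample data.

```r
# Made-up sample values with two missing entries (for illustration only)
values <- c(1, NA, 3, NA, 2)
mv <- 0

# Replace missing values by mv, then accumulate
filled <- ifelse(is.na(values), mv, values)
cumulative <- cumsum(filled)
print(cumulative)
```

With mv = 0 a missing day adds nothing to the cumulative total, which is usually the desired behaviour for quantities such as precipitation.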

...

title, x, y giving the title and the x and y axis labels of the plot, e.g. title = 'aaa'

Details

example: E.g., suppose you have a time series from 2000 to 2010 and, standing in 2003, you want to forecast the period from 2003-2-1 to 2003-4-1. Then, for each year in your input time series, the period from 1st Feb to 1st Apr will be extracted to generate the ensemble forecasts. In this case your input should be example = c('2003-2-1', '2003-4-1').

interval: it ignores leap years and months with 31 days, taking 365 as a year and 30 as a month. E.g., if the interval is from 1999-2-1 to 1999-3-1, you should just set interval to 30, although the real interval is 28 days.
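The rule above can be sketched in base R as follows. The helper interpret_interval is hypothetical and not part of hyfo; checking for whole years before whole months is an assumption, since the text does not say which takes precedence for numbers divisible by both 365 and 30.

```r
# Hypothetical helper (not part of hyfo): turn a numeric interval into a
# step string of the kind accepted by seq() on Date objects.
interpret_interval <- function(interval) {
  if (interval %% 365 == 0) {
    paste(interval / 365, "years")   # assumption: years are checked first
  } else if (interval %% 30 == 0) {
    paste(interval / 30, "months")
  } else {
    paste(interval, "days")          # everything else is treated as days
  }
}

interpret_interval(730)  # two years
interpret_interval(60)   # two months
interpret_interval(45)   # 45 days
```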

example and interval together control how the ensemble will be generated. E.g., suppose the time series runs from 1990-1-1 to 2001-1-1.

If example = c('1992-3-1', '1994-1-1') and interval = 1095 (note that 1095 = 365 * 3, so the program treats this as 3 years),

then you are supposed to get an ensemble consisting of the following parts:

1. 1992-3-1 to 1994-1-1: the first member is the example itself; it does NOT start from 1990-3-1.

2. 1995-3-1 to 1997-1-1: the second member starts in 1995, because "interval" is 3 years.

3. 1998-3-1 to 2000-1-1

The next candidate, "2001-3-1 to 2003-1-1", extends beyond the original TS range, so it is not included.
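The member periods for this case can be reproduced with base R date sequences. This is a minimal sketch of the selection logic as described here, not the code used inside getHisEnsem; the variable names are illustrative.

```r
# Setup taken from the text: TS ends on 2001-1-1, interval is 3 years
ts_end  <- as.Date("2001-01-01")
example <- as.Date(c("1992-03-01", "1994-01-01"))
step    <- "3 years"

# Candidate start dates begin at the example and advance by the interval
starts <- seq(example[1], ts_end, by = step)
# Matching end dates advance by the same interval
ends <- seq(example[2], by = step, length.out = length(starts))

# Keep only members that lie completely inside the TS range
members <- data.frame(start = starts, end = ends)
members <- members[members$end <= ts_end, ]
print(members)
```

The sketch shifts the end date by calendar years, so it ignores the length adjustment for leap years described in the next paragraph.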

Sometimes leap years or months with 31 days fall inside an ensemble member, in which case the length of the data differs, e.g., 1999-1-1 to 1999-3-1 is 1 day shorter than 2000-1-1 to 2000-3-1. In this situation, the example is used as the standard. If the example is 1999-1-1 to 1999-3-1, then the latter period will be changed to 2000-1-1 to 2000-2-29, which keeps the start date and changes the end date.

If the end date is so important that it cannot be changed, try resetting the example period so that the event is included in the example.

A good choice of example and interval generates a good ensemble.
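The length adjustment can be checked with plain date arithmetic in base R. This is a sketch of the rule as stated in the text, not hyfo's internal code.

```r
# The example period sets the standard length in days
example <- as.Date(c("1999-01-01", "1999-03-01"))
n_days  <- as.numeric(diff(example))

# A member starting in a leap year keeps its start date;
# its end date is moved so that the length matches the example
member_start <- as.Date("2000-01-01")
member_end   <- member_start + n_days
print(member_end)
```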

buffer: Sometimes the model needs to run for a few days to warm up before the forecast. E.g., if a forecast starts at '1990-1-20', for some models like the MIKE NAM model, the run needs to start about 14 days earlier. So the input time series should start from '1990-1-6'.

Buffer is mainly used for the model hot start. Sometimes the hot start file cannot contain all the parameters needed, only some important ones. In this case, the model needs to run for some time to make the other parameters ready for the simulation.
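The start date of the input series in the example above follows from simple date subtraction. The sketch below uses base R only; the variable names are illustrative.

```r
forecast_start <- as.Date("1990-01-20")
buffer <- 14  # warm-up days, as in the MIKE NAM example above

# The input time series has to begin this many days before the forecast
input_start <- forecast_start - buffer
print(input_start)
```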

name: Assume you have two ggplot outputs and want to plot them together. In this situation, you need a name column to differentiate one ggplot output from the other. You can assign this name directly through the argument; name has to be assigned if output = 'ggplot' is selected.

Value

An ensemble time series using historical data as forecast.

References

  • Hadley Wickham (2007). Reshaping Data with the reshape Package. Journal of Statistical Software, 21(12), 1-20. URL http://www.jstatsoft.org/v21/i12/.

  • H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009.

Examples


library(hyfo)

# Load the example dataset shipped with hyfo
data(testdl)

a <- testdl[[1]]

# Choose example from "1994-2-4" to "1996-1-4"
b <- getHisEnsem(a, example = c('1994-2-4', '1996-1-4'))

# The default interval is one year; it can be set to other values, see the help page.

# Take 7 months as the interval (7 * 30 = 210) and show the cumulative plot
b <- getHisEnsem(a, example = c('1994-2-4', '1996-1-4'), interval = 210, plot = 'cum')

# Take 30 days as the buffer
b <- getHisEnsem(a, example = c('1994-2-4', '1996-1-4'), interval = 210, buffer = 30)


# More examples can be found in the user manual on https://yuanchao-xu.github.io/hyfo/



hyfo documentation built on Aug. 16, 2023, 5:08 p.m.