opera-package: Online Prediction by ExpeRt Aggregation

Description Author(s) References Examples

Description

The package opera performs, for regression-oriented time-series, predictions by combining a finite set of forecasts provided by the user. More formally, it considers a sequence of observations Y (such as electricity consumption, or any bounded time series) to be predicted step by step. At each time instance t, a finite set of experts (basicly some based forecasters) provide predictions x of the next observation in y. This package proposes several adaptive and robust methods to combine the expert forecasts based on their past performance.

Author(s)

Pierre Gaillard <[email protected]>

References

Prediction, Learning, and Games. N. Cesa-Bianchi and G. Lugosi.

Forecasting the electricity consumption by aggregating specialized experts; a review of sequential aggregation of specialized experts, with an application to Slovakian an French contry-wide one-day-ahead (half-)hourly predictions, Machine Learning, in press, 2012. Marie Devaine, Pierre Gaillard, Yannig Goude, and Gilles Stoltz

Contributions to online robust aggregation: work on the approximation error and on probabilistic forecasting. Pierre Gaillard. PhD Thesis, University Paris-Sud, 2015.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
#' 
library('opera')  # load the package
set.seed(1)

# Example: find the best one week ahead forecasting strategy (weekly data)
# packages
library(mgcv)

# import data
data(electric_load)
idx_data_test <- 620:nrow(electric_load)
data_train <- electric_load[-idx_data_test, ]
data_test <- electric_load[idx_data_test, ]

# Build the expert forecasts 
# ##########################

# 1) A generalized additive model
gam.fit <- gam(Load ~ s(IPI) + s(Temp) + s(Time, k=3) + 
                s(Load1) + as.factor(NumWeek), data = data_train)
gam.forecast <- predict(gam.fit, newdata = data_test)

# 2) An online autoregressive model on the residuals of a medium term model

# Medium term model to remove trend and seasonality (using generalized additive model)
detrend.fit <- gam(Load ~ s(Time,k=3) + s(NumWeek) + s(Temp) + s(IPI), data = data_train)
electric_load$Trend <- c(predict(detrend.fit), predict(detrend.fit,newdata = data_test))
electric_load$Load.detrend <- electric_load$Load - electric_load$Trend

# Residual analysis
ar.forecast <- numeric(length(idx_data_test))
for (i in seq(idx_data_test)) {
 ar.fit <- ar(electric_load$Load.detrend[1:(idx_data_test[i] - 1)])
 ar.forecast[i] <- as.numeric(predict(ar.fit)$pred) + electric_load$Trend[idx_data_test[i]]
}

# Aggregation of experts
###########################

X <- cbind(gam.forecast, ar.forecast)
colnames(X) <- c('gam', 'ar')
Y <- data_test$Load

matplot(cbind(Y, X), type = 'l', col = 1:6, ylab = 'Weekly load', xlab = 'Week')


# How good are the expert? Look at the oracles
oracle.convex <- oracle(Y = Y, experts = X, loss.type = 'square', model = 'convex')
plot(oracle.convex)
oracle.convex

# Is a single expert the best over time ? Are there breaks ?
oracle.shift <- oracle(Y = Y, experts = X, loss.type = 'percentage', model = 'shifting')
plot(oracle.shift)
oracle.shift

# Online aggregation of the experts with BOA
#############################################

# Initialize the aggregation rule
m0.BOA <- mixture(model = 'BOA', loss.type = 'square')

# Perform online prediction using BOA There are 3 equivalent possibilities 1)
# start with an empty model and update the model sequentially
m1.BOA <- m0.BOA
for (i in 1:length(Y)) {
 m1.BOA <- predict(m1.BOA, newexperts = X[i, ], newY = Y[i])
}

# 2) perform online prediction directly from the empty model
m2.BOA <- predict(m0.BOA, newexpert = X, newY = Y, online = TRUE)

# 3) perform the online aggregation directly
m3.BOA <- mixture(Y = Y, experts = X, model = 'BOA', loss.type = 'square')

# These predictions are equivalent:
identical(m1.BOA, m2.BOA)  # TRUE
identical(m1.BOA, m3.BOA)  # TRUE

# Display the results
summary(m3.BOA)
plot(m1.BOA)

Example output

Loading required package: nlme
This is mgcv 1.8-20. For overview type 'help("mgcv-package")'.
Warning message:
In bestConvex(Y, experts, awake = awake, loss.type = loss.type,  :
  The best convex oracle is only approximated (using optim).
Call:
oracle.default(Y = Y, experts = X, model = "convex", loss.type = "square")

Coefficients:
   gam    ar
 0.751 0.249

                      rmse   mape
Best expert oracle:   1480 0.0202
Uniform combination:  1480 0.0206
Best convex oracle:   1450 0.0200
Call:
oracle.default(Y = Y, experts = X, model = "shifting", loss.type = "percentage")

      0 shifts 28 shifts 55 shifts 83 shifts 111 shifts
mape:   0.0202    0.0159    0.0154     0.016     0.0209
There were 50 or more warnings (use warnings() to see the first 50)
There were 50 or more warnings (use warnings() to see the first 50)
There were 50 or more warnings (use warnings() to see the first 50)
[1] FALSE
[1] TRUE
Aggregation rule: BOA 
Loss function:  square loss 
Gradient trick:  TRUE 
Coefficients: 
   gam    ar
 0.689 0.311

Number of experts:  2
Number of observations:  112
Dimension of the data:  1 

        rmse   mape
BOA     1470 0.0201
Uniform 1480 0.0206

opera documentation built on May 29, 2017, 10:32 p.m.