This function calculates a set of evaluation statistics for numeric time series predictions, given two vectors: one with the true values of the target variable and the other with the predicted values.
trues
A numeric vector with the true values of the target variable.
preds
A numeric vector with the predicted values of the target variable.
stats
A vector with the names of the evaluation statistics to calculate. Possible values are "mae", "mse", "rmse", "mape", "nmse", "nmae" or "theil". The last three require that the parameter train.y is supplied.
train.y
If the statistics to calculate include "nmse", "nmae" or "theil", this parameter should contain a numeric vector with the values of the target variable on the data used to obtain the model whose performance is being tested.
The evaluation statistics calculated by this function belong to two groups of measures: absolute and relative. The former include "mae", "mse" and "rmse", and are calculated as follows:
"mae": mean absolute error, calculated as sum(|t_i - p_i|)/N, where the t's are the true values, the p's are the predictions, and N is the common length of the two vectors.
"mse": mean squared error, calculated as sum( (t_i - p_i)^2 )/N
"rmse": root mean squared error, calculated as sqrt(mse)
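The three absolute measures above can be reproduced in a few lines of base R. The following is a minimal sketch, not the package's own code; the helper name abs_errors and the sample vectors are hypothetical.

```r
# Hypothetical helper (not the package's source): absolute error measures
# computed directly from the formulas for "mae", "mse" and "rmse".
abs_errors <- function(trues, preds) {
  stopifnot(length(trues) == length(preds))
  err <- trues - preds              # t_i - p_i
  mse <- mean(err^2)                # sum((t_i - p_i)^2) / N
  c(mae  = mean(abs(err)),          # sum(|t_i - p_i|) / N
    mse  = mse,
    rmse = sqrt(mse))
}

# Made-up data, for illustration only
trues <- c(10, 12, 14, 16)
preds <- c(11, 12, 13, 18)
abs_errors(trues, preds)
```

Because "rmse" is the square root of "mse", it is expressed in the same units as the target variable, which usually makes it easier to interpret than "mse".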
The remaining measures ("mape", "nmse", "nmae" and "theil") are relative measures, the last three of which compare the performance of the model with a baseline. They are unit-less and never negative. For "nmse", "nmae" and "theil" the values are expected to lie in the interval [0,1], though scores can occasionally exceed 1, which means that the model is performing worse than the baseline. The baseline used for "nmse" and "nmae" is a constant model that always predicts the average value of the target variable, estimated from the training data (the data used to obtain the model that generated the predictions), which should be given in the parameter train.y. The baseline used for the Theil coefficient ("theil") is the model that predicts for time t+1 the value of the time series at time t, i.e. the last known value. The relative error measure "mape" does not require a baseline: it is the average absolute difference between the true values and the predictions, expressed as a fraction of each true value.
These measures are calculated as follows:
"mape": sum(|(t_i - p_i) / t_i|)/N
"nmse": sum( (t_i - p_i)^2 ) / sum( (t_i - AVG(Y))^2 ), where AVG(Y)
is the average of the values provided in vector train.y
"nmae": sum(|t_i - p_i|) / sum(|t_i - AVG(Y)|)
"theil": sum( (t_i - p_i)^2 ) / sum( (t_i - t_[i-1])^2 ), where t_[i-1] is the last known value of the series when we are trying to forecast the value t_i
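The relative measures can be sketched in base R in the same way. This is an illustration of the formulas above rather than the package's implementation; the helper name rel_errors and the data are hypothetical, and the sketch assumes that the "last known value" for the first test observation is the final element of train.y.

```r
# Hypothetical helper following the formulas above; not the package's source.
rel_errors <- function(trues, preds, train.y) {
  stopifnot(length(trues) == length(preds), length(train.y) > 0)
  err   <- trues - preds
  avg_y <- mean(train.y)            # AVG(Y), estimated on the training data
  # Assumption: the value known before the first test point is the last
  # element of train.y; afterwards it is the previous true value.
  prev  <- c(train.y[length(train.y)], trues[-length(trues)])
  c(mape  = mean(abs(err / trues)),
    nmse  = sum(err^2) / sum((trues - avg_y)^2),
    nmae  = sum(abs(err)) / sum(abs(trues - avg_y)),
    theil = sum(err^2) / sum((trues - prev)^2))
}

# Made-up data, for illustration only
train.y <- c(6, 8, 10)
trues   <- c(10, 12, 14, 16)
preds   <- c(11, 12, 13, 18)
rel_errors(trues, preds, train.y)
```

Note that "mape" divides by each true value, so it is undefined whenever a true value is 0.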
A named vector with the calculated statistics.
If you request "nmse", "nmae" or "theil", you must supply a vector of numeric values through the parameter train.y; otherwise the function will fail with an error message.
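The kind of check described in this note might look like the sketch below. The function name check_stats and the wording of the error message are hypothetical; the package's actual message may differ.

```r
# Hypothetical guard illustrating the requirement described in the note;
# it is not taken from the package.
check_stats <- function(stats, train.y = NULL) {
  needs_train <- c("nmse", "nmae", "theil")
  if (any(stats %in% needs_train) && is.null(train.y)) {
    stop("'train.y' is required for: ",
         paste(intersect(stats, needs_train), collapse = ", "))
  }
  invisible(TRUE)
}

check_stats(c("mae", "rmse"))                  # passes: no baseline needed
res <- try(check_stats(c("mae", "nmse")), silent = TRUE)
inherits(res, "try-error")                     # train.y was not supplied
```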
Luis Torgo ltorgo@dcc.fc.up.pt
Torgo, L. (2010) Data Mining using R: learning with case studies, CRC Press (ISBN: 9781439810187).
http://www.dcc.fc.up.pt/~ltorgo/DataMiningWithR