RootDeviance: Root Distance Measure


RootDeviance R Documentation


Description

Given two equidistant time series X and Y defined on the same set of time steps (and therefore of equal length), each represented by a numerical vector, the sum root deviance measures the period-wise square-rooted deviation of X from Y, separately for the periods where X lies above Y and the periods where X lies below Y. This behavior can be described mathematically with the complex square root:

Let E denote the difference series of X and Y: E_t = X_t - Y_t. In period t, X is larger than Y iff (if and only if) E_t > 0. Conversely, X is less than Y in period t iff E_t < 0. The periods where X and Y are equal are the zeros of E.

Applying the complex root function to the time series E yields a complex time series \sqrt{E}. In period t we write \sqrt{E_t} = A_t + i B_t. From the observation above, A_t = \sqrt{E_t} in periods where X is larger than Y, and A_t = 0 otherwise. Analogously, B_t = \sqrt{|E_t|} in periods where E_t is less than zero, i.e. where X is less than Y, and B_t = 0 otherwise.

Summing over all periods yields a complex number a + i b := \sum_{t\in T} A_t + i\sum_{t\in T} B_t with 1-norm \text{SRD}(X,Y) = a+b = \sum_{t\in T} \sqrt{|X_t-Y_t|}. This is the sum root deviance, the sum of the period-wise root deviances, which is zero iff X = Y. From it we define the mean root deviance \text{MRD}(X,Y) := \frac{\text{SRD}(X,Y)}{n}, where n denotes the size of T, i.e. the length of X and Y.
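The construction above can be traced in a few lines of base R. This is only a minimal sketch, not the package's implementation: the vectors x and y are made-up illustrative values, and the variable names are chosen here for readability.

```r
# Two hypothetical series of equal length (illustrative values only).
x <- c(5, 3, 8, 6, 4)
y <- c(1, 7, 8, 5, 4)

e    <- x - y                  # difference series E_t = X_t - Y_t
root <- sqrt(as.complex(e))    # complex root: real where e > 0, imaginary where e < 0

a <- sum(Re(root))             # summed roots over periods with X_t > Y_t
b <- sum(Im(root))             # summed roots over periods with X_t < Y_t

srd <- a + b                   # 1-norm of a + i b, the sum root deviance
mrd <- srd / length(x)         # mean root deviance

print(c(a = a, b = b, SRD = srd, MRD = mrd))
```

The conversion with as.complex() matters: sqrt() on a plain numeric vector returns NaN (with a warning) for negative entries instead of an imaginary value.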

We can extract even more information about the similarity of the two time series from the complex number a+ib, provided it is not 0 (if it is 0, the series are identical). Viewed as a vector in the Euclidean plane equipped with the 1-norm, its length is the sum root deviance SRD(X,Y). Since both components are non-negative, the vector lies in the first (upper right) quadrant, and the angle it encloses with the x-axis is \text{arctan}(\frac{b}{a}). Here we calculate in \overline{\mathbb{R}}: if a is 0 [i.e. in every period t where X and Y differ, we have X_t < Y_t], we set the fraction to \infty and \text{arctan}(\infty) = \frac{\pi}{2}, the limit of the arctangent.

Since \frac{b}{a} lies between 0 and \infty, the arctangent yields a number between 0 and \frac{\pi}{2}, from which we can read off the tendency of X to deviate above or below Y. The point a+ib lies on the diagonal f(x) = x iff a = b, meaning that the rooted deviance to the upper side equals that to the lower side.

To make this easier to read off, we map the angle to the interval [-1,1] by defining \text{bias}(X,Y) := 1 - \frac{4}{\pi}\text{arctan}(\frac{b}{a}). By the observation above, the bias is 0 iff a = b (same rooted error to the upper as to the lower side), 1 iff \text{arctan}(\frac{b}{a}) = 0, i.e. iff b = 0, and -1 iff \text{arctan}(\frac{b}{a}) = \frac{\pi}{2}, i.e. iff a = 0.
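The mapping from the pair (a, b) to the bias can be sketched as follows. The helper root_bias is hypothetical and not part of the package. It uses atan2(b, a), which agrees with arctan(b/a) in the first quadrant and returns pi/2 for a = 0, so the convention for the infinite fraction needs no special case.

```r
# Hypothetical helper (not part of TSAT): bias from the components a, b >= 0.
root_bias <- function(a, b) {
  # The bias is only defined when the series differ, i.e. a + b > 0.
  stopifnot(a >= 0, b >= 0, a + b > 0)
  # atan2(b, a) equals arctan(b / a) here and yields pi/2 when a = 0.
  1 - (4 / pi) * atan2(b, a)
}

root_bias(3, 3)   # balanced deviations: bias is 0 up to rounding
root_bias(2, 0)   # X never below Y: bias is 1
root_bias(0, 2)   # X never above Y: bias is -1 up to rounding
```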

We also consider this complex number in polar form: its Euclidean length is \sqrt{a^2+b^2}, and its angle is encoded by bias(X,Y). This length is returned in the length element of the result.

A special case is that Y is a forecast of X. Here we measure the quality of the forecast in two ways: its tendency (bias) to lie over or under the actual value, and its overall closeness of fit with outliers damped. The square root grows slowly (\sqrt{x}\ll x for large x), which causes this insensitivity to outliers. Minimizing the norm of a+ib, i.e. the SRD, therefore optimizes the overall fit of the forecast (while nearly neglecting outliers). Depending on the desired forecast behavior (undercast or overcast, i.e. the forecast deviating to the lower or upper side), one can also tune parameters so that the forecast strategy tends in that direction. In this application we also call the sum root deviance (SRD) the sum root error (SRE) of the forecast; the same goes for the mean (MRD and MRE).
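The outlier insensitivity can be illustrated by comparing the sum of root errors with the sum of absolute errors on two made-up forecasts. Everything here is an assumption for illustration: the data, the helper names srd and sae, and the choice of the absolute error as the comparison measure.

```r
# Hypothetical helpers for illustration only.
srd <- function(x, y) sum(sqrt(abs(x - y)))   # sum root deviance
sae <- function(x, y) sum(abs(x - y))         # sum of absolute errors

x <- rep(10, 10)                              # actual values

y_outlier <- x
y_outlier[5] <- 110                           # exact except for one miss of 100

y_spread <- x + 4                             # off by 4 in every period

# Absolute errors favor the uniformly-off forecast,
# root errors favor the forecast with a single outlier.
print(c(sae_outlier = sae(x, y_outlier), sae_spread = sae(x, y_spread)))
print(c(srd_outlier = srd(x, y_outlier), srd_spread = srd(x, y_spread)))
```

With these values the ranking of the two forecasts flips between the two measures, which is the sense in which the root error nearly neglects outliers.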

Usage

RootDeviance(x,y,Silent=FALSE)

Arguments

x

[1:n] numerical vector of time series data

y

[1:n] numerical vector of time series data. This may be a forecast of x (the order of x and y determines the sign of the bias).

Silent

If TRUE, no warnings or errors are reported. Default: FALSE.

Details

Computes the quantities described above for the time series represented by x and y.

Value

A list with the following named members

* SRD = SRD(x,y) - Sum of the root errors.

* MRD = MRD(x,y) - Mean of the root errors. (SRD / length(x))

* length = sqrt(a^2 + b^2) - Euclidean length of the complex number a + ib, whose real part a sums the root errors where X is over Y and whose imaginary part b sums the root errors where X is under Y.

* bias = bias(x,y) - A number between -1 and 1. It is 0 iff Y has the same rooted error to the upper side of X as to the lower side. A positive value means Y deviates more to the lower side of X; conversely, a negative value means Y deviates more to the upper side of X.
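A call might look like the sketch below. It assumes that the TSAT package is installed and that RootDeviance is exported from it; the input vectors are made-up values, and the element names are taken from the list above. The call is guarded so that the snippet does nothing when the package is missing.

```r
# Illustrative inputs; y is treated as a forecast of x.
x <- c(5, 3, 8, 6, 4)
y <- c(1, 7, 8, 5, 4)

# Assumption: TSAT is installed and exports RootDeviance().
if (requireNamespace("TSAT", quietly = TRUE)) {
  res <- TSAT::RootDeviance(x, y, Silent = TRUE)
  # Inspect the members documented above.
  str(res[c("SRD", "MRD", "length", "bias")])
}
```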

Author(s)

Julian Maerte

References

Kourentzes, N., Trapero, J. R., & Svetunkov, I.: Measuring the behaviour of experts on demand forecasting: a complex task, 2014.


Mthrun/TSAT documentation built on Feb. 5, 2024, 11:15 p.m.