knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

tsdataleaks

CRAN status](https://CRAN.R-project.org/package=tsdataleaks)

R Package for detecting data leakages in time series forecasting competitions.

Installation

The development version from GitHub with:

install.packages("tsdataleaks")
library(tsdataleaks)

or

# install.packages("devtools")
devtools::install_github("thiyangt/tsdataleaks")
library(tsdataleaks)

Example

To demonstrate the package functions, I created a small data set with 4 time series.

set.seed(2020)
a <- rnorm(15)
d <- rnorm(10)
lst <- list(
  a = a,
  b = c(a[10:15]+rep(8,6), rnorm(10), a[1:5], a[1:5]),
  c = c(rnorm(10), -a[1:5]),
  d = d,
  e = d)

find_dataleaks: Exploit data leaks

library(tsdataleaks)
library(magrittr)
library(tidyverse)
library(viridis)
# h - I assume test period length is 5 and took that as wind size, h.
f1 <- find_dataleaks(lstx = lst, h=5, cutoff=1) 
f1

Interpretation: The first element in the list means the last 5 observations of the time series a correlates with time series b observarion from 2 to 6.

viz_dataleaks: Visualise the data leaks

viz_dataleaks(f1)

reason_dataleaks

Display the reasons for data leaks and evaluate usefulness of data leaks towards the winning of the competition

r1 <- reason_dataleaks(lstx = lst, finddataleaksout = f1, h=5)
r1

A list without naming element

a = rnorm(15)
lst <- list(
  a,
  c(a[10:15], rnorm(10), a[1:5], a[1:5]),
  c(rnorm(10), a[1:5])
)
f1 <- find_dataleaks(lst, h=5)
viz_dataleaks(f1)
reason_dataleaks(lst, f1, h=5)

Application to M-Competition data

M1 Competition - Yearly data

library(Mcomp)
data("M1")
M1Y <- subset(M1, "yearly")
M1Y_x <- lapply(M1Y, function(temp){temp$x})
m1y_f1 <- find_dataleaks(M1Y_x, h=6, cutoff = 1)
m1y_f1
viz_dataleaks(m1y_f1)
reason_dataleaks(M1Y_x, m1y_f1, h=6, ang=90)


thiyangt/tsdataleaks documentation built on Feb. 21, 2024, 3:26 p.m.