In Irstea/chocR: Exploring the temporal CHange of OCcurence of events in multivariate time series

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=16/2.54,
  fig.height=10/2.54
)
library(chocR)
data(res_choc)
data(res_confid)

Introduction

This package carries out a change-of-occurrence analysis on multivariate time series. Given a time series X (for example, daily observations of temperature) and a time series Y (for example, daily observations of river discharge) measured at the same time steps, it can be useful to detect whether some combinations of X and Y become more or less frequent through time (for example, an increasing occurrence of low discharge combined with high temperature means that droughts are becoming more frequent). To carry out the analysis, choc splits the data set into sub-periods (for example, years), each containing several observations. We recommend that the sub-periods have similar durations and similar numbers of observations. A kernel density estimator is then fitted on each sub-period, so that a probability density can be estimated for any point and any period. Densities are estimated for each of the p periods at each point of a regularly spaced grid, so that each grid point has a vector {D1,D2,...,Dp} of probability densities over the p periods. A Kendall rank correlation coefficient is then computed for each of these vectors.
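The steps above can be sketched in a few lines of plain R. This is only an illustration of the principle, not the code used inside choc: it assumes MASS::kde2d with its default normal-reference bandwidth (rather than a selector from ks), a coarse 10 x 10 grid, and two independently simulated variables; all object names are hypothetical.

```r
library(MASS)  # provides kde2d(), a 2D Gaussian kernel density estimator

set.seed(1)
tvar <- rep(1:40, times = 100)                     # period of each observation
X <- rnorm(length(tvar), mean = tvar / 40)         # increasing trend
Y <- rnorm(length(tvar), mean = -0.5 * tvar / 40)  # decreasing trend

lims <- c(range(X), range(Y))  # same grid limits for every period
periods <- sort(unique(tvar))

# dens[i, j, p] = estimated density at grid point (i, j) for period p
dens <- sapply(periods, function(p) {
  kde2d(X[tvar == p], Y[tvar == p], n = 10, lims = lims)$z
}, simplify = "array")

# Kendall rank correlation between density and period at each grid point
tau <- apply(dens, c(1, 2), function(d) cor(d, periods, method = "kendall"))
dim(tau)
```

Each cell of tau summarises the vector {D1,...,Dp} for one grid point: positive values suggest that the corresponding combination of X and Y becomes more frequent over the periods, negative values that it becomes rarer.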

Here is an example using choc on a simulated data set:

library(MASS)
library(ks)
tvar <- rep(1:40,times=100) #time steps (40 periods, 100 observations each)
meansX <- tvar/40 #trend on 1st variable
meansY <- -0.5*tvar/40 #trend on 2nd variable
sigma <- matrix(c(1,.1,.1,1),2,2) #covariance matrix
values <- t(apply(cbind(meansX,meansY),1,function(mu) mvrnorm(1,mu,sigma))) #generate the values
H <- Hpi #choose the default bandwidth selector (Hpi, from ks)
res_choc <- choc(values,H,tvar)

head(res_choc$grid)

The first lines generate an artificial data set, then a method is chosen to select the bandwidth of the kernel estimator. The choc analysis is then carried out. res_choc$grid provides, for a set of grid points, the corresponding Kendall rank correlation coefficient, whose sign indicates whether the temporal trend is positive or negative.

We also provide a way to assess whether these trends are significant, using two different methods. The method "kern" is based on simulations of surrogate data. If we assume that there is no trend in the data, we can fit a kernel on the whole data set and use it to simulate data with the same structure as the original data set (same number of observations per period), then compute the corresponding Kendall coefficients on the surrogate data sets. This generates a distribution of the Kendall coefficient under H0, against which we can assess the significance of the coefficient computed on the observed data set. Be aware that this approach totally ignores any potential autocorrelation (such as seasonality) within periods. To overcome this limitation, we provide a second method, "perm". It is based on permutation: we permute the periods, leading to vectors such as {D20,D12,Dp,...,D1,D40}, and estimate the corresponding Kendall coefficients. Once again, this provides a distribution of the Kendall coefficient under H0 (no trend, so that periods can be permuted).
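To make the permutation idea concrete, here is a minimal base R sketch for a single grid point. It is an assumption-laden illustration rather than the code behind estimate_confidence: the density vector d is made up, and the two-sided comparison against the central 95% of the permuted coefficients is one possible way to read the null distribution.

```r
set.seed(42)
periods <- 1:40

# Hypothetical densities D1..D40 at one grid point, with an upward trend
d <- 0.1 + 0.002 * periods + rnorm(40, sd = 0.02)

# Kendall coefficient on the observed ordering of the periods
tau_obs <- cor(d, periods, method = "kendall")

# Null distribution: shuffle the periods and recompute the coefficient
tau_null <- replicate(1000, cor(sample(d), periods, method = "kendall"))

# Central 95% of the null distribution
bounds <- quantile(tau_null, c(0.025, 0.975))

# The trend is flagged when the observed coefficient falls outside the bounds
significant <- tau_obs < bounds[1] || tau_obs > bounds[2]
significant
```

Because whole periods are shuffled rather than individual observations, whatever structure exists within a period is left untouched, which is the motivation given above for preferring "perm" when within-period autocorrelation is a concern.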

#res_confid <- estimate_confidence(res_choc,"perm",0.95,1000) #be aware that this may take several hours

g<-plot_choc(res_confid)
print(g)

In this example, we used the permutation method with 1000 replicates to estimate the significance of the Kendall coefficients. A function is then provided to plot the results. Here we illustrate a choc analysis in two dimensions, but it can be extended to more dimensions.

The plot shows that the increasing trend in X and the declining trend in Y lead to a rarefaction of events with simultaneously high Y and low X and, conversely, that combinations of high X and low Y tend to become more frequent (in our example on river discharge and temperature, this would correspond to an increase in drought periods).

Irstea/chocR documentation built on June 12, 2022, 8:28 p.m.
