This R package provides an implementation of multivariate extensions of a well-known fractal analysis technique, Detrended Fluctuation Analysis (DFA; Peng et al., 1995). The package aims to extend and complement the original univariate DFA for estimating the scaling properties of nonstationary time series.
Install from CRAN:
```r
install.packages("mvDFA")
```
Alternatively, install the development version from GitHub. This requires the package `devtools`; the following code installs `devtools` if it is not already installed and then installs `mvDFA` from GitHub:
if(!"devtools" %in% installed.packages()) install.packages("devtools") devtools::install_github("jpirmer/mvDFA")
Load the package:
```r
library(mvDFA)
```
Choose a covariance matrix $\Sigma$:
```r
Sigma <- matrix(.5, 4, 4)
diag(Sigma) <- 1
Sigma[3, 4] <- Sigma[4, 3] <- -.3
Sigma
```
Simulate from the covariance matrix with approximate univariate Hurst-exponents $H:=(.2, .5, .7, .9)'$.
```r
set.seed(2023)
X <- simulate_cMTS(N = 10^3, H = c(.2, .5, .7, .9), Sigma = Sigma,
                   simulation_process = "FGN0",
                   decomposition = "chol",
                   cor_increments = FALSE)
head(X)
```
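As a quick sanity check (our own addition, not part of the package workflow), the empirical correlations of the simulated series can be compared to the target matrix $\Sigma$:

```r
# the correlations of the simulated series should roughly match Sigma
round(cor(X), 2)
```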
`simulation_process = "FGN0"` uses `longmemo::simFGN0` to generate independent (fractional) Gaussian processes, which are then mixed using the Cholesky decomposition $D$ of $\Sigma := DD'$. Changing this argument to `simulation_process = "FGN.fft"` uses `longmemo::simFGN.fft` (based on the FFT) to generate the independent processes instead, which are again mixed via the Cholesky decomposition (`decomposition = "chol"`). The differences may be inspected here:
```r
x1 <- simulate_cMTS(N = 3*10^2, H = c(.5), Sigma = as.matrix(1),
                    simulation_process = "FGN0", cor_increments = FALSE)
plot(x1$X1, main = "H = 0.5 and FGN0", type = "l")

x2 <- simulate_cMTS(N = 3*10^2, H = c(.5), Sigma = as.matrix(1),
                    simulation_process = "FGN.fft", cor_increments = FALSE)
plot(x2$X1, main = "H = 0.5 and FGN.fft", type = "l")
```
Note: the true Hurst exponents might be slightly off, since the mixing (weighted sums) of the time series makes them slightly more normal (due to the central limit theorem). Hence, small Hurst exponents ($H<0.5$) tend to be biased upwards, while larger Hurst exponents ($H>0.5$) tend to be biased downwards.
Further, the Cholesky decomposition might influence the generated data differently than another decomposition of $\Sigma$ would. An alternative is the eigenvalue (or singular value) decomposition via `decomposition = "eigen"`. Further research is needed!
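For example, the same series could be simulated with the eigenvalue decomposition instead; a minimal sketch (reusing the arguments from above, only `decomposition` changed; results may differ slightly from the Cholesky-based simulation):

```r
X_eigen <- simulate_cMTS(N = 10^3, H = c(.2, .5, .7, .9), Sigma = Sigma,
                         simulation_process = "FGN0",
                         decomposition = "eigen",
                         cor_increments = FALSE)
head(X_eigen)
```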
Finally, we did not correlate the increments but the whole time series (`cor_increments = FALSE`); if instead the increments should be correlated, use `cor_increments = TRUE`.
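A minimal sketch of simulating with correlated increments instead (assuming all remaining arguments stay as above; the check of the increment correlations is our own illustration):

```r
X_incr <- simulate_cMTS(N = 10^3, H = c(.2, .5, .7, .9), Sigma = Sigma,
                        simulation_process = "FGN0",
                        decomposition = "chol",
                        cor_increments = TRUE)
# with correlated increments, the increment correlations should roughly match Sigma
round(cor(apply(X_incr, 2, diff)), 2)
```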
The `mvDFA` function needs only the time series in long format as a `matrix` or `data.frame` object. Multiple `cores` can be used (the maximum is `parallel::detectCores()`), which reduces computation time drastically for longer time series. The `steps` argument controls the number of window sizes, which are spaced logarithmically. Larger numbers of steps result in more precise estimates of the extended Hurst exponents, but require more computation time. The `degree` option specifies the degree of the polynomial time trend used for detrending.
```r
mvDFA_result <- mvDFA(X = X, steps = 50, cores = 1, degree = 1)
mvDFA_result
```
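For longer time series, more cores can be used; a usage sketch (left commented out, since the number of available cores is machine-dependent):

```r
# use (almost) all available cores to speed up the computation for long series
# mvDFA_result_par <- mvDFA(X = X, steps = 50, cores = parallel::detectCores() - 1, degree = 1)
```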
The slope of the log-log regression of the root mean square fluctuations $RMS$ vs. the window size $s$ gives an estimate of the long-memory coefficient $L$ within the data.
The output element `Ltot` corresponds to the total variance approach, which uses the root mean of the sum of the diagonal elements (the trace) of the covariance matrix $C_{\nu,s}$ of the detrended fluctuation object $Y_{\nu,s}$ within each window to compute the corresponding root mean square fluctuation $RMS_\text{tot}(\nu,s)$, averaged across all subsequences $\nu$ of length $s$. `Lgen` corresponds to the generalized variance approach, which instead uses the root mean of the determinant of $C_{\nu,s}$ to compute $RMS_\text{gen}(\nu,s)$, again averaged across all subsequences $\nu$ of length $s$. Further, we output the corresponding coefficient for all variances and covariances individually (see `Lfull`), averaged across all univariate time series (see `LmeanUni`), and for each univariate time series individually (see `univariate_DFA`).
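To illustrate the two approaches, here is a minimal, schematic sketch of what happens within a single window: the series are detrended, the covariance matrix of the residuals is computed, and its trace (total variance approach) or determinant (generalized variance approach) enters the corresponding fluctuation measure. This is only meant to illustrate the definitions; the exact internals of `mvDFA` (e.g., integration of the series and normalization) may differ.

```r
# schematic illustration for one window of size s (not the package's exact internals)
s   <- 50
Y   <- apply(X, 2, cumsum)            # integrated profile of the multivariate series
win <- Y[1:s, , drop = FALSE]         # first subsequence of length s
tt  <- 1:s
detr <- apply(win, 2, function(y) residuals(lm(y ~ tt)))  # linear detrending (degree = 1)
C <- cov(detr)                        # covariance matrix C_{nu,s} of the detrended window
sqrt(sum(diag(C)))                    # enters RMS_tot (total variance approach)
sqrt(det(C))                          # enters RMS_gen (generalized variance approach)
```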
Further elements hidden in the primary output are the corresponding $R^2$ values per analysis approach:
```r
mvDFA_result[6:10]
```
Further, the RMS objects and the window sizes used can be extracted via the following elements:
```r
mvDFA_result$S
mvDFA_result$RMS_tot
mvDFA_result$RMS_gen
head(mvDFA_result$CovRMS_s)
```
These may be used to compute the Hurst exponents by hand or to display the log-log plots. We present this for the total and the generalized approach:
```r
# total variance approach
df_tot  <- data.frame(mvDFA_result[c("S", "RMS_tot")])
reg_tot <- lm(I(log10(RMS_tot)) ~ 1 + I(log10(S)), data = df_tot)
coef(reg_tot)[2]
mvDFA_result$Ltot
plot(log10(df_tot))
abline(reg_tot)
```
```r
# generalized variance approach
df_gen  <- data.frame(mvDFA_result[c("S", "RMS_gen")])
reg_gen <- lm(I(log10(RMS_gen)) ~ 1 + I(log10(S)), data = df_gen)
coef(reg_gen)[2]/4   # slope divided by the dimension of the time series (here 4)
mvDFA_result$Lgen
plot(log10(df_gen))
abline(reg_gen)
```
Peng, C.-K., Havlin, S., Stanley, H. E., & Goldberger, A. L. (1995). Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series. Chaos, 5(1), 82–87.