Exploring Pattern Causality: An Intuitive Guide

knitr::opts_chunk$set(
  warning = FALSE,
  collapse = TRUE,
  comment = "#>"
)

Pattern Causality is a novel approach for detecting and analyzing causal relationships within complex systems. It is particularly effective at:

Identifying hidden patterns in time series data: Uncovering underlying regularities in seemingly random data.
Quantifying different types of causality: Distinguishing between positive, negative, and "dark" causal influences, which are often difficult to observe directly.
Providing robust statistical analysis: Ensuring the reliability of causal inferences through rigorous statistical methods.

This algorithm is especially useful in the following domains:

Financial market analysis: Understanding the interdependencies between different assets.
Climate system interactions: Revealing the complex relationships between various factors in climate change.
Medical diagnosis: Assisting in the analysis of disease progression patterns to identify potential causes.
Complex system dynamics: Studying the interactions between components in various complex systems.

Application in Financial Markets

First, we load stock data for Apple (AAPL) and Microsoft (MSFT) from the DJS dataset bundled with the package. You can also import your own data, for example through the Yahoo Finance API.

library(patterncausality)
data(DJS)
head(DJS)

The data(DJS) function loads the Dow Jones stocks dataset, which includes daily stock prices for several companies, including Apple and Microsoft. The head(DJS) function displays the first few rows of the dataset.
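
The Yahoo Finance route mentioned above is not shown in this guide. One possible sketch, assuming the separate quantmod package is installed and a network connection is available (neither is part of patterncausality), uses quantmod::getSymbols. The helper name fetch_close and the date range in the example call are hypothetical.

```r
# Hypothetical helper (not part of patterncausality): download daily closing
# prices from Yahoo Finance with the quantmod package.
fetch_close <- function(symbols, from, to) {
  if (!requireNamespace("quantmod", quietly = TRUE)) {
    stop("Install the 'quantmod' package to download prices.")
  }
  series <- lapply(symbols, function(s) {
    raw <- quantmod::getSymbols(s, src = "yahoo", from = from, to = to,
                                auto.assign = FALSE)
    quantmod::Cl(raw)  # keep only the closing-price column
  })
  prices <- do.call(merge, series)  # align the series on their dates
  data.frame(Date = zoo::index(prices), zoo::coredata(prices))
}

# Example call (requires internet access; the dates are arbitrary):
# prices <- fetch_close(c("AAPL", "MSFT"), from = "2020-01-01", to = "2020-12-31")
# head(prices)
```

The resulting data frame has a Date column followed by one price column per symbol, which is the same general layout as DJS.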

Next, we visualize the stock prices to observe their trends more intuitively.

library(ggplot2)
library(ggthemes)
# Stack both price series into long format so ggplot2 can colour by company
n <- nrow(DJS)
df <- data.frame(
  Date = rep(as.Date(DJS$Date), times = 2),
  Value = c(DJS$Apple, DJS$Microsoft),
  Type = rep(c("Apple", "Microsoft"), each = n)
)
ggplot(df) +
  geom_line(aes(Date, Value, group = Type, colour = Type), linewidth = 0.4) +
  theme_few(base_size = 12) +
  xlab("Time") +
  ylab("Stock Price") +
  theme(
    legend.position = "bottom",
    legend.box.background = element_rect(fill = NA, color = "black", linetype = 1),
    legend.key = element_blank(),
    legend.title = element_blank(),
    legend.background = element_blank(),
    axis.text = element_text(size = rel(0.8)),
    strip.text = element_text(size = rel(0.8))
  ) +
  scale_color_manual(values = c("#DC143C", "#191970"))

Parameter Optimization

To get the best results from pattern causality analysis, the embedding parameters should be tuned to the data. The following code searches for them with the optimalParametersSearch function (this code is not evaluated by default because the search can take a long time).

# Drop the first column (assumed to be Date) so only the price series remain
dataset <- DJS[, -1]
parameter <- optimalParametersSearch(Emax = 5, tauMax = 5, metric = "euclidean", dataset = dataset)
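
To build intuition for what the search ranges over, recall that E is the embedding dimension and tau the time delay (see Parameter Description). The base R sketch below shows a standard delay embedding, in which each row collects E values spaced tau steps apart. It illustrates the general technique only and is not the package's internal code; the function name delay_embed and the toy series are hypothetical.

```r
# Hypothetical helper: build a delay-embedding matrix from a numeric vector.
# Row t holds x[t], x[t + tau], ..., x[t + (E - 1) * tau].
delay_embed <- function(x, E, tau) {
  n_rows <- length(x) - (E - 1) * tau
  if (n_rows < 1) stop("Series too short for this E and tau.")
  # idx[t, k] is the position in x of the k-th coordinate of row t
  idx <- outer(seq_len(n_rows), (seq_len(E) - 1) * tau, "+")
  matrix(x[idx], nrow = n_rows)
}

x <- c(1, 2, 4, 7, 11, 16, 22)   # toy series, illustrative values only
emb <- delay_embed(x, E = 3, tau = 2)
print(emb)
```

Larger values of E or tau consume more of the series per row, so fewer embedded points remain. This is one reason the search is bounded by Emax and tauMax.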

Calculating Causality

After determining the parameters, we can calculate the causality between the two series.

X <- DJS$Apple
Y <- DJS$Microsoft
pc <- pcLightweight(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE)
print(pc)

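The pcLightweight function runs a lightweight pattern causality analysis of X and Y using the given embedding dimension (E), time delay (tau), distance metric, and prediction horizon (h).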

The print(pc) method displays the results of the causality analysis, including the total, positive, negative, and dark causality percentages.

Next, we can visualize the results to better understand the causal relationships using the plot_total and plot_components functions.

plot_total(pc)
plot_components(pc)

The plot_total function visualizes the total causality, and plot_components visualizes the different components of causality (positive, negative, and dark).

Detailed Analysis

For more detailed causality information, use the pcFullDetails function.

X <- DJS$Apple
Y <- DJS$Microsoft
detail <- pcFullDetails(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE)
# Access the causality components
causality_real <- detail$causality_real
causality_pred <- detail$causality_pred
print(causality_pred)

This completes the basic pattern causality workflow.

Parameter Description

The pattern causality functions accept several important parameters:

E: Embedding dimension (integer > 1)
tau: Time delay (integer > 0)
metric: Distance metric ("euclidean", "manhattan", or "maximum")
h: Prediction horizon (integer >= 0)
weighted: If TRUE, uses weighted causality strength calculation
relative: If TRUE, calculates relative changes ((new-old)/old) instead of absolute changes (new-old) in signature space. Default is TRUE.
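
The three metric names listed above coincide with method names accepted by base R's stats::dist. Whether the package calls dist internally is an assumption, so the sketch below is only meant to show how the three distances differ on the same pair of points. The points themselves are illustrative.

```r
# Two points in a 3-dimensional embedded space (illustrative values)
pts <- rbind(a = c(0, 0, 0), b = c(3, 4, 0))

# Distance between the two points under each metric, using base R's dist()
d <- sapply(c("euclidean", "manhattan", "maximum"),
            function(m) as.numeric(dist(pts, method = m)))
print(d)
```

The Euclidean distance is the straight-line length, the Manhattan distance sums the absolute coordinate differences, and the maximum distance keeps only the largest one, so the choice of metric changes which embedded points count as nearest neighbours.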

The combination of weighted = TRUE and relative = TRUE is particularly useful when:

1. You want to analyze percentage changes rather than absolute changes
2. The time series have different scales or units
3. You're interested in the relative importance of changes

# Example with both weighted and relative TRUE
pc_rel_weighted <- pcLightweight(X, Y, E = 3, tau = 2, metric = "euclidean", 
                               h = 1, weighted = TRUE, relative = TRUE)
print(pc_rel_weighted)
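
To see why relative changes help when series differ in scale, the following base R sketch applies the two definitions given for the relative parameter, (new-old)/old and new-old, to the same toy series expressed in two units. The helper names and prices are hypothetical, and the code illustrates the formulas only, not the package's internal signature computation.

```r
# Hypothetical helpers illustrating the two definitions of change
absolute_change <- function(x) diff(x)                # new - old
relative_change <- function(x) diff(x) / head(x, -1)  # (new - old) / old

price_usd <- c(100, 110, 99)       # toy prices in dollars
price_cents <- price_usd * 100     # the same prices in cents

print(absolute_change(price_usd))
print(absolute_change(price_cents))
print(relative_change(price_usd))
print(relative_change(price_cents))

# Relative changes are unaffected by the change of units
same_relative <- isTRUE(all.equal(relative_change(price_usd),
                                  relative_change(price_cents)))
print(same_relative)
```

The absolute changes grow by a factor of 100 when the unit changes, while the relative changes stay the same.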