knitr::opts_chunk$set( warning = FALSE, collapse = TRUE, comment = "#>" )
Pattern Causality is a novel approach for detecting and analyzing causal relationships within complex systems. It is particularly effective at:
This algorithm is especially useful in the following domains:
First, we import stock data for Apple (AAPL) and Microsoft (MSFT). You can also import data using the yahooo API.
library(patterncausality) data(DJS) head(DJS)
The data(DJS)
function loads the Dow Jones stocks dataset, which includes daily stock prices for several companies, including Apple and Microsoft. The head(DJS)
function displays the first few rows of the dataset.
Next, we visualize the stock prices to observe their trends more intuitively.
library(ggplot2) library(ggthemes) df <- data.frame( Date = as.Date(DJS$Date), Value = c( DJS$Apple, DJS$Microsoft ), Type = c( rep("Apple", dim(DJS)[1]), rep("Microsoft", dim(DJS)[1]) ) ) ggplot(df) + geom_line(aes(Date, Value, group = Type, colour = Type), linewidth = 0.4) + theme_few(base_size = 12) + xlab("Time") + ylab("Stock Price") + theme( legend.position = "bottom", legend.box.background = element_rect(fill = NA, color = "black", linetype = 1), legend.key = element_blank(), legend.title = element_blank(), legend.background = element_blank(), axis.text = element_text(size = rel(0.8)), strip.text = element_text(size = rel(0.8)) ) + scale_color_manual(values = c("#DC143C", "#191970"))
To achieve the best results with pattern causality analysis, it's important to find the optimal parameters. The following code demonstrates how to search for these parameters using the optimalParametersSearch
function (note that this code is not evaluated by default due to the time it may take).
dataset <- DJS[,-1] parameter <- optimalParametersSearch(Emax = 5, tauMax = 5, metric = "euclidean", dataset = dataset)
After determining the parameters, we can calculate the causality between the two series.
X <- DJS$Apple Y <- DJS$Microsoft pc <- pcLightweight(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE) print(pc)
The pcLightweight
function performs a lightweight pattern causality analysis.
The print(pc)
method displays the results of the causality analysis, including the total, positive, negative, and dark causality percentages.
Finally, we can visualize the results to better understand the causal relationships using the plot_total
and plot_components
functions.
plot_total(pc) plot_components(pc)
The plot_total
function visualizes the total causality, and plot_components
visualizes the different components of causality (positive, negative, and dark).
For more detailed causality information, use the pcFullDetails
function.
X <- DJS$Apple Y <- DJS$Microsoft detail <- pcFullDetails(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE) # Access the causality components causality_real <- detail$causality_real causality_pred <- detail$causality_pred print(causality_pred)
This completes the entire process of the pattern causality algorithm.
The pattern causality functions accept several important parameters:
E
: Embedding dimension (integer > 1)tau
: Time delay (integer > 0)metric
: Distance metric ("euclidean", "manhattan", or "maximum")h
: Prediction horizon (integer >= 0)weighted
: If TRUE, uses weighted causality strength calculationrelative
: If TRUE, calculates relative changes ((new-old)/old) instead of absolute changes (new-old) in signature space. Default is TRUE.The combination of weighted = TRUE
and relative = TRUE
is particularly useful when:
1. You want to analyze percentage changes rather than absolute changes
2. The time series have different scales or units
3. You're interested in the relative importance of changes
# Example with both weighted and relative TRUE pc_rel_weighted <- pcLightweight(X, Y, E = 3, tau = 2, metric = "euclidean", h = 1, weighted = TRUE, relative = TRUE) print(pc_rel_weighted)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.