title: "BiCausality: Binary Causality Inference Framework"
author: " C. Amornbunchornvej"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{BiCausality_demo}
  %\VignetteEngine{knitr::knitr}
  \usepackage[utf8]{inputenc}
First, we simulate a dataset to use as input.
seedN <- 2022
n <- 200  # 200 individuals
d <- 10   # 10 variables
mat <- matrix(nrow = n, ncol = d)  # the input of the framework

# Simulate binary data from a binomial distribution where the probability of a value being 1 is 0.5.
for (i in seq(n)) {
  set.seed(seedN + i)
  mat[i, ] <- rbinom(n = d, size = 1, prob = 0.5)
}

mat[, 1] <- mat[, 2] | mat[, 3]  # 1 is caused by 2 and 3
mat[, 4] <- mat[, 2] | mat[, 5]  # 4 is caused by 2 and 5
mat[, 6] <- mat[, 1] | mat[, 4]  # 6 is caused by 1 and 4
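Because the data are simulated, the true causal edges are known in advance. As a minimal sketch (the names `Etrue` and `trueEdges` are ours, not part of the package), the ground-truth adjacency matrix implied by the three assignments above can be written down, with entry [i,j] = 1 meaning that i causes j:

```r
d <- 10  # number of variables, as in the simulation above

# Etrue is a hypothetical helper name; Etrue[i, j] = 1 means "i causes j".
Etrue <- matrix(0, nrow = d, ncol = d)
trueEdges <- rbind(
  c(2, 1), c(3, 1),  # 1 is caused by 2 and 3
  c(2, 4), c(5, 4),  # 4 is caused by 2 and 5
  c(1, 6), c(4, 6)   # 6 is caused by 1 and 4
)
Etrue[trueEdges] <- 1  # a two-column matrix indexes (row, column) pairs

sum(Etrue)              # number of true edges
which(Etrue[, 1] == 1)  # direct causes of variable 1
```

Variables 7 to 10 are never modified and feed into nothing, so their rows and columns stay zero. A matrix like this gives a reference against which an inferred graph can be compared.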
We use the following function to infer whether X causes Y.
# Run the function
library(BiCausality)
resC <- BiCausality::CausalGraphInferMainFunc(mat = mat, CausalThs = 0.1, nboot = 50, IndpThs = 0.05)
The adjacency matrix of the inferred directed causal graph is shown below:
resC$CausalGRes$Ehat
A nonzero value in the element Ehat[i,j] indicates that i causes j. For example, Ehat[2,1] = 1 implies that node 2 causes node 1, which is correct since node 1 has nodes 2 and 3 as its causal nodes.
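To read the edges off an adjacency matrix without scanning it by eye, base R's `which(..., arr.ind = TRUE)` can be used. The sketch below uses a small hand-written matrix `E` (a stand-in for illustration, not actual package output) with the same convention, where a nonzero E[i,j] means that i causes j. The same call should apply to `resC$CausalGRes$Ehat`, assuming it is a plain numeric matrix.

```r
# Hypothetical 4-node adjacency matrix, for illustration only.
E <- matrix(0, nrow = 4, ncol = 4)
E[2, 1] <- 1  # 2 causes 1
E[3, 1] <- 1  # 3 causes 1
E[1, 4] <- 1  # 1 causes 4

# Each row of 'edges' is one directed edge: cause -> effect.
edges <- which(E != 0, arr.ind = TRUE)
colnames(edges) <- c("cause", "effect")

# Sort by effect, then by cause, so edges into the same node sit together.
edges <- edges[order(edges[, "effect"], edges[, "cause"]), , drop = FALSE]
print(edges)
```

Listing edges this way makes it easier to check an inferred graph against the relations that were built into the simulation.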
The directed causal graph can also be plotted using the code below.
library(igraph)
net <- graph_from_adjacency_matrix(resC$CausalGRes$Ehat, weighted = NULL)
plot(net, edge.arrow.size = 0.3, vertex.size = 20,
     vertex.color = '#D4C8E9', layout = layout_with_kk)
To see further details on the causal relation between variables 2 and 1, we can use the command below.
**Note:** the odd difference between X and Y, denoted oddDiff(X,Y), is defined as P(X = 1, Y = 1) P(X = 0, Y = 0) − P(X = 0, Y = 1) P(X = 1, Y = 0). If X is directly proportional to Y, then oddDiff(X,Y) is positive, with a maximum of 0.25. If X is the inverse of Y, then oddDiff(X,Y) is negative, with a minimum of -0.25. If X and Y have no association, then oddDiff(X,Y) is close to zero.
resC$CausalGRes$causalInfo[['2,1']]
The output is explained below.
# This value represents the 95% confidence interval of P(Y=1|X=1).
$CDirConfValInv
 2.5% 97.5%
    1     1

# This value represents the 95% confidence interval of |P(Y=1|X=1) - P(X=1|Y=1)|.
$CDirConfInv
     2.5%     97.5%
0.3217322 0.4534494

# This value represents the mean of |P(Y=1|X=1) - P(X=1|Y=1)|.
$CDirmean
[1] 0.3787904

# The test whose null hypothesis is that |P(Y=1|X=1) - P(X=1|Y=1)| is less than
# or equal to the value of the parameter "CausalThs"; the alternative hypothesis
# is that |P(Y=1|X=1) - P(X=1|Y=1)| is greater than "CausalThs".
$testRes2

	Wilcoxon signed rank test with continuity correction

data:  abs(bCausalDirDist)
V = 1275, p-value = 3.893e-10
alternative hypothesis: true location is greater than 0.1

# The test whose null hypothesis is that |oddDiff(X,Y)| is less than or equal to
# the value of the parameter "IndpThs"; the alternative hypothesis is that
# |oddDiff(X,Y)| is greater than "IndpThs".
$testRes1

	Wilcoxon signed rank test with continuity correction

data:  abs(bSignDist)
V = 1275, p-value = 3.894e-10
alternative hypothesis: true location is greater than 0.05

# If the test above rejects the null hypothesis at the significance threshold
# alpha (default alpha=0.05), then sign=1; otherwise, it is zero.
$sign
[1] 1

# This value represents the 95% confidence interval of oddDiff(X,Y).
$SignConfInv
      2.5%      97.5%
0.08670325 0.13693900

# This value represents the mean of oddDiff(X,Y).
$Signmean
[1] 0.1082242
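The quantities in this output can be approximated by hand from the simulated data, which may help in interpreting them. The sketch below is not the package's internal code. It regenerates only the columns needed (X = variable 2, Y = variable 1), estimates |P(Y=1|X=1) - P(X=1|Y=1)| and the signed product difference P(X=1,Y=1)P(X=0,Y=0) - P(X=0,Y=1)P(X=1,Y=0) underlying oddDiff(X,Y), and then mimics the bootstrap and Wilcoxon step under the assumption that each bootstrap replicate resamples individuals with replacement. The helper names `dirDiff` and `oddDiffHat` are ours.

```r
# Regenerate the simulated data exactly as at the start of this vignette.
seedN <- 2022; n <- 200; d <- 10
mat <- matrix(nrow = n, ncol = d)
for (i in seq(n)) {
  set.seed(seedN + i)
  mat[i, ] <- rbinom(n = d, size = 1, prob = 0.5)
}
mat[, 1] <- mat[, 2] | mat[, 3]

X <- mat[, 2]  # candidate cause
Y <- mat[, 1]  # candidate effect

# |P(Y=1|X=1) - P(X=1|Y=1)| estimated from sample frequencies.
dirDiff <- function(x, y) abs(mean(y[x == 1]) - mean(x[y == 1]))

# P(X=1,Y=1)P(X=0,Y=0) - P(X=0,Y=1)P(X=1,Y=0) estimated from sample frequencies.
oddDiffHat <- function(x, y) {
  mean(x == 1 & y == 1) * mean(x == 0 & y == 0) -
    mean(x == 0 & y == 1) * mean(x == 1 & y == 0)
}

dirDiff(X, Y)     # point estimate, compare with $CDirmean
oddDiffHat(X, Y)  # point estimate, compare with $Signmean

# Assumed bootstrap scheme: resample the n individuals with replacement.
set.seed(1)
nboot <- 50
boot <- t(replicate(nboot, {
  idx <- sample(n, replace = TRUE)
  c(dir = dirDiff(X[idx], Y[idx]), odd = oddDiffHat(X[idx], Y[idx]))
}))

quantile(boot[, "dir"], c(0.025, 0.975))  # compare with $CDirConfInv

# One-sided test of H0: location <= 0.1 (the CausalThs value used above).
wilcox.test(boot[, "dir"], mu = 0.1, alternative = "greater")$p.value
```

The numbers from this sketch would be expected to be broadly comparable to the package output, though not identical, since the package's resampling details and random seeds may differ.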