causal_direction: Determine the causal direction between 2 variables

View source: R/causal_direction.R

Description

causal_direction determines the causal direction between 2 variables based on input measurements, assuming that a causal relationship exists and that there are no hidden confounders.

Usage

causal_direction(vec_1, vec_2, continuous_thresh, discrete_thresh)

Arguments

vec_1

Measurements of the first variable

vec_2

Measurements of the second variable

continuous_thresh

Minimum absolute sum magnitude required to re-orient a continuous-continuous pair edge

discrete_thresh

Minimum absolute distance correlation magnitude required to re-orient a discrete-discrete or discrete-continuous pair edge
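
Both thresholds are described as a minimum on the absolute value of a score. A minimal base R sketch of that check follows. The helper name passes_thresh and the score values are hypothetical (they are not part of orientDAG), and the use of >= rather than > is an assumption.

```r
# Hypothetical helper, not part of orientDAG.
# Returns TRUE when the magnitude of a signed score reaches the threshold,
# i.e. when the evidence would be considered strong enough to re-orient an edge.
# Assumption: the comparison is inclusive (>=).
passes_thresh <- function(score, thresh) {
  abs(score) >= thresh
}

passes_thresh(-3.2, thresh = 3)  # large negative score: magnitude passes
passes_thresh(0.5, thresh = 3)   # small score: magnitude does not pass
```

Taking the absolute value means that only the strength of the evidence is compared with the threshold; the sign of the score is left to indicate the direction.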

Details

Depending on how the 2 variables are encoded (each is either numeric or discrete), a specific method is dispatched to determine the causal direction between them. When both variables are continuous, we can use the general correlation measure and several related criteria by calling some0pairs (see also Vinod 2017).

When the 2 variables are discrete, we can use the distance correlation measure by calling dcor (see also Liu and Chan 2016).

When one of the variables is discrete and the other is continuous, we can discretize the continuous variable by calling discretize and then use the method for two discrete variables.
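
The three cases above can be sketched as a small dispatch function in base R. This is an illustration only, not the package's implementation: encoding_of and pick_method are hypothetical helpers, and treating every numeric vector as continuous and everything else (factors, characters, logicals) as discrete is an assumption.

```r
# Hypothetical helper, not part of orientDAG.
# Assumption: numeric vectors are continuous, anything else is discrete.
encoding_of <- function(x) {
  if (is.numeric(x)) "continuous" else "discrete"
}

# Hypothetical helper: names the method the Details section describes
# for a given pair of variables. It does not compute any direction.
pick_method <- function(vec_1, vec_2) {
  enc <- c(encoding_of(vec_1), encoding_of(vec_2))
  if (all(enc == "continuous")) {
    "general correlation criteria"
  } else if (all(enc == "discrete")) {
    "distance correlation"
  } else {
    "discretize the continuous variable, then distance correlation"
  }
}

pick_method(c(1.2, 3.4), c(5.6, 7.8))           # both continuous
pick_method(factor(c("a", "b")), c("x", "y"))   # both discrete
pick_method(factor(c("a", "b")), c(1.5, 2.5))   # mixed pair
```

The mixed case reuses the discrete-discrete method after discretization, so only two scoring methods are needed for the three encodings.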

Value

A string denoting whether vec_1 causes vec_2 or vice versa

Examples

library(orientDAG)
library(dagitty)
library(simMixedDAG)
library(carData)
library(bnlearn)
library(dplyr) # provides %>%, filter and mutate used below

# load dataset and define underlying DAG
data("GSSvocab")
GSSvocab <- GSSvocab %>%
  filter(complete.cases(.)) %>%
  mutate(year = as.numeric(as.character(year)))

true_dag_dagitty <- dagitty("dag{
                            age -> educGroup;
                            age -> nativeBorn;
                            nativeBorn -> ageGroup;
                            nativeBorn -> vocab;
                            educ -> age;
                            educ -> gender;
                            educ -> year;
                            vocab -> gender;
                            vocab -> year
                            }")

# DAG adjacency matrix representation for distance calculations
true_dag <- dagitty_to_adjmatrix(true_dag_dagitty)

# Fit a non-parametric DAG model 
non_param_dag_model <- non_parametric_dag_model(true_dag_dagitty, GSSvocab)

# Generate a dataset from the above model
sim_data <- sim_mixed_dag(non_param_dag_model, N = 20000)

# First pass - estimate DAG using bnlearn::tabu function
est_dag <- tabu(sim_data)
est_dag <- bn_to_adjmatrix(est_dag)
est_dag <- est_dag[
  match(rownames(true_dag), rownames(est_dag)),
  match(colnames(true_dag), colnames(est_dag))
  ]
tabu_dist <- dag_dist(true_dag, est_dag, distance_measure = "sid")
tabu_dist

# Improve on our first pass by re-orienting edges using the orient_dag function

est_dag_orient_dag <- orient_dag(
  adjmatrix = est_dag,
  x = sim_data,
  # continuous pairs re-orientation takes time, so the sample size is kept small
  max_continuous_pairs_sample = 5000
)
orient_dag_dist <- orientDAG::dag_dist(true_dag, est_dag_orient_dag, distance_measure = "sid")
orient_dag_dist

IyarLin/orientDAG documentation built on Jan. 23, 2020, 3:43 a.m.