descriptor: compute descriptor (in D2C: Predicting Causal Direction from Dependency Features)

Description

Computes the vector of descriptors for a pair of candidate nodes: a putative cause (ca) and a putative effect (ef).

Usage

descriptor(D, ca, ef, ns = min(4, NCOL(D) - 2), lin = FALSE, acc = TRUE, struct = TRUE, pq = c(0.1, 0.25, 0.5, 0.75, 0.9), bivariate = FALSE)

Arguments

D : the observed data matrix of size [N,n], where N is the number of samples and n is the number of nodes
ca : node index (1 ≤ ca ≤ n) of the putative cause
ef : node index (1 ≤ ef ≤ n) of the putative effect
ns : size of the Markov blanket
lin : TRUE or FALSE. If TRUE, a linear model is used to assess a dependency; otherwise a local learning algorithm is used
acc : TRUE or FALSE. If TRUE, the accuracy of the regression is used as a descriptor
struct : TRUE or FALSE. If TRUE, the ranking in the Markov blanket is used as a descriptor
pq : a vector of quantiles used to compute the descriptor
bivariate : TRUE or FALSE. If TRUE, the descriptors of the bivariate dependency are included
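The arguments above can be exercised on toy data. The sketch below is illustrative only: the data matrix, the choice of `ca = 1` and `ef = 2`, and the use of `lin = TRUE` are assumptions made for the example, and the call is guarded so that it runs only if the D2C package is installed. The structure of the returned value is not documented on this page, so the example just inspects it with `str()`.

```r
# Toy data (assumption): N = 200 samples of n = 6 variables,
# with a linear dependency of column 2 on column 1.
set.seed(1)
N <- 200
n <- 6
D <- matrix(rnorm(N * n), nrow = N, ncol = n)
D[, 2] <- 0.8 * D[, 1] + 0.5 * rnorm(N)

# Default Markov blanket size, as given in the Usage line.
ns <- min(4, NCOL(D) - 2)

# Call descriptor() only when the D2C package is available.
if (requireNamespace("D2C", quietly = TRUE)) {
  x <- D2C::descriptor(D, ca = 1, ef = 2, ns = ns, lin = TRUE,
                       acc = TRUE, struct = TRUE,
                       pq = c(0.1, 0.25, 0.5, 0.75, 0.9),
                       bivariate = FALSE)
  str(x)  # inspect the returned descriptors
}
```

With six columns, the default `ns` evaluates to `min(4, 6 - 2)`, that is 4.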

Details

This function is the core of the D2C algorithm. Given two candidate nodes (ca, the putative cause, and ef, the putative effect), it first infers from the dataset D the Markov blankets of the variables indexed by ca and ef (MBca and MBef) using the mimr algorithm (Bontempi and Meyer, ICML'10). It then computes a set of (conditional) mutual information terms describing the dependency between the variables ca and ef. These terms are used to create a vector of descriptors. If acc=TRUE, the vector contains the descriptors related to the asymmetric information-theoretic terms described in the paper. If struct=TRUE, the vector contains descriptors related to the positions of the terms of MBef in MBca and vice versa. Estimating the information-theoretic terms requires estimating the dependency between nodes. If lin=TRUE, a linear assumption is made. Otherwise the local learning estimator implemented by the R package lazy is used.
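To make the idea of summarising (conditional) mutual information terms by quantiles more concrete, here is a minimal base-R sketch under a linear-Gaussian assumption. It is not the package's implementation: the helper `gauss_cmi`, the toy variables, and the choice of conditioning on each remaining variable in turn (rather than on Markov blankets inferred by mimr) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of D2C): conditional mutual information
# I(x; y | z) under a linear-Gaussian assumption, via partial correlation.
gauss_cmi <- function(x, y, z = NULL) {
  if (!is.null(z)) {
    z <- as.matrix(z)
    # Remove the linear effect of z from both variables.
    x <- resid(lm(x ~ z))
    y <- resid(lm(y ~ z))
  }
  r <- cor(x, y)
  -0.5 * log(1 - r^2)
}

# Toy data (assumption): ef depends linearly on ca.
set.seed(42)
N <- 500
z1 <- rnorm(N)
ca <- z1 + rnorm(N)
ef <- 2 * ca + rnorm(N)
z2 <- ef + rnorm(N)
D <- cbind(ca = ca, ef = ef, z1 = z1, z2 = z2, noise = rnorm(N))

# Marginal dependency between the candidate pair.
mi_pair <- gauss_cmi(D[, "ca"], D[, "ef"])

# One conditional term per remaining variable, summarised by the pq quantiles.
pq <- c(0.1, 0.25, 0.5, 0.75, 0.9)
others <- setdiff(colnames(D), c("ca", "ef"))
terms <- sapply(others, function(v) gauss_cmi(D[, "ca"], D[, "ef"], D[, v]))
desc <- quantile(terms, probs = pq)
print(desc)
```

The quantile summary gives a fixed-length vector regardless of how many conditioning variables there are, which is one plausible reason for a parameter such as `pq`.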

References

Bontempi G., Flauder M. (2014) From dependency to causality: a machine learning approach. Under submission.

Bontempi G., Meyer P.E. (2010) Causal filter selection in microarray data. ICML'10.

Birattari M., Bontempi G., Bersini H. (1999) Lazy learning meets the recursive least squares algorithm. Advances in Neural Information Processing Systems 11, pp. 375-381. MIT Press.

Bontempi G., Birattari M., Bersini H. (1999) Lazy learning for modeling and control design. International Journal of Control, 72(7/8), pp. 643-658.

D2C documentation built on May 29, 2017, 10:44 a.m.