cbl — R Documentation

Description

This function implements the confounder blanket learner (CBL) algorithm for causal discovery (Watson & Silva, 2022).
Usage

cbl(
  x,
  z,
  s = "lasso",
  B = 50,
  gamma = 0.5,
  maxiter = NULL,
  params = NULL,
  parallel = FALSE,
  ...
)
Arguments

x: Matrix or data frame of foreground variables.

z: Matrix or data frame of background variables.

s: Feature selection method. Includes native support for sparse linear
   regression (s = "lasso") and gradient boosting (s = "boost").
   Alternatively, a user-supplied feature selection subroutine may be passed
   as a function; see Examples.

B: Number of complementary pairs to draw for stability selection. Following
   Shah & Samworth (2013), we recommend leaving this fixed at 50.

gamma: Omission threshold. If either of two foreground variables is omitted
   from the model for the other with frequency gamma or greater, the pair is
   treated as conditionally independent.

maxiter: Maximum number of iterations to loop through if convergence is
   elusive.

params: Optional list of tuning parameters to pass to the boosting
   subroutine when s = "boost".

parallel: Compute the stability selection subroutine in parallel? A backend
   must be registered beforehand, e.g. via doMC.

...: Extra parameters to be passed to the feature selection subroutine.
Details

The CBL algorithm (Watson & Silva, 2022) learns a partial order over
foreground variables x via relations of minimal conditional (in)dependence
with respect to a set of background variables z. The method is sound and
complete with respect to a so-called "lazy oracle", who only answers
independence queries about variable pairs conditioned on the intersection of
their respective non-descendants.

For computational tractability, CBL performs conditional independence tests
via supervised learning with feature selection. The current implementation
includes support for sparse linear models (s = "lasso") and gradient
boosting machines (s = "boost"). For statistical inference, CBL uses
complementary pairs stability selection (Shah & Samworth, 2013), which
bounds the probability of errors of commission.
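To give intuition for the inference step, here is a hedged sketch of the complementary-pairs subsampling idea in base R. This is an illustration only, not the package's internal implementation; `stability_frequencies` and `toy_select` are invented names, and the toy correlation screen stands in for the feature selection subroutine s.

```r
# Sketch of complementary pairs subsampling (Shah & Samworth, 2013).
# select_fn maps (features, outcome) to a 0/1 selection vector, mirroring
# the interface of the feature selection subroutine s.
stability_frequencies <- function(x, y, select_fn, B = 50) {
  n <- nrow(x)
  half_n <- floor(n / 2)
  sel <- matrix(0, nrow = 2 * B, ncol = ncol(x))
  for (b in seq_len(B)) {
    idx <- sample(n)                         # random permutation of rows
    half <- idx[seq_len(half_n)]             # first half of the pair
    comp <- idx[(half_n + 1):(2 * half_n)]   # disjoint complementary half
    sel[2 * b - 1, ] <- select_fn(x[half, , drop = FALSE], y[half])
    sel[2 * b, ]     <- select_fn(x[comp, , drop = FALSE], y[comp])
  }
  colMeans(sel)   # per-feature selection frequency across all subsamples
}

# Toy selection rule: keep features with absolute correlation to y above 0.3
toy_select <- function(x, y) as.numeric(abs(cor(x, y)) > 0.3)

set.seed(1)
x <- matrix(rnorm(200 * 5), ncol = 5)
y <- x[, 1] + rnorm(200)
stability_frequencies(x, y, toy_select)
```

Features that are selected (or omitted) with high frequency across the complementary pairs support the stable (in)dependence conclusions that CBL aggregates into its partial order.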
Value

A square, lower triangular ancestrality matrix. Call this matrix m. If CBL
infers that X_i \prec X_j, then m[j, i] = 1. If CBL infers that
X_i \preceq X_j, then m[j, i] = 0.5. If CBL infers that X_i \sim X_j, then
m[j, i] = 0. Otherwise, m[j, i] = NA.
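To make this coding concrete, the following self-contained snippet decodes a hand-built ancestrality matrix of the kind cbl() returns. The matrix values and the `decode` helper are invented for illustration.

```r
# Hand-built 3x3 lower-triangular ancestrality matrix (illustrative values).
m <- matrix(NA_real_, nrow = 3, ncol = 3,
            dimnames = list(paste0("X", 1:3), paste0("X", 1:3)))
m[2, 1] <- 1     # X_1 \prec X_2
m[3, 1] <- 0.5   # X_1 \preceq X_3
m[3, 2] <- 0     # X_2 \sim X_3

# Translate each lower-triangular entry into a readable relation.
decode <- function(m) {
  for (j in 2:nrow(m)) for (i in seq_len(j - 1)) {
    rel <- switch(as.character(m[j, i]),
                  "1"   = "strictly precedes",
                  "0.5" = "weakly precedes",
                  "0"   = "is causally unordered with",
                  "has an undetermined relation to")   # NA entries
    cat(colnames(m)[i], rel, rownames(m)[j], "\n")
  }
}
decode(m)
```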
References

Watson, D.S. & Silva, R. (2022). Causal discovery under a confounder
blanket. To appear in Proceedings of the 38th Conference on Uncertainty in
Artificial Intelligence. arXiv preprint, 2205.05715.

Shah, R. & Samworth, R. (2013). Variable selection with error control:
Another look at stability selection. J. R. Statist. Soc. B, 75(1):55-80.
Examples

# Load data
data(bipartite)
x <- bipartite$x
z <- bipartite$z

# Set seed
set.seed(123)

# Run CBL
cbl(x, z)

# With user-supplied feature selection subroutine
s_new <- function(x, y) {
  # Fit full model, then reduce via stepwise selection
  df <- data.frame(x, y)
  f_full <- lm(y ~ 0 + ., data = df)
  f_reduced <- step(f_full, trace = 0)
  keep <- names(coef(f_reduced))
  # Return bit vector indicating which features were selected
  out <- ifelse(colnames(x) %in% keep, 1, 0)
  return(out)
}
cbl(x, z, s = s_new)
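The parallel option requires a registered backend before the call. A hedged sketch follows, assuming the doParallel package is installed and that x and z are the bipartite data loaded above; doMC is an alternative backend on Unix-alikes.

```r
# Not run unless the cbl and doParallel packages are installed.
library(doParallel)
registerDoParallel(cores = 2)   # register backend before parallel = TRUE
cbl(x, z, parallel = TRUE)
```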