# query_distribution: Calculate query distribution In CausalQueries: Make, Update, and Query Binary Causal Models

 query_distribution R Documentation

## Calculate query distribution

### Description

Calculates the distribution of a query from a prior or posterior distribution of parameters.

### Usage

``````
query_distribution(
  model,
  queries,
  given = NULL,
  using = "parameters",
  parameters = NULL,
  n_draws = 4000,
  type_distribution = NULL,
  join_by = "|",
  case_level = FALSE,
  query = NULL
)
``````

### Arguments

- `model`: A `causal_model`. A model object generated by `make_model`.
- `queries`: A character vector or list of character vectors specifying queries on potential outcomes, such as `"Y[X=1] - Y[X=0]"`.
- `given`: A character vector specifying givens for each query. A given is a quoted expression that evaluates to a logical statement. `given` allows the query to be conditioned on the *observational* distribution. A value of `TRUE` is interpreted as no conditioning.
- `using`: A character. Whether to use `"priors"`, `"posteriors"`, or `"parameters"`.
- `parameters`: A vector or list of vectors of real numbers in [0,1]. A true parameter vector to be used instead of the parameters attached to the model when `using` is `"parameters"`.
- `n_draws`: An integer. Number of draws.
- `type_distribution`: A numeric vector or list of numeric vectors. If provided, it saves calculation; otherwise it is calculated from the model. It may be based on a prior or a posterior.
- `join_by`: A character. The logical operator joining expanded types when `query` contains a wildcard (`.`). Can take values `"&"` (logical AND) or `"|"` (logical OR). When the query contains a wildcard (`.`) and `join_by` is not specified, it defaults to `"|"`; otherwise it defaults to `NULL`.
- `case_level`: Logical. If `TRUE`, estimates the probability of the query for a case.
- `query`: Alias for `queries`.
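To make the `given` and `join_by` semantics concrete, the following base-R sketch enumerates the causal types of a two-node `X -> Y` model by hand. It does not call CausalQueries. The `types` table, the `p_x` and `p_y` vectors, and the labelling of nodal types (`"ab"` meaning `Y[X=0] = a` and `Y[X=1] = b`) are assumptions made for illustration, not package internals.

```r
# Hand-rolled enumeration of causal types for X -> Y (illustrative only).
# Y0 stands for Y[X=0] and Y1 for Y[X=1].
types <- expand.grid(X = 0:1, Y0 = 0:1, Y1 = 0:1)

# Assumed parameter values; "ab" means Y[X=0] = a and Y[X=1] = b.
p_x <- c(`0` = 0.5, `1` = 0.5)
p_y <- c(`00` = 0.1, `10` = 0.2, `01` = 0.3, `11` = 0.4)

# Probability of each causal type, and the outcome Y it actually produces.
types$prob <- unname(
  p_x[as.character(types$X)] * p_y[paste0(types$Y0, types$Y1)]
)
types$Y <- ifelse(types$X == 1, types$Y1, types$Y0)

# Query "Y[X=1] > Y[X=0]" and given "X==1 & Y==1" as logical vectors.
query <- types$Y1 > types$Y0
given <- types$X == 1 & types$Y == 1

# No conditioning: total probability of the types satisfying the query.
unconditional <- sum(types$prob[query])

# Conditioning: P(query & given) / P(given).
conditional <- sum(types$prob[query & given]) / sum(types$prob[given])

# Wildcard "Y[X=.] == 1" expands to one statement per value of X;
# join_by decides whether the expanded statements are joined by & or |.
joined_and <- sum(types$prob[types$Y0 == 1 & types$Y1 == 1])
joined_or  <- sum(types$prob[types$Y0 == 1 | types$Y1 == 1])

c(unconditional = unconditional, conditional = conditional,
  joined_and = joined_and, joined_or = joined_or)
```

The conditional value differs from the unconditional one because the given removes every type that is inconsistent with observing `X == 1` and `Y == 1`.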

### Value

A `data.frame` whose columns contain draws from the distribution of the potential outcomes specified in `queries`.
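Because the result holds one column per query, ordinary data frame tools can summarise it. The sketch below assumes the CausalQueries package is installed and that the returned columns are numeric. The object name `draws` and the choice of 1000 draws are illustrative.

```r
library(CausalQueries)

set.seed(1)
model <- make_model("X -> Y")

# One column per query, one row per draw from the prior.
draws <- query_distribution(
  model,
  queries = list("Y[X=1] - Y[X=0]", "Y[X=1] > Y[X=0]"),
  using = "priors",
  n_draws = 1000
)

# Summaries per query: mean and a central 95% interval.
colMeans(draws)
apply(draws, 2, quantile, probs = c(0.025, 0.975))
```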

### Examples

``````
library(CausalQueries)

model <- make_model("X -> Y") |>
  set_parameters(c(.5, .5, .1, .2, .3, .4))

# simple queries
query_distribution(model, query = "(Y[X=1] > Y[X=0])", using = "priors") |>
  head()

# multiple queries
query_distribution(model,
  query = list("(Y[X=1] > Y[X=0])", "(Y[X=1] < Y[X=0])"), using = "priors") |>
  head()

# multiple queries and givens
query_distribution(model,
  query = list("(Y[X=1] > Y[X=0])", "(Y[X=1] < Y[X=0])"),
  given = list("Y==1", "(Y[X=1] <= Y[X=0])"),
  using = "priors") |>
  head()

# linear queries
query_distribution(model, query = "(Y[X=1] - Y[X=0])")

# queries conditional on observables
query_distribution(model, query = "(Y[X=1] > Y[X=0])", given = "X==1 & Y ==1")

# Linear query conditional on potential outcomes
query_distribution(model, query = "(Y[X=1] - Y[X=0])", given = "Y[X=1]==0")

# Use join_by to amend query interpretation
query_distribution(model, query = "(Y[X=.] == 1)", join_by = "&")

# Probability of causation query
query_distribution(model,
  query = "(Y[X=1] > Y[X=0])",
  given = "X==1 & Y==1")

# Case level probability of causation query
query_distribution(model,
  query = "(Y[X=1] > Y[X=0])",
  given = "X==1 & Y==1",
  case_level = TRUE,
  using = "priors")

# Query posterior
update_model(model, make_data(model, n = 3)) |>
  query_distribution(query = "(Y[X=1] - Y[X=0])", using = "posteriors") |>
  head()

# Case level queries provide the inference for a case, which is a scalar
# The case level query *updates* on the given information
# For instance, here we have a model for which we are quite sure that X causes Y but we do not
# know whether it works through two positive effects or two negative effects
# Thus we do not know if M=0 would suggest an effect or no effect

set.seed(1)
model <-
  make_model("X -> M -> Y") |>
  update_model(data.frame(X = rep(0:1, 8), Y = rep(0:1, 8)), iter = 10000)

Q <- "Y[X=1] > Y[X=0]"
G <- "X==1 & Y==1 & M==1"
QG <- "(Y[X=1] > Y[X=0]) & (X==1 & Y==1 & M==1)"

# In this case these are very different:
query_distribution(model, Q, given = G, using = "posteriors")[[1]] |> mean()
query_distribution(model, Q, given = G, using = "posteriors",
  case_level = TRUE)

# These are equivalent:
# 1. Case level query via function
query_distribution(model, Q, given = G,
  using = "posteriors", case_level = TRUE)

# 2. Case level query by hand using Bayes
distribution <- query_distribution(
  model, list(QG = QG, G = G), using = "posteriors")

mean(distribution$QG) / mean(distribution$G)

``````
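The gap between the conditional query and its case-level counterpart in the last example comes down to a mean of ratios versus a ratio of means. The base-R sketch below uses made-up draws rather than output from CausalQueries. The vectors `p_G` and `p_QG` are hypothetical stand-ins for the `G` and `QG` columns, split evenly between two regimes that loosely mimic the "two positive effects or two negative effects" uncertainty.

```r
# Hypothetical posterior draws; the numbers are illustrative only.
# Regime A (first 500 draws): G is common and usually comes with Q.
# Regime B (last 500 draws): G is rare and seldom comes with Q.
p_G  <- c(rep(0.40, 500), rep(0.02, 500))   # P(G) in each draw
p_QG <- c(rep(0.36, 500), rep(0.002, 500))  # P(Q & G) in each draw

# Conditional query: P(Q | G) is computed within each draw, then averaged.
mean_of_ratios <- mean(p_QG / p_G)

# Case-level query: Bayes rule applied to the averaged probabilities,
# so draws in which G is more likely carry more weight.
ratio_of_means <- mean(p_QG) / mean(p_G)

c(mean_of_ratios = mean_of_ratios, ratio_of_means = ratio_of_means)
```

Observing `G` in a case is itself evidence about which regime holds, which is why the ratio of means leans toward the regime where `G` is common.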

CausalQueries documentation built on Oct. 20, 2023, 1:06 a.m.