query_distribution: Calculate query distribution
In CausalQueries: Make, Update, and Query Binary Causal Models

query_distribution

R Documentation

Calculate query distribution

Description

Calculates the distribution of a query from a prior or posterior distribution of parameters.

Usage

query_distribution(
  model,
  queries = NULL,
  given = NULL,
  using = "parameters",
  parameters = NULL,
  n_draws = 4000,
  join_by = "|",
  case_level = FALSE,
  query = NULL
)

Arguments

`model`	A `causal_model`. A model object generated by `make_model`.
`queries`	A vector of strings or list of strings specifying queries on potential outcomes such as "Y[X=1] - Y[X=0]". Queries can also indicate conditioning sets by placing a second statement after ':|:', as in "Y[X=1] - Y[X=0] :|: X == 1 & Y == 1". Note that ':|:' is used rather than the traditional conditioning marker '|' to avoid confusion with the logical operator.
`given`	A character vector specifying given conditions for each query. A 'given' is a quoted expression that evaluates to a logical statement. `given` allows the query to be conditioned on either observed or counterfactual distributions. A value of TRUE is interpreted as no conditioning. A given statement can alternatively be provided after ':|:' in the query statement.
`using`	A character. Whether to use `"priors"`, `"posteriors"` or `"parameters"`.
`parameters`	A vector or list of vectors of real numbers in [0,1]. A true parameter vector to be used instead of parameters attached to the model in case `using` specifies `parameters`
`n_draws`	An integer. Number of draws.
`join_by`	A character. The logical operator joining expanded types when a query contains a wildcard (`.`). Can take values `"&"` (logical AND) or `"|"` (logical OR). Defaults to `"|"`.
`case_level`	Logical. If TRUE, estimates the probability of the query for a case.
`query`	Alias for `queries`.
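
As a rough illustration of what `queries`, `given` and `join_by` mean, the base R sketch below evaluates a few queries by hand for the model X -> Y under a fixed parameter vector (the `using = "parameters"` case). It does not call CausalQueries. It assumes that the parameter vector c(.5, .5, .1, .2, .3, .4) is ordered X.0, X.1, Y.00, Y.10, Y.01, Y.11, with Y.ab denoting the type for which Y = a when X = 0 and Y = b when X = 1. All object names are made up for this example.

```r
# Hand computation for X -> Y; no package calls.
# Assumption: parameters are ordered X.0, X.1, Y.00, Y.10, Y.01, Y.11.
p_x1 <- 0.5
p_y  <- c(Y00 = 0.1, Y10 = 0.2, Y01 = 0.3, Y11 = 0.4)

# Potential outcomes of Y for each nodal type
y_x0 <- c(Y00 = 0, Y10 = 1, Y01 = 0, Y11 = 1)  # Y when X is set to 0
y_x1 <- c(Y00 = 0, Y10 = 0, Y01 = 1, Y11 = 1)  # Y when X is set to 1

# "Y[X=1] > Y[X=0]": share of types with a positive effect
pe <- sum(p_y[y_x1 > y_x0])                    # 0.3

# "Y[X=1] - Y[X=0]": average treatment effect
ate <- sum(p_y * (y_x1 - y_x0))                # 0.1

# "Y[X=1] > Y[X=0] :|: X == 1 & Y == 1"
# With X == 1 observed, Y == 1 picks out the types with y_x1 == 1.
p_q_and_g <- p_x1 * sum(p_y[y_x1 > y_x0 & y_x1 == 1])
p_g       <- p_x1 * sum(p_y[y_x1 == 1])
poc <- p_q_and_g / p_g                         # 0.3 / 0.7 = 3/7

# "(Y[X=.] == 1)": the wildcard expands to X = 0 and X = 1
any_y1 <- sum(p_y[y_x0 == 1 | y_x1 == 1])      # joined with "|": 0.9
all_y1 <- sum(p_y[y_x0 == 1 & y_x1 == 1])      # joined with "&": 0.4
```

Under `using = "priors"` or `using = "posteriors"`, the same kind of quantity would be computed once per parameter draw, which is what produces a distribution rather than a single number.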

Value

A data frame in which each column contains draws from the distribution of one of the queries specified in `queries`.
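
Because the return value is a data frame of draws, it can be summarised column by column. The sketch below shows one way to do so in base R. `summarise_draws` is a hypothetical helper, not part of CausalQueries, and `fake_draws` is simulated stand-in data rather than real output.

```r
# Hypothetical helper: mean, sd and quantiles for each column of draws.
summarise_draws <- function(draws, probs = c(0.025, 0.975)) {
  stopifnot(is.data.frame(draws))
  rows <- lapply(draws, function(x) {
    c(mean(x), stats::sd(x), stats::quantile(x, probs, names = FALSE))
  })
  out <- as.data.frame(do.call(rbind, rows))  # one row per query
  names(out) <- c("mean", "sd", paste0("q", probs))
  out
}

# Stand-in for a data frame of draws with two named queries (simulated)
set.seed(1)
fake_draws <- data.frame(PE = rbeta(4000, 2, 6), NE = rbeta(4000, 1, 7))
summarise_draws(fake_draws)
```

The same helper could be applied to the data frame returned by `query_distribution`, assuming every column is numeric.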

Examples

model <- make_model("X -> Y") |>
         set_parameters(c(.5, .5, .1, .2, .3, .4))
 
 # simple queries
 query_distribution(model, query = "(Y[X=1] > Y[X=0])", using = "priors") |>
   head()

 # multiple queries
 query_distribution(model,
     query = list(PE = "(Y[X=1] > Y[X=0])", NE = "(Y[X=1] < Y[X=0])"),
     using = "priors")|>
   head()

 # multiple queries and givens, with ':|:' to identify conditioning distributions
 query_distribution(model,
   query = list(POC = "(Y[X=1] > Y[X=0]) :|: X == 1 & Y == 1",
                Q = "(Y[X=1] < Y[X=0]) :|: (Y[X=1] <= Y[X=0])"),
   using = "priors")|>
   head()

 # multiple queries and givens, using 'given' argument
 query_distribution(model,
   query = list("(Y[X=1] > Y[X=0])", "(Y[X=1] < Y[X=0])"),
   given = list("Y==1", "(Y[X=1] <= Y[X=0])"),
   using = "priors")|>
   head()

 # linear queries
 query_distribution(model, query = "(Y[X=1] - Y[X=0])")


 # Linear query conditional on potential outcomes
 query_distribution(model, query = "(Y[X=1] - Y[X=0]) :|: Y[X=1]==0")

 # Use join_by to amend query interpretation
 query_distribution(model, query = "(Y[X=.] == 1)", join_by = "&")

 # Probability of causation query
 query_distribution(model,
    query = "(Y[X=1] > Y[X=0])",
    given = "X==1 & Y==1",
    using = "priors")  |> head()

 # Case level probability of causation query
 query_distribution(model,
    query = "(Y[X=1] > Y[X=0])",
    given = "X==1 & Y==1",
    case_level = TRUE,
    using = "priors")

 # Query posterior
 update_model(model, make_data(model, n = 3)) |>
   query_distribution(query = "(Y[X=1] - Y[X=0])", using = "posteriors") |>
   head()

 # Case level queries provide the inference for a case, which is a scalar
 # The case level query *updates* on the given information
 # For instance, here we have a model for which we are quite sure that X
 # causes Y but we do not know whether it works through two positive effects
 # or two negative effects. Thus we do not know if M=0 would suggest an
 # effect or no effect

 set.seed(1)
 model <-
   make_model("X -> M -> Y") |>
   update_model(data.frame(X = rep(0:1, 8), Y = rep(0:1, 8)), iter = 10000)

 Q <- "Y[X=1] > Y[X=0]"
 G <- "X==1 & Y==1 & M==1"
 QG <- "(Y[X=1] > Y[X=0]) & (X==1 & Y==1 & M==1)"

 # In this case these are very different:
 query_distribution(model, Q, given = G, using = "posteriors")[[1]] |> mean()
 query_distribution(model, Q, given = G, using = "posteriors",
   case_level = TRUE)

 # These are equivalent:
 # 1. Case level query via function
 query_distribution(model, Q, given = G,
    using = "posteriors", case_level = TRUE)

 # 2. Case level query by hand using Bayes' rule
 query_distribution(
     model,
     list(QG = QG, G = G),
     using = "posteriors") |>
    dplyr::summarize(mean(QG)/mean(G))
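
The gap between the population-level conditional query and the case-level query comes from the order of averaging and dividing. The base R sketch below uses made-up stand-in draws (not output from the model above) in which half of the draws come from a "world" where the given is likely and the effect is common, and half from a world where the given is rare and the effect is uncommon.

```r
# Made-up per-draw probabilities, 500 draws from each of two "worlds"
G  <- rep(c(0.45, 0.05), each = 500)    # P(given) in each draw
QG <- rep(c(0.405, 0.005), each = 500)  # P(query and given) in each draw

# Conditional query averaged over draws: every draw counts equally
avg_of_ratios <- mean(QG / G)           # (0.9 + 0.1) / 2 = 0.5

# Case-level query via Bayes' rule: draws that make the given more
# likely receive more weight
ratio_of_avgs <- mean(QG) / mean(G)     # 0.205 / 0.25 = 0.82
```

In this stand-in example, observing the given makes the first world much more plausible, so the case-level answer moves toward that world's conditional probability of 0.9.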

CausalQueries documentation built on April 3, 2025, 7:46 p.m.