View source: R/asymmetric_and_casual_Shapley.R
create_marginal_data_cat | R Documentation |
This function generates marginal data for the categorical approach when there are several sampling
steps. This case needs separate treatment because the marginal step must NOT produce feature values
whose combination with the feature values we condition on in S is absent from
categorical.joint_prob_dt. If it does, we cannot progress further in the chain of sampling steps.
For example, let X1, X2, and X3 each take values in (1,2,3), suppose we know X2 = 2, and let the
causal structure be X1 -> X2 -> X3. Assume that
P(X1 = 1, X2 = 2, X3 = 3) = P(X1 = 2, X2 = 2, X3 = 3) = 1/2. Then there is no point in
generating X1 = 3, as we then cannot generate X3.
The solution is to generate only the values that can proceed through the whole
chain of sampling steps. To do that, we have to ensure that the marginal sampling
respects the valid feature coalitions for all sets of conditional features, i.e.,
the features in features_steps_cond_on.
We sample from the valid coalitions using the MARGINAL probabilities.
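The idea can be illustrated with a small base R sketch. It is not the package's implementation; it is only one way to read the description above. The table `joint_prob`, the vector `x_known`, and the other object names are hypothetical stand-ins, and the probabilities are chosen so that X1 = 3 has positive marginal probability but never occurs together with X2 = 2.

```r
# Hypothetical joint probability table (a stand-in for joint_prob_dt).
joint_prob <- data.frame(
  X1 = c(1, 2, 3),
  X2 = c(2, 2, 1),
  X3 = c(3, 3, 1),
  joint_prob = c(0.3, 0.3, 0.4)
)

x_known <- c(X2 = 2)  # feature value(s) we condition on (assumed known)
Sbar <- "X1"          # feature to generate marginal observations for
n_MC_samples <- 5

# Keep only the rows that agree with the known feature values.
keep <- rep(TRUE, nrow(joint_prob))
for (nm in names(x_known)) {
  keep <- keep & joint_prob[[nm]] == x_known[[nm]]
}
valid_values <- unique(joint_prob[keep, Sbar])

# Marginal probabilities of the Sbar feature, computed from the full table.
marginal <- tapply(joint_prob$joint_prob, joint_prob[[Sbar]], sum)

# Restrict the marginal probabilities to the valid values and renormalise.
weights <- marginal[as.character(valid_values)]
weights <- weights / sum(weights)

# Sample indices rather than values to avoid the sample() length-one pitfall.
set.seed(1)
idx <- sample.int(length(valid_values), n_MC_samples, replace = TRUE, prob = weights)
samples <- valid_values[idx]
```

In this sketch a naive marginal draw would produce X1 = 3 with probability 0.4, which is incompatible with X2 = 2; restricting to the valid values means every sampled X1 can continue through the remaining sampling steps.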
create_marginal_data_cat(
n_MC_samples,
x_explain,
Sbar_features,
S_original,
joint_prob_dt
)
n_MC_samples |
Positive integer.
For most approaches, it indicates the maximum number of samples to use in the Monte Carlo integration
of every conditional expectation.
For |
x_explain |
Matrix or data.frame/data.table. Contains the features whose predictions ought to be explained. |
Sbar_features |
Vector of integers containing the indices of the features to generate marginal observations for.
That is, if |
S_original |
Vector of integers containing the feature indices of the original coalition |
For undocumented arguments, see setup_approach.categorical().
Data table of dimension (`n_MC_samples` * `nrow(x_explain)`) × `length(Sbar_features)`
with the sampled observations.
Lars Henry Berge Olsen