Asymmetric Causal Shapley values #395
Closed
… all, but they show which feature combinations that are actually used in the iterative sampling steps in the causal setup.
…index_explixcands.
…ve NULLs and duplicate of integer(0).
…mean of grouped features. Otherwise, we cannot make a beeswarm plot for grouped Shapley values
…lues both in the conditional and in the causal framework
10 tasks
In this PR, we add support for computing asymmetric and/or causal Shapley values. The asymmetric version can use all approaches, while the causal version is limited to the Monte Carlo-based approaches. The implementation is an extension of #273, which was restricted to the `gaussian` approach and the old version of `shapr`, and which was itself adapted from the package CauSHAPley.

Asymmetric Shapley values were proposed by Frye et al. (2020) as a way to incorporate causal knowledge in the real world by restricting the possible permutations of the features when computing the Shapley values to those consistent with a (partial) causal ordering.
Causal Shapley values were proposed by Heskes et al. (2020) as a way to explain the total effect of features on the prediction, taking into account their causal relationships, by adapting the sampling procedure in shapr.
The two ideas can be combined to obtain asymmetric causal Shapley values. For more details, see Heskes et al. (2020).
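To make the permutation restriction behind asymmetric Shapley values concrete, here is a minimal base-R sketch. The helpers `all_permutations()` and `respects_ordering()` are hypothetical names introduced for this illustration and are not part of `shapr`. The sketch assumes features are numbered `1, 2, ...` and that a permutation is consistent with the partial ordering when no feature appears before a feature from an earlier component.

```r
# All permutations of a vector (simple recursive helper, base R only).
all_permutations <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (rest in all_permutations(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], rest)
    }
  }
  out
}

# TRUE if the permutation never places a feature before a feature that
# belongs to an earlier component of the (partial) causal ordering.
respects_ordering <- function(perm, causal_ordering) {
  component_of <- integer(length(perm))
  for (k in seq_along(causal_ordering)) {
    component_of[causal_ordering[[k]]] <- k
  }
  # Component indices must be non-decreasing along the permutation.
  !is.unsorted(component_of[perm])
}

# Toy example (an assumption for illustration): feature 1 precedes 2 and 3.
causal_ordering <- list(1, c(2, 3))
perms <- all_permutations(1:3)
keep <- Filter(function(p) respects_ordering(p, causal_ordering), perms)

length(perms) # all permutations of three features
length(keep)  # permutations consistent with the ordering
```

In this toy setting only the permutations that start with feature 1 survive, which is the sense in which the asymmetric version restricts the permutations used in the Shapley computation.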
Usage (assume `N_features = 7`):

- (Symmetric) conditional Shapley values: `asymmetric = FALSE` (default), `causal_ordering = list(1:7)` (default), and `confounding = FALSE` (default)
- Marginal Shapley values: either 1) the same as above, but with `approach = "independence"`, or 2) `asymmetric = FALSE` (default), `causal_ordering = list(1:7)` (default), and `confounding = TRUE`
- Asymmetric conditional Shapley values with respect to a specific ordering: `asymmetric = TRUE`, `causal_ordering = list(1, c(2, 3), 4:7)`, and `confounding = FALSE` (default)
- Causal Shapley values (compute all coalitions, but chains of sampling steps): `asymmetric = FALSE` (default), `causal_ordering = list(1, c(2, 3), 4:7)`, and `confounding = c(FALSE, TRUE, FALSE)`
- Asymmetric causal Shapley values (compute only coalitions respecting the ordering and chains of sampling steps): `asymmetric = TRUE`, `causal_ordering = list(1, c(2, 3), 4:7)`, and `confounding = c(FALSE, TRUE, FALSE)`
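For the ordering `list(1, c(2, 3), 4:7)` used in the usage examples, the effect of `asymmetric = TRUE` on the set of coalitions can be checked with a small base-R sketch. `coalition_respects_ordering()` is a hypothetical helper written for this illustration, not a `shapr` function. It assumes the rule that a coalition may include a feature only if it also includes every feature in all preceding components.

```r
# TRUE if, for every component the coalition S touches, S also contains
# all features of the preceding components (hypothetical helper).
coalition_respects_ordering <- function(S, causal_ordering) {
  for (k in seq_along(causal_ordering)) {
    if (any(causal_ordering[[k]] %in% S)) {
      ancestors <- unlist(causal_ordering[seq_len(k - 1)])
      if (!all(ancestors %in% S)) return(FALSE)
    }
  }
  TRUE
}

n_features <- 7
causal_ordering <- list(1, c(2, 3), 4:7)

# Enumerate all 2^7 coalitions by reading off the binary digits of 0..127.
all_coalitions <- lapply(0:(2^n_features - 1), function(mask) {
  which((mask %/% 2^(seq_len(n_features) - 1)) %% 2 == 1)
})

kept <- Filter(
  function(S) coalition_respects_ordering(S, causal_ordering),
  all_coalitions
)

length(all_coalitions) # coalitions considered when asymmetric = FALSE
length(kept)           # coalitions that respect the ordering
```

Under this rule, the number of retained coalitions grows with the sizes of the individual components rather than with the total number of features.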
Main differences:

The user now has the option to specify `asymmetric`, `causal_ordering`, and `confounding` in the `explain` function.

The first argument, `asymmetric`, specifies whether to consider all feature combinations/coalitions or only the combinations that respect the (partial) causal ordering given by `causal_ordering`. The second argument, `causal_ordering`, is a list specifying the (partial) causal ordering of the features (groups), e.g., `causal_ordering = list(1:3, 4:5)`, which implies that features one to three are the ancestors of features four and five. The third argument, `confounding`, specifies whether the user assumes that each component is subject to confounding, e.g., `confounding = c(FALSE, TRUE)`. Note that practitioners are responsible for correctly identifying the causal structures.

When the
`causal_ordering` is not `list(1:N_features)`, we have a causal structure in which some coalitions/feature combinations do not respect the order. For example, in the setting above, a combination cannot include (condition on) feature four without also including all of features one to three, as they are feature four's ancestors. If `asymmetric = TRUE`, we only use the combinations that respect the order. If `asymmetric = FALSE`, we use all combinations. Furthermore, generating the MC samples for each valid coalition introduces a chain of sampling steps, which is influenced by the `confounding` argument.

That is, if
`S = {2}`, we would in the first step (assuming `confounding = c(FALSE, TRUE)`) sample `X1, X3 | X2`, and in the second step sample `X4, X5 | X1, X2, X3`. The confounding assumption determines whether the coalition features in the same component are included as conditioning features, as explained by Heskes et al. (2020). See also the examples in `get_S_causal()` for demonstrations of how changing the confounding assumption changes the data generation steps.

To reuse most of the
`shapr` code, we iteratively call `prepare_data()` with different values of `S` to generate the data. This introduces a lot of redundant computation, as we generate `X1, X3, X4, X5 | X2` in the first step but throw away `X4` and `X5`. To generate MC samples only for the relevant features, we would have to rewrite all `prepare_data.approach` functions to also take an `Sbar` argument, as they currently assume that `Sbar` is all features not in `S`.
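The chain of sampling steps and the redundancy described above can be illustrated with a small base-R sketch. `causal_chain()` is a hypothetical helper written for this illustration; it is not `get_S_causal()` and may differ from what `shapr` does internally. It assumes that each step conditions on all features in preceding components, and, when the component is not subject to confounding, also on the coalition features within the current component. The `discarded` column assumes a generator that always samples every feature outside the conditioning set.

```r
# Hypothetical helper: one row per sampling step for coalition S.
#   sample    = features drawn and kept in this step
#   given     = features conditioned on in this step
#   discarded = features also generated (assumption: the generator samples
#               everything outside `given`) but not used from this step
causal_chain <- function(S, causal_ordering, confounding) {
  all_features <- unlist(causal_ordering)
  steps <- list()
  for (k in seq_along(causal_ordering)) {
    component <- causal_ordering[[k]]
    to_sample <- setdiff(component, S)
    if (length(to_sample) == 0) next
    # Always condition on all features in preceding components.
    given <- as.integer(unlist(causal_ordering[seq_len(k - 1)]))
    # Without confounding, also condition on the coalition features
    # that belong to the current component.
    if (!confounding[k]) given <- c(given, intersect(component, S))
    discarded <- setdiff(all_features, c(given, to_sample))
    steps[[length(steps) + 1]] <- data.frame(
      sample = paste(to_sample, collapse = ","),
      given = paste(sort(given), collapse = ","),
      discarded = paste(discarded, collapse = ","),
      stringsAsFactors = FALSE
    )
  }
  do.call(rbind, steps)
}

# The example from the text: S = {2} with two components.
chain <- causal_chain(
  S = 2,
  causal_ordering = list(1:3, 4:5),
  confounding = c(FALSE, TRUE)
)
print(chain)
```

For this example the first row corresponds to sampling `X1, X3 | X2` while `X4` and `X5` are generated and thrown away, and the second row corresponds to sampling `X4, X5 | X1, X2, X3`.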
`independence`, `empirical`, and `ctree` approaches cannot necessarily generate `n_samples` samples but rather weight the samples. It is not obvious how to combine these weights in an iterative sampling process. We solve it by sampling the samples `n_samples` times using the weights. This means that we will have duplicates, which introduces extra computations.

TODO:
- … `O(2^C)`, where C is the number of features in the largest component in the causal ordering.
- … `prepare_data.approach` functions must be rewritten.
- … `?get_S_causal` for some examples of chains. E.g., we can have `c("2|", "3,4|1,2", "5|1,2,3,4")` and `c("1|", "3,4|1,2", "5|1,2,3,4")`. We could then ideally save some time by computing the rest together, to minimize the number of times we have to recompute/model the same conditional distributions (the last step is often identical for all combinations).

References: