
Asymmetric causal Shapley values with adaptive sampling #400

Merged
merged 290 commits into from
Oct 13, 2024

Conversation

@LHBO (Collaborator) commented Oct 4, 2024

Extension of PR #395 that builds on the adaptive sampling introduced in PR #396. As the latter PR completely rewrote all of the main functions in shapr, it was easier to start a new PR than to resolve the many merge conflicts that arose when trying to merge #396 into #395.

In this PR, we add support for computing asymmetric and/or causal Shapley values. The asymmetric version works with all approaches, while the causal version is limited to the Monte Carlo-based approaches. The implementation is an extension of #273 (but that PR was restricted to the gaussian approach and the old version of shapr), which was adapted from the package CauSHAPley.

Asymmetric Shapley values were proposed by Frye et al. (2020) as a way to incorporate causal knowledge in the real world by restricting the possible permutations of the features when computing the Shapley values to those consistent with a (partial) causal ordering.

Causal Shapley values were proposed by Heskes et al. (2020) as a way to explain the total effect of features on the prediction, taking into account their causal relationships, by adapting the sampling procedure in shapr.

The two ideas can be combined to obtain asymmetric causal Shapley values; see Heskes et al. (2020) for more details.

Usage: (Assume N_features = 7)
(Symmetric) Conditional Shapley values: asymmetric = FALSE (default), causal_ordering = list(1:7) (default), and confounding = FALSE (default)

Marginal Shapley values: either 1) the same as above, but set approach = independence, or 2) asymmetric = FALSE (default), causal_ordering = list(1:7) (default), and confounding = TRUE.

Asymmetric conditional Shapley values with respect to a specific ordering: asymmetric = TRUE, causal_ordering = list(1, c(2, 3), 4:7), and confounding = FALSE (default).

Causal Shapley values (compute all coalitions, but with chains of sampling steps): asymmetric = FALSE (default), causal_ordering = list(1, c(2, 3), 4:7), and confounding = c(FALSE, TRUE, FALSE).

Asymmetric Causal Shapley values (compute only coalitions respecting the ordering and chains of sampling steps): asymmetric = TRUE, causal_ordering = list(1, c(2, 3), 4:7), and confounding = c(FALSE, TRUE, FALSE).
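For concreteness, the settings above can be sketched as explain() calls along the following lines. This is an illustrative sketch, not copied from the PR: model, x_explain, x_train, and phi0 are placeholders the user must supply, and the exact argument names are assumed to follow the shapr 1.0.0 interface.

```r
library(shapr)

# Placeholders (assumptions): model, x_explain, x_train, and phi0
# (the reference prediction) must be supplied by the user.

# (Symmetric) conditional Shapley values: all defaults.
ex_cond <- explain(model, x_explain, x_train,
                   approach = "gaussian", phi0 = phi0)

# Asymmetric conditional Shapley values w.r.t. a specific ordering.
ex_asym <- explain(model, x_explain, x_train,
                   approach = "gaussian", phi0 = phi0,
                   asymmetric = TRUE,
                   causal_ordering = list(1, c(2, 3), 4:7))

# Causal Shapley values: all coalitions, but chains of sampling steps.
ex_causal <- explain(model, x_explain, x_train,
                     approach = "gaussian", phi0 = phi0,
                     causal_ordering = list(1, c(2, 3), 4:7),
                     confounding = c(FALSE, TRUE, FALSE))

# Asymmetric causal Shapley values: both restrictions combined.
ex_asym_causal <- explain(model, x_explain, x_train,
                          approach = "gaussian", phi0 = phi0,
                          asymmetric = TRUE,
                          causal_ordering = list(1, c(2, 3), 4:7),
                          confounding = c(FALSE, TRUE, FALSE))
```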

Main differences:
The user can now specify asymmetric, causal_ordering, and confounding in the explain function.

  • asymmetric specifies whether we consider all feature combinations/coalitions, or only the coalitions that respect the (partial) causal ordering given by causal_ordering.
  • causal_ordering is a list specifying the (partial) causal ordering of the features (or groups), e.g., causal_ordering = list(1:3, 4:5), which means that features one to three are the ancestors of features four and five.
  • confounding specifies whether each component is assumed to be subject to confounding, e.g., confounding = c(FALSE, TRUE).

Note that practitioners are responsible for correctly identifying the causal structure.

When causal_ordering is not list(1:N_features), the causal structure implies that some coalitions/feature combinations do not respect the ordering. For example, in the setting above, we cannot have a coalition that includes feature four but not all of features one to three, as they are feature four's ancestors. If asymmetric = TRUE, we only use the coalitions that respect the ordering; if asymmetric = FALSE, we use all coalitions. Furthermore, generating the MC samples for each valid coalition introduces a chain of sampling steps, which the confounding argument influences.
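To illustrate how the ordering restricts the coalition set, the following self-contained sketch enumerates the coalitions that respect the ordering list(1, c(2, 3), 4:7) from the usage examples. The helper respects_ordering() is hypothetical and not part of shapr; it just encodes the rule that a coalition containing a feature from a component must contain all features from the preceding components.

```r
# Hypothetical helper (not shapr's internals): S respects the partial
# ordering if, whenever it contains a feature from component j > 1,
# it contains all features from components 1, ..., j - 1.
respects_ordering <- function(S, ordering) {
  for (j in seq_along(ordering)) {
    if (j > 1 && any(ordering[[j]] %in% S)) {
      ancestors <- unlist(ordering[seq_len(j - 1)])
      if (!all(ancestors %in% S)) return(FALSE)
    }
  }
  TRUE
}

ordering <- list(1, c(2, 3), 4:7)
n_features <- 7

# All 2^7 coalitions, including the empty and grand coalitions.
all_coalitions <- unlist(
  lapply(0:n_features, function(k) combn(n_features, k, simplify = FALSE)),
  recursive = FALSE
)
valid <- Filter(function(S) respects_ordering(S, ordering), all_coalitions)

length(all_coalitions)  # 128 coalitions in total
length(valid)           # 20 coalitions respect the ordering
```

With asymmetric = TRUE, only these 20 coalitions are used; with asymmetric = FALSE, all 128 are used, but the sampling chains are still shaped by the ordering.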

That is, if S = {2} (with causal_ordering = list(1:3, 4:5) and confounding = c(FALSE, TRUE)), we would in the first step sample X1, X3 | X2, and in the second step sample X4, X5 | X1, X2, X3. The confounding argument changes whether the features in the same component are included as conditional features, as explained in Heskes et al. (2020). See also the examples in get_S_causal() for demonstrations of how changing the confounding assumption changes the data generation steps.
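The chain for this example can be written schematically as follows (illustrative pseudocode only, not the shapr internals):

```r
# Chain of sampling steps for S = {2}, ordering list(1:3, 4:5),
# confounding = c(FALSE, TRUE):
#
# Step 1 (component 1, confounding = FALSE):
#   sample X1, X3 | X2            # condition on the in-component feature X2
#   # with confounding = TRUE for this component, we would instead
#   # sample X1, X3 marginally (dropping X2 from the conditioning set)
#
# Step 2 (component 2, confounding = TRUE):
#   sample X4, X5 | X1, X2, X3    # condition on all ancestor features
```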

To reuse most of the shapr code, we iteratively call prepare_data() with different values of S to generate the data. This introduces redundant computations: in the first step above, we generate X1, X3, X4, X5 | X2 but throw away X4 and X5. To only generate MC samples for the relevant features, we would have to rewrite all prepare_data.approach functions to also take an Sbar argument, as they currently assume that Sbar is all features not in S.

The independence, empirical, and ctree approaches do not necessarily generate n_samples samples, but rather weight the samples they produce. How to combine these weights in an iterative sampling process is not obvious. We solve this by resampling n_samples samples using the weights, which means we get duplicates and therefore extra computations.
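The weighted-resampling workaround can be sketched as follows (illustrative only; the data frame and weights are made-up stand-ins for the weighted MC samples an approach like empirical would return):

```r
set.seed(1)

# Hypothetical weighted MC samples from, e.g., the empirical approach:
mc_samples <- data.frame(x1 = rnorm(5), x3 = rnorm(5))
weights <- c(0.4, 0.3, 0.15, 0.1, 0.05)  # sample weights, summing to 1

# Resample n_samples rows with replacement according to the weights,
# yielding an equally weighted sample that typically contains duplicates.
n_samples <- 10
idx <- sample(nrow(mc_samples), n_samples, replace = TRUE, prob = weights)
resampled <- mc_samples[idx, ]
```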

Plot:
Additionally, we have introduced the include_group_feature_means = FALSE argument in plot.shapr and plot_SV_several_approaches, as some plots require feature values, which are not available for group-wise Shapley values. When TRUE, we use the average feature value among the features in each group (for each explicand).
In plot_SV_several_approaches, we also add the index_explicands_sort argument to decide the order of the explicands in the plots.

TODO:

  • Implement exact asymmetric causal feature Shapley values for all Monte Carlo-based approaches.
  • Implement support for non-exact. Need to figure out how to sample the allowed combinations and what weights to give them. I have a function that can create all valid combinations but still grows at O(2^C), where C is the number of features in the largest component in the causal ordering.
  • Implement support for groups. Restrict feature groups to be in the same component in the causal ordering.
  • Ctree is not very fast due to many inputs/MC samples. Can we somehow use the weights to speed it up?
  • Create a small vignette.
  • Make vignette runnable.
  • Add new tests verifying that it works in both the adaptive and regular versions.

FUTURE:

  • Generate only the features in Sbar and not all features not in S (since the union of Sbar and S is not all features). To do this, all prepare_data.approach functions must be rewritten.
  • Be more clever in choosing the combinations to go into the different batches. Combinations that condition on features in the same components have similar chains of sampling steps; often, only the first step is different. See ?get_S_causal for some examples of chains. E.g., we can have c("2|", "3,4|1,2", "5|1,2,3,4") and c("1|", "3,4|1,2", "5|1,2,3,4"). We could then ideally save time by computing the shared steps together to minimize the number of times we recompute/model the same conditional distributions (the last step is often identical for all combinations).
  • The causal versions of the Gaussian and copula approaches could be written faster in C++ by sending the whole chain of sampling steps for each coalition, but then we would no longer have the same structure for all sampling methods.

References:

  • Heskes, T., Sijben, E., Bucur, I. G., & Claassen, T. (2020). Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models. Advances in Neural Information Processing Systems, 33.
  • Frye, C., Rowat, C., & Feige, I. (2020). Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability. Advances in Neural Information Processing Systems, 33.

@martinju (Member) left a comment

Done with the rest of the review. See the last minor comments. Not crucial to change the name.

@martinju (Member) left a comment

Let's say this is good now :-)

@martinju commented

We're aware that the macOS test fails for the categorical approach, where one observation gets a different Shapley value due to a change in the dt_vS value of v(S = {3}). Further debugging is required to figure this out, but it is not straightforward, as it only happens on the GHA macOS runner, not locally on a Mac. We would probably need to add a new test with keep_vS_output = TRUE to figure out more precisely where this occurs.

@martinju martinju merged commit 14128ac into NorskRegnesentral:shapr-1.0.0 Oct 13, 2024
4 of 5 checks passed