
Causal non param #622

Merged
merged 54 commits into from
Feb 25, 2024
Conversation

NathanielF
Contributor

@NathanielF NathanielF commented Jan 6, 2024

Causal inference with propensity scores and Non-parametric methods

This is a WIP showcasing causal inference methods using non-parametric models and propensity score methods, highlighting the use of BART and (if I can get it to work) Dirichlet dependent regression in causal inference.

We will examine two data sets: (1) the NHEFS data set discussed in the What If causal inference book, and (2) a health expenditure data set discussed in the Bayesian nonparametrics for causal inference and missing data book.

This is related to issue #623


@drbenvincent
Contributor

Feel free to tag me for review once this is ready.

@AlexAndorra
Collaborator

Same about review if needed @NathanielF

Signed-off-by: Nathaniel <[email protected]>
@juanitorduz
Collaborator

Me too 😁 ! (Also any help with BART)

@NathanielF NathanielF marked this pull request as ready for review January 16, 2024 23:48
@NathanielF
Contributor Author

NathanielF commented Jan 16, 2024

Hi folks - @drbenvincent , @AlexAndorra , @juanitorduz

Tagging this one as ready for review.

The broad idea of the example was to show how to implement and assess propensity score weighting methods within a Bayesian non-parametric approach to causal inference, using the flexibility of BART to estimate the propensity scores.

This turned out to be quite a rich vein. I cover robust and doubly robust approaches to IPWs and demonstrate their use as "corrective" approaches on two data sets: in (i) they work well and in (ii) they work poorly.

In the case where they work poorly I show how the doubly robust estimator and double ML methods can be used to correct the bias induced by model mis-specification.
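To make the doubly robust idea concrete, here is a minimal illustrative sketch (not the notebook's implementation): simulated data stands in for the real data sets, a deliberately misspecified constant propensity model stands in for a bad propensity fit, and simple linear outcome models stand in for the BART fits. The AIPW estimator stays consistent because the outcome models are correct even though the propensity model is not:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(size=n)                        # single confounder
e_true = 1 / (1 + np.exp(-x))                 # true propensity P(T=1 | X)
t = rng.binomial(1, e_true)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)    # true ATE is 2.0

# Naive group contrast is biased by the confounding through x.
naive = y[t == 1].mean() - y[t == 0].mean()

# Deliberately misspecified propensity model: a constant, ignoring x.
e_hat = np.full(n, t.mean())

# Correctly specified outcome models: linear fits within each arm.
X = np.column_stack([np.ones(n), x])
b1, *_ = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)
b0, *_ = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)
m1, m0 = X @ b1, X @ b0

# AIPW / doubly robust estimate: outcome-model contrast plus
# inverse-probability-weighted residual corrections.
ate_dr = np.mean(
    m1 - m0
    + t * (y - m1) / e_hat
    - (1 - t) * (y - m0) / (1 - e_hat)
)
```

Swapping which model is misspecified (correct propensity, wrong outcome model) would show the other half of the double robustness property.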

Then finally I contrast the propensity score methods in (ii) for getting at the ATE with a mediation-style analysis (basically cribbed from Ben's example here) to show that the correct causal inference approach requires additional structural commitments about the direction of effects.

I'm possibly doing too much here. But both data sets cover the effects of smoking and the difficulties of causal inference, and they have a natural trajectory where I try to lead the reader to the conclusion that naive Double ML approaches (even with the corrective moves) are not sufficient to model the complexity of real-world causal inference. The natural and correct perspective is obviously white-box Bayesian causal models.
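The propensity score weighting move described above can be sketched in a few lines. This is an illustrative stand-in, not the notebook's code: a logistic model fit by gradient ascent substitutes for the BART propensity model, and simulated data substitutes for the NHEFS set:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# One confounder x drives both treatment uptake and the outcome,
# so the naive group difference is biased.
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-1.5 * x))           # true propensity P(T=1 | X)
t = rng.binomial(1, e_true)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)    # true ATE is 2.0

# Estimate the propensity score e(X); logistic regression by gradient
# ascent here, where the notebook would use a flexible BART model.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(1000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (t - p) / n
e_hat = 1 / (1 + np.exp(-X @ beta))

# Naive contrast vs normalised (Hajek) inverse-probability-weighted ATE.
naive = y[t == 1].mean() - y[t == 0].mean()
w1, w0 = t / e_hat, (1 - t) / (1 - e_hat)
ate_ipw = (w1 * y).sum() / w1.sum() - (w0 * y).sum() / w0.sum()
```

On this simulation the naive contrast lands well away from the true effect of 2.0, while the weighted estimate recovers it, which is the "corrective" behaviour case (i) in the notebook is meant to demonstrate.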

Any and all feedback welcome.


review-notebook-app bot commented Jan 17, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-01-17T21:31:28Z
----------------------------------------------------------------

Can we draw the causal DAG as an explicit way of stating our assumptions? We need to comment on the fact that we are not including colliders, for example.


NathanielF commented on 2024-01-21T19:02:44Z
----------------------------------------------------------------

I added a DAG above, but it's a DAG of the propensity score scenario. I wanted to keep the causal structure minimal in this example, as it makes the ending more "dramatic" to show that we can't get away with just assuming strong ignorability but should be clearer about the DAG. I've fleshed out a paragraph discussing how propensity score methods are often seen as an approach to causal inference where we can get away with being non-committal about the DAG... I hope that makes sense?

AlexAndorra commented on 2024-01-28T19:18:39Z
----------------------------------------------------------------

I agree with Juan that the DAG would make this clearer, but now I understand why you didn't include it yet Nathaniel ;)


review-notebook-app bot commented Jan 17, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-01-17T21:31:29Z
----------------------------------------------------------------

Maybe comment on why we use propensity scores when we could simply use a regression model? For example, if we match on the whole set of covariates, then the bias gets higher as the number of covariates increases because of the curse of dimensionality.



review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:53Z
----------------------------------------------------------------

Y(0), and Y(1) are similarly conditionally independent of the treatement T|X

I find this sentence a little bit difficult to parse " This basically means that we require for each strata of the population defined by the covariate profile X that it's as good as random which treatment status an individual adopts after controlling for X. "

What about something like "This means that after controlling for X, any differences in outcomes between the treated and untreated groups can be attributed to the treatment itself rather than confounding variables."

This sentence provides the answer to the question in the title. It could be the first sentence in the section.

With observational data we cannot re-run the assignment mechanism, but we can estimate it and transform our data to proportionally weight the data summaries within each group, so that the analysis is less affected by the over-representation of different strata in each group. This is what we hope to use the propensity scores to achieve.


NathanielF commented on 2024-02-16T19:02:38Z
----------------------------------------------------------------

This is really good. I've re-structured this paragraph with these comments in mind! Thanks


review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:54Z
----------------------------------------------------------------

Describe the pattern


NathanielF commented on 2024-02-16T20:09:47Z
----------------------------------------------------------------

Done


review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:54Z
----------------------------------------------------------------

A more up-to-date reference is https://arxiv.org/abs/2206.03619 and these examples: https://www.pymc.io/projects/bart/en/latest/examples.html

Better move this to the beginning of the notebook, where propensity scores are introduced and motivated.

The thought is that any given stratum in our dataset will be described by a set of covariates. Types of individual will be represented by these covariate profiles - the attribute vector X. The share of observations within our data which are picked out by any given covariate profile represents a bias towards that type of individual. If our treatment status is such that individuals will more or less actively select themselves into the status, then naive comparisons of differences between treatment groups and control groups will be misleading to the degree that we have over-represented types of individual (covariate profiles) in the population.

Randomisation solves this by balancing the covariate profiles across treatment and control groups and ensuring the outcomes are independent of the treatment assignment. But we can't always randomise. Propensity scores are useful because they can help emulate as-if random assignment of treatment status in the sample data through a specific transformation of the observed data.


NathanielF commented on 2024-02-16T20:10:06Z
----------------------------------------------------------------

Reworked this too

Contributor Author

added another plot


View entire conversation on ReviewNB


@NathanielF
Contributor Author

Thanks for approving @aloctavodia !

Signed-off-by: Nathaniel <[email protected]>

review-notebook-app bot commented Feb 24, 2024

View / edit / reply to this conversation on ReviewNB

AlexAndorra commented on 2024-02-24T19:04:52Z
----------------------------------------------------------------

  • But if we don't have good balance we can use propensity scores
  • General comment: sounds a lot like reweighting strategies of pollsters

NathanielF commented on 2024-02-24T19:47:35Z
----------------------------------------------------------------

Clarified this a bit

The balancing tests we've just seen are to show balance of the covariates conditional on the propensity scores. Good demonstrated balance, i.e. mean differences close to zero, suggests that strong ignorability holds and that we can plausibly use the propensity scores:

In an ideal world we would have perfect balance across the treatment groups for each of the covariates, but even approximate balance as we see here is useful. When we have good covariate balance (conditional on the propensity scores) we can then use propensity scores in weighting schemes with models of statistical summaries so as to "correct" the representation of covariate profiles across both groups.
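The usual numeric summary for these balance checks is the standardised mean difference before and after weighting. A minimal sketch, assuming a single simulated covariate with the true propensity used for the weights (the `smd` helper is hypothetical, not from the notebook):

```python
import numpy as np

def smd(x, t, w=None):
    """Standardised mean difference of covariate x between treatment
    groups, optionally under weights w (e.g. inverse-propensity weights)."""
    if w is None:
        w = np.ones_like(x, dtype=float)
    m1 = np.average(x[t == 1], weights=w[t == 1])
    m0 = np.average(x[t == 0], weights=w[t == 0])
    v1 = np.average((x[t == 1] - m1) ** 2, weights=w[t == 1])
    v0 = np.average((x[t == 0] - m0) ** 2, weights=w[t == 0])
    return (m1 - m0) / np.sqrt((v1 + v0) / 2)

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)                 # a single confounding covariate
e = 1 / (1 + np.exp(-x))               # true propensity score, known here
t = rng.binomial(1, e)
w = t / e + (1 - t) / (1 - e)          # inverse-probability weights

raw_smd = smd(x, t)                    # well away from zero: imbalanced
weighted_smd = smd(x, t, w)            # close to zero after weighting
```

A common rule of thumb treats absolute standardised mean differences below roughly 0.1 as acceptable balance.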

NathanielF commented on 2024-02-24T19:49:01Z
----------------------------------------------------------------

It is very close in approach to the re-weighting strategies used by pollsters! I started looking into the strata re-weighting stuff after looking into MrP stuff here: https://bambinos.github.io/bambi/notebooks/mister_p.html


review-notebook-app bot commented Feb 24, 2024

View / edit / reply to this conversation on ReviewNB

AlexAndorra commented on 2024-02-24T19:04:53Z
----------------------------------------------------------------

I don't understand this sentence. Is there a typo?


NathanielF commented on 2024-02-24T19:50:08Z
----------------------------------------------------------------

Yeah, that was a mess.

Rewrote:

It's difficult to see a clear pattern in this visual. In both treatment groups, wherever there is a significant sample size, we see a mean difference close to zero.

Collaborator

@AlexAndorra AlexAndorra left a comment


Just added two comments, but looks good now, thanks a lot @NathanielF !


Signed-off-by: Nathaniel <[email protected]>
@NathanielF
Contributor Author

Addressed those comments @AlexAndorra . I'm happy if you're happy. Feel free to merge!

Thanks again @AlexAndorra , @drbenvincent , @juanitorduz and @aloctavodia . The piece is much stronger for all your feedback!

@AlexAndorra
Collaborator

AlexAndorra commented Feb 25, 2024

Awesome @NathanielF ! Can we merge without the pre-commit build passing though?

@NathanielF
Contributor Author

I think so. That check was never used before and @maresb opened an issue about it here: #638

@maresb
Contributor

maresb commented Feb 25, 2024

Ya, that one pre-commit check is passing, and that's what counts.

We just need someone with admin to disable the broken failing check as per #638.

@AlexAndorra
Collaborator

Ok, merging then 🍾

@AlexAndorra AlexAndorra merged commit 4c63cd3 into pymc-devs:main Feb 25, 2024
2 of 3 checks passed
@NathanielF
Contributor Author

Thanks so much @AlexAndorra !
