
Causal non param #622

Merged
merged 54 commits into from
Feb 25, 2024
Conversation

NathanielF
Contributor

@NathanielF NathanielF commented Jan 6, 2024

Causal inference with propensity scores and Non-parametric methods

This is a WIP showcasing causal inference methods using non-parametric models and propensity score methods, highlighting the use of BART and (if I can get it to work) Dirichlet dependent regression in causal inference.

We will examine two data sets: (1) the NHEFS data set discussed in the What If causal inference book, and (2) a health expenditure data set discussed in the Bayesian nonparametrics for causal inference and missing data book.

This is related to issue #623


@drbenvincent
Contributor

Feel free to tag me for review once this is ready.

@AlexAndorra
Collaborator

Same about review if needed @NathanielF

Signed-off-by: Nathaniel <[email protected]>
@juanitorduz
Collaborator

Me too 😁 ! (Also any help with BART)

@NathanielF NathanielF marked this pull request as ready for review January 16, 2024 23:48
@NathanielF
Contributor Author

NathanielF commented Jan 16, 2024

Hi folks - @drbenvincent , @AlexAndorra , @juanitorduz

Tagging this one as ready for review.

The broad idea of the example was to show how to implement and assess propensity score weighting methods within a Bayesian non-parametric approach to causal inference, using the flexibility of BART to estimate the propensity scores.

This turned out to be quite a rich vein. I cover robust and doubly robust approaches to IPWs and demonstrate their use as "corrective" approaches on two data sets: in (i) they work well and in (ii) they work poorly.

In the case where they work poorly I show how the doubly robust estimator and double ML methods can be used to correct the bias induced by model mis-specification.
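To make the doubly robust idea concrete, here is a minimal illustrative sketch (not the notebook's implementation): simulated data stands in for the real data sets, a deliberately misspecified constant propensity model stands in for a bad propensity fit, and simple linear outcome models stand in for the BART fits. The AIPW estimator stays consistent because the outcome models are correct even though the propensity model is not:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000
x = rng.normal(size=n)                        # single confounder
e_true = 1 / (1 + np.exp(-x))                 # true propensity P(T=1 | X)
t = rng.binomial(1, e_true)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)    # true ATE is 2.0

# Naive group contrast is biased by the confounding through x.
naive = y[t == 1].mean() - y[t == 0].mean()

# Deliberately misspecified propensity model: a constant, ignoring x.
e_hat = np.full(n, t.mean())

# Correctly specified outcome models: linear fits within each arm.
X = np.column_stack([np.ones(n), x])
b1, *_ = np.linalg.lstsq(X[t == 1], y[t == 1], rcond=None)
b0, *_ = np.linalg.lstsq(X[t == 0], y[t == 0], rcond=None)
m1, m0 = X @ b1, X @ b0

# AIPW / doubly robust estimate: outcome-model contrast plus
# inverse-probability-weighted residual corrections.
ate_dr = np.mean(
    m1 - m0
    + t * (y - m1) / e_hat
    - (1 - t) * (y - m0) / (1 - e_hat)
)
```

Swapping which model is misspecified (correct propensity, wrong outcome model) would show the other half of the double robustness property.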

Then finally I contrast the propensity score methods in (ii) for getting at the ATE with a mediation-style analysis (basically cribbed from Ben's example here) to show that the correct causal inference approach requires additional structural commitments about the direction of effects.

I'm possibly doing too much here. But both data sets cover the effects of smoking and the difficulties of causal inference, and they have a natural trajectory where I try to lead the reader to the conclusion that naive Double ML approaches (even with the corrective moves) are not sufficient to model the complexity of real-world causal inference. The natural and correct perspective is obviously white-box Bayesian causal models.
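The propensity score weighting move described above can be sketched in a few lines. This is an illustrative stand-in, not the notebook's code: a logistic model fit by gradient ascent substitutes for the BART propensity model, and simulated data substitutes for the NHEFS set:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# One confounder x drives both treatment uptake and the outcome,
# so the naive group difference is biased.
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-1.5 * x))           # true propensity P(T=1 | X)
t = rng.binomial(1, e_true)
y = 2.0 * t + 3.0 * x + rng.normal(size=n)    # true ATE is 2.0

# Estimate the propensity score e(X); logistic regression by gradient
# ascent here, where the notebook would use a flexible BART model.
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(1000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (t - p) / n
e_hat = 1 / (1 + np.exp(-X @ beta))

# Naive contrast vs normalised (Hajek) inverse-probability-weighted ATE.
naive = y[t == 1].mean() - y[t == 0].mean()
w1, w0 = t / e_hat, (1 - t) / (1 - e_hat)
ate_ipw = (w1 * y).sum() / w1.sum() - (w0 * y).sum() / w0.sum()
```

On this simulation the naive contrast lands well away from the true effect of 2.0, while the weighted estimate recovers it, which is the "corrective" behaviour case (i) in the notebook is meant to demonstrate.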

Any and all feedback welcome.


review-notebook-app bot commented Jan 17, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-01-17T21:31:28Z
----------------------------------------------------------------

Can we draw the causal DAG as an explicit way of stating our assumptions? We need to comment on the fact that we are not including colliders, for example.


NathanielF commented on 2024-01-21T19:02:44Z
----------------------------------------------------------------

I added a DAG above, but it's a DAG of the propensity score scenario. I wanted to keep the causal structure minimal in this example, as it makes the ending more "dramatic" to show that we can't get away with just assuming strong ignorability but should be clearer about the DAG. I've fleshed out a paragraph discussing how propensity score methods are often seen as an approach to causal inference where we can get away with being non-committal about the DAG... I hope that makes sense?

AlexAndorra commented on 2024-01-28T19:18:39Z
----------------------------------------------------------------

I agree with Juan that the DAG would make this clearer, but now I understand why you didn't include it yet Nathaniel ;)


review-notebook-app bot commented Jan 17, 2024

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2024-01-17T21:31:29Z
----------------------------------------------------------------

Maybe comment on why we use propensity scores when we could simply use a regression model? For example, if we match on the whole set of covariates, then the bias gets higher as the number of covariates increases because of the curse of dimensionality.



review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:53Z
----------------------------------------------------------------

Y(0), and Y(1) are similarly conditionally independent of the treatement T|X

I find this sentence a little bit difficult to parse " This basically means that we require for each strata of the population defined by the covariate profile X that it's as good as random which treatment status an individual adopts after controlling for X. "

What about something like "This means that after controlling for X, any differences in outcomes between the treated and untreated groups can be attributed to the treatment itself rather than confounding variables."

This sentence provides the answer to the question in the title. It could be the first sentence in the section.

With observational data we cannot re-run the assignment mechanism, but we can estimate it and transform our data to proportionally weight the data summaries within each group, so that the analysis is less affected by the over-representation of different strata in each group. This is what we hope to use the propensity scores to achieve.


NathanielF commented on 2024-02-16T19:02:38Z
----------------------------------------------------------------

This is really good. I've re-structured this paragraph with these comments in mind! Thanks


review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:54Z
----------------------------------------------------------------

Describe the pattern


NathanielF commented on 2024-02-16T20:09:47Z
----------------------------------------------------------------

Done


review-notebook-app bot commented Feb 15, 2024

View / edit / reply to this conversation on ReviewNB

aloctavodia commented on 2024-02-15T13:57:54Z
----------------------------------------------------------------

A more up-to-date reference is https://arxiv.org/abs/2206.03619 and these examples: https://www.pymc.io/projects/bart/en/latest/examples.html

Better move this to the beginning of the notebook, where propensity scores are introduced and motivated.

The thought is that any given stratum in our dataset will be described by a set of covariates. Types of individual will be represented by these covariate profiles - the attribute vector X. The share of observations within our data which are picked out by any given covariate profile represents a bias towards that type of individual. If our treatment status is such that individuals will more or less actively select themselves into the status, then naive comparisons of differences between treatment groups and control groups will be misleading to the degree that we have over-represented types of individual (covariate profiles) in the population.

Randomisation solves this by balancing the covariate profiles across treatment and control groups and ensuring the outcomes are independent of the treatment assignment. But we can't always randomise. Propensity scores are useful because they can help emulate as-if random assignment of treatment status in the sample data through a specific transformation of the observed data.


NathanielF commented on 2024-02-16T20:10:06Z
----------------------------------------------------------------

Reworked this too

Contributor Author

added another plot


View entire conversation on ReviewNB


@NathanielF
Contributor Author

Thanks for approving @aloctavodia !

Signed-off-by: Nathaniel <[email protected]>

review-notebook-app bot commented Feb 24, 2024

View / edit / reply to this conversation on ReviewNB

AlexAndorra commented on 2024-02-24T19:04:52Z
----------------------------------------------------------------

  • But if we don't have good balance we can use propensity scores
  • General comment: sounds a lot like reweighting strategies of pollsters

NathanielF commented on 2024-02-24T19:47:35Z
----------------------------------------------------------------

Clarified this a bit

The balancing tests we've just seen are to show balance of the covariates conditional on the propensity scores. Good demonstrated balance, i.e. mean differences close to zero, suggests that strong ignorability holds and that we can plausibly use the propensity scores:

In an ideal world we would have perfect balance across the treatment groups for each of the covariates, but even approximate balance as we see here is useful. When we have good covariate balance (conditional on the propensity scores) we can then use propensity scores in weighting schemes with models of statistical summaries so as to "correct" the representation of covariate profiles across both groups.
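The usual numeric summary for these balance checks is the standardised mean difference before and after weighting. A minimal sketch, assuming a single simulated covariate with the true propensity used for the weights (the `smd` helper is hypothetical, not from the notebook):

```python
import numpy as np

def smd(x, t, w=None):
    """Standardised mean difference of covariate x between treatment
    groups, optionally under weights w (e.g. inverse-propensity weights)."""
    if w is None:
        w = np.ones_like(x, dtype=float)
    m1 = np.average(x[t == 1], weights=w[t == 1])
    m0 = np.average(x[t == 0], weights=w[t == 0])
    v1 = np.average((x[t == 1] - m1) ** 2, weights=w[t == 1])
    v0 = np.average((x[t == 0] - m0) ** 2, weights=w[t == 0])
    return (m1 - m0) / np.sqrt((v1 + v0) / 2)

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)                 # a single confounding covariate
e = 1 / (1 + np.exp(-x))               # true propensity score, known here
t = rng.binomial(1, e)
w = t / e + (1 - t) / (1 - e)          # inverse-probability weights

raw_smd = smd(x, t)                    # well away from zero: imbalanced
weighted_smd = smd(x, t, w)            # close to zero after weighting
```

A common rule of thumb treats absolute standardised mean differences below roughly 0.1 as acceptable balance.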

NathanielF commented on 2024-02-24T19:49:01Z
----------------------------------------------------------------

It is very close in approach to the re-weighting strategies used by pollsters! I started looking into the strata re-weighting stuff after looking into MrP stuff here: https://bambinos.github.io/bambi/notebooks/mister_p.html


review-notebook-app bot commented Feb 24, 2024

View / edit / reply to this conversation on ReviewNB

AlexAndorra commented on 2024-02-24T19:04:53Z
----------------------------------------------------------------

I don't understand this sentence. Is there a typo?


NathanielF commented on 2024-02-24T19:50:08Z
----------------------------------------------------------------

Yeah, that was a mess.

Rewrote:

It's difficult to see a clear pattern in this visual. In both treatment groups, wherever there is a significant sample size, we see a mean difference close to zero.

Collaborator

@AlexAndorra AlexAndorra left a comment


Just added two comments, but looks good now, thanks a lot @NathanielF !


Signed-off-by: Nathaniel <[email protected]>
@NathanielF
Contributor Author

Addressed those comments @AlexAndorra . I'm happy if you're happy. Feel free to merge!

Thanks again @AlexAndorra , @drbenvincent , @juanitorduz and @aloctavodia . The piece is much stronger for all your feedback!

@AlexAndorra
Collaborator

AlexAndorra commented Feb 25, 2024

Awesome @NathanielF ! Can we merge without the pre-commit build passing though?

@NathanielF
Contributor Author

I think so. That check was never used before and @maresb opened an issue about it here: #638

@maresb
Contributor

maresb commented Feb 25, 2024

Ya, that one pre-commit check is passing, and that's what counts.

We just need someone with admin to disable the broken failing check as per #638.

@AlexAndorra
Collaborator

Ok, merging then 🍾

@AlexAndorra AlexAndorra merged commit 4c63cd3 into pymc-devs:main Feb 25, 2024
2 of 3 checks passed
@NathanielF
Contributor Author

Thanks so much @AlexAndorra !
