Corrected cross-fitting loop #42

sami6mz · 2023-08-29T15:32:39Z

The previous cross-fitting loop was using an array of NumPy arrays, and we suspected it of slowing down the computation, since multiply_robust_efficient is faster than med_dml. This commit optimizes the loop.

In the end, it appears the loop wasn't slowing anything down (med_dml is slower because it calls forest instances more often than multiply_robust does).
The new loop also seems to give slightly more accurate estimates.
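
For reference, here is a minimal sketch of a cross-fitting loop that writes out-of-fold predictions into one preallocated array instead of collecting one array per fold. It is not the code of this PR: the helper name `cross_fit_predict`, the least-squares stand-in for the nuisance model, and the number of folds are all assumptions made for illustration.

```python
import numpy as np


def cross_fit_predict(x, y, n_splits=2, seed=0):
    """Return out-of-fold predictions of y given x (hypothetical helper).

    A plain least-squares model stands in for the real nuisance estimator.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_splits)
    preds = np.empty(n)  # one preallocated array, filled fold by fold
    design = np.column_stack([np.ones(n), x])  # add an intercept column
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        # fit on the training folds only
        coef, *_ = np.linalg.lstsq(design[train_mask], y[train_mask], rcond=None)
        # predict on the held-out fold
        preds[test_idx] = design[test_idx] @ coef
    return preds


rng = np.random.default_rng(42)
x = rng.normal(size=(200, 3))
y = x @ np.array([1.0, -2.0, 0.5]) + 3.0  # noiseless linear outcome
mu_hat = cross_fit_predict(x, y)
```

Each sample is predicted by a model that never saw it during fitting, which is the property a cross-fitting test would need to preserve across rewrites.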

bthirion

LGTM overall, thx !

src/benchmark_mediation.py

bthirion · 2023-08-29T19:16:01Z

+        sumscore3 = np.mean(t * (1 - ptmx) / (ptmx * (1 - ptx)))
+        sumscore4 = np.mean((1 - t) * ptmx / ((1 - ptmx) * ptx))
+        y1m1 = (t / ptx * (y - mu_t1_x)) / sumscore1 + mu_t1_x
+        y0m0 = ((1 - t) / (1 - ptx) * (y - mu_t0_x)) / sumscore2 + mu_t0_x
        y1m0 = (


Are there tests that check whether this snippet computes the right thing?
Such computation blocks would also be better isolated into small functions.
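
One way to act on this suggestion, as a rough sketch: pull the normalized inverse-probability-weighted term into a small pure function that can be tested on its own. The function name `normalized_ipw_outcome` is hypothetical, and taking the normalizer to be the mean of the weights is an assumption, since the definitions of `sumscore1` and `sumscore2` are not shown in the diff.

```python
import numpy as np


def normalized_ipw_outcome(indicator, propensity, y, mu, normalized=True):
    """Augmented IPW pseudo-outcome: weight * (y - mu) / normalizer + mu.

    indicator  : 1 for units in the arm of interest, 0 otherwise
    propensity : estimated probability of being in that arm
    mu         : outcome-model prediction for that arm
    Assumption: the normalizer is the mean of the weights when
    normalized=True, and 1 otherwise.
    """
    weights = indicator / propensity
    normalizer = np.mean(weights) if normalized else 1.0
    return weights * (y - mu) / normalizer + mu


# Usage mirroring the y1m1 / y0m0 lines of the diff (toy inputs)
t = np.array([1, 0, 1, 1, 0])
ptx = np.array([0.6, 0.4, 0.7, 0.5, 0.3])
y = np.array([2.0, 1.0, 3.0, 2.5, 0.5])
mu_t1_x = np.array([1.8, 1.5, 2.9, 2.4, 1.0])
mu_t0_x = np.array([1.0, 0.9, 1.5, 1.2, 0.6])

y1m1 = normalized_ipw_outcome(t, ptx, y, mu_t1_x)
y0m0 = normalized_ipw_outcome(1 - t, 1 - ptx, y, mu_t0_x)
```

Because the function has no side effects, it can be compared directly against the inline expressions on small hand-built inputs.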

sami6mz · 2023-08-30

There aren't any tests; I just made sure it gives similar performance with and without normalization on several datasets. I have declared that as an issue, #44.

bthirion · 2023-08-30

If we want this lib to survive, we need to add a test suite as the highest priority.
See #45

bthirion

lgtm

bthirion

Sorry, let me get back to testing. There should be a unit test that checks that when normalized=True, the numerical result is fine.
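
A test along these lines could look like the sketch below. It checks two properties that hold for mean-normalized inverse-probability weights: the weights average to one, and with a perfect outcome model the result equals the model prediction. The helper `normalized_weights` is hypothetical and is not part of the package; a real test would call the package's estimator with normalized=True instead.

```python
import numpy as np


def normalized_weights(t, ptx):
    """Hypothetical stand-in: treated-arm weights divided by their mean."""
    w = t / ptx
    return w / np.mean(w)


def test_normalized_weights_average_to_one():
    rng = np.random.default_rng(0)
    ptx = rng.uniform(0.2, 0.8, size=500)
    t = rng.binomial(1, ptx)
    w = normalized_weights(t, ptx)
    np.testing.assert_allclose(np.mean(w), 1.0)


def test_perfect_outcome_model_is_unchanged():
    rng = np.random.default_rng(1)
    ptx = rng.uniform(0.2, 0.8, size=500)
    t = rng.binomial(1, ptx)
    y = rng.normal(size=500)
    mu_t1_x = y.copy()  # outcome model with zero residual
    y1m1 = normalized_weights(t, ptx) * (y - mu_t1_x) + mu_t1_x
    np.testing.assert_allclose(y1m1, mu_t1_x)
```

Both functions follow the pytest naming convention, so they would be collected automatically if placed in a test module.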

judithabk6 · 2023-11-22T10:29:29Z

OK, so to recap:

the first commit corrects the way the cross-fitting is done (it is just a rewrite and yields the same results; this should be made explicit with a test, obviously)
the second commit is just documentation

As with #43, only the code in get_estimation is tested, so the package is not properly equipped to check specifically that this PR does not break the code.

Should we merge this as it is, and implement a more specific test suite? @bthirion @houssamzenati

corrected cross-fitting loop

3de1915

bthirion reviewed Aug 29, 2023

sami6mz mentioned this pull request Aug 30, 2023

Check the proper behavior of probability weighting normalization #44

Open

normalization documented

0df129d

bthirion approved these changes Aug 31, 2023

bthirion reviewed Aug 31, 2023

judithabk6 merged commit 5211eae into judithabk6:main Dec 4, 2023
