Investigating the interplay between causal estimators, ML base learners, hyperparameters and model evaluation metrics.
This code accompanies the paper:
D. Machlanski, S. Samothrakis, and P. Clarke, ‘Hyperparameter Tuning and Model Evaluation in Causal Effect Estimation’. arXiv, Mar. 02, 2023. doi: 10.48550/arXiv.2303.01412.
All datasets and results (in progress) are available here.
Follow the steps below (a condensed command-line sketch follows this list).

1. Download the datasets from here and put them under the 'datasets' folder.
2. Prepare the Python environment.
    - Install miniconda.
    - If you intend to run neural networks, run `conda env create -f environment_tf.yml`.
    - Otherwise, you can use the default environment: `conda env create -f environment.yml`.
3. Go to the 'scripts' folder and run `bash paper.sh`. This will run ALL the experiments.
4. Go to the 'analysis' folder.
    - If you want the results in the form of LaTeX tables:
        - Go to `utils.py` and set `RESULTS = 'latex'`.
        - Run `python compare_save_latex.py`.
        - Now you can use `metrics_meta_latex.ipynb`, `correlations_meta_latex.ipynb` and `test_correlations.ipynb`.
    - If you want the results visualised with plots:
        - Use `plot_estimators.ipynb` and `plot_hyperparams.ipynb`.
        - In order to use `plot_metrics.ipynb`, perform these extra steps:
            - Go to `utils.py` and set `RESULTS = 'mean'`.
            - Run `python compare_save_mean.py`.
            - Now you can use `plot_metrics.ipynb`.
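The sketch below condenses the steps above into a single shell session. The environment name `causal-eval` is a placeholder (use the `name:` declared in the yml file you chose), and the `RESULTS` setting still has to be edited in `analysis/utils.py` by hand.

```bash
# Condensed sketch of the workflow above, not a substitute for the steps.
# 'causal-eval' is a placeholder; use the environment name declared in
# environment.yml (or environment_tf.yml if you need neural networks).
conda env create -f environment.yml
conda activate causal-eval

# Step 3: run ALL experiments (may take weeks or months).
cd scripts
bash paper.sh
cd ..

# Step 4: post-process raw results into CSV tables.
# First set RESULTS = 'latex' (tables) or RESULTS = 'mean' (plots) in analysis/utils.py.
cd analysis
python compare_save_latex.py   # or: python compare_save_mean.py
```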
Note that running all the experiments (step 3) may take a LONG time (weeks, likely months). Highly parallelised computing environments are recommended.
It is possible to skip step 3 by downloading our results from here.
It is also possible to skip the `compare_save_xxx.py` scripts, as the most important CSV files obtained as part of the paper are included in this repository.
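If you rely on the bundled CSVs instead of re-running the experiments, you can get a quick overview of what is included straight from the shell. The exact filenames are not listed here, so the glob below is only an assumption about what `analysis/tables` contains.

```bash
# Inspect the result tables bundled with the repository.
ls analysis/tables/
# Preview the first few rows of each CSV (filenames vary; adjust the glob as needed).
head -n 5 analysis/tables/*.csv
```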
The following description explains only the most important files and directories necessary to replicate the paper.
├── environment.yml <- Replicate the environment to run all the scripts.
├── environment_tf.yml <- As above but with Tensorflow (required to run neural networks).
│
├── analysis
│ ├── compare_save_xxx.py <- Post-processes 'results' into CSV files.
│ ├── tables <- CSVs from above are stored here.
│ ├── utils.py <- Important functions used by the `compare_save_xxx.py` scripts (set `RESULTS` here).
│ ├── plot_estimators.ipynb <- Visualise performance of CATE estimators.
│ ├── plot_hyperparams.ipynb <- Visualise performance against types of hyperparameters.
│ ├── plot_metrics.ipynb <- Visualise performance of metrics.
│ ├── test_correlations.ipynb <- Compute correlations between test metrics (e.g., ATE and PEHE).
│ ├── correlations_meta_latex.ipynb <- Compute correlations between validation and test metrics (e.g., MSE and PEHE).
│ └── metrics_meta_latex.ipynb <- Compute all metrics (latex format).
│
├── datasets <- All four datasets go here (IHDP, Jobs, Twins and News).
│
├── helpers <- General helper functions.
│
├── models
│ ├── data <- Models for datasets.
│ ├── estimators <- Implementations of CATE estimators.
│ ├── estimators_tf <- Code for neural networks (TensorFlow).
│ └── scorers <- Implementations of learning-based metrics.
│
├── results
│ ├── metrics <- Conventional, non-learning metrics (MSE, R^2, PEHE, etc.).
│ ├── predictions <- Predicted outcomes and CATEs.
│ ├── scorers <- Predictions of scorers (plugin, matching and rscore).
│ └── scores <- Actual scores (combines 'predictions' and 'scorers').
│
└── scripts
└── paper.sh <- Replicate all experiments from the paper.