Scalable Geometrical Generative Models | SGGM

An honest attempt at making uncertainty prediction with neural networks reliable and principled.

Description

Repository Organisation

The repository is organised as a package, with the code located under sggm/.

The package metadata is declared in the setup.py file. General settings are grouped in the setup.cfg file, which mainly contains the testing (pytest) and linting (flake8) configuration. The dependencies/ folder holds the built geoml package, as it is not currently released on a public package index such as PyPI.

A simple test, tests/test_toy.py, is run after each commit on master through GitHub Actions and provides a sanity check on the code. It mainly flags high-level problems, such as dependency issues.

The files run.sh and verify_run.py are used to execute training jobs on a cluster that uses the bsub queue system, such as DTU's HPC (link #1, link #2). Their usage is detailed in the Running on a Cluster section.

The Code

The project relies on PyTorch Lightning to handle the nitty-gritty engineering details. The execution logic is located in the experiment.py file, while the modelling logic is located in the respective *_model.py file. The experiment file accepts parameters either directly through the CLI or through a config file (see configs/*.yml for inspiration). In both cases the parameters combine the custom arguments defined in experiment.py, the model-specific arguments defined in definitions.py, and those predefined for the PyTorch Lightning Trainer.

Specifying the run parameters in a config file is the recommended way to proceed, and makes launching an experiment straightforward:

python experiment.py --experiments_config configs/config_file.yml
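
To picture how config file values and CLI flags could coexist, here is a minimal sketch using only argparse. It is not taken from experiment.py: apart from --experiments_config, the argument names (--experiment_name, --max_epochs) are illustrative, the config is shown as an already-loaded dict rather than a parsed YAML file, and the precedence (explicit CLI flags win over config values, which win over parser defaults) is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical arguments for illustration only; the real ones are defined
    # in experiment.py, definitions.py and the Trainer.
    parser = argparse.ArgumentParser()
    parser.add_argument("--experiments_config", type=str, default=None)
    parser.add_argument("--experiment_name", type=str, default="toy")
    parser.add_argument("--max_epochs", type=int, default=10)
    return parser


def resolve_args(config: dict, cli_args: list) -> argparse.Namespace:
    # Assumed precedence: parser defaults, then config values, then CLI flags.
    parser = build_parser()
    parser.set_defaults(**config)  # config values replace the parser defaults
    return parser.parse_args(cli_args)  # flags given explicitly override both


# Stand-in for the content of a loaded config file.
config = {"experiment_name": "uci_example", "max_epochs": 200}
args = resolve_args(config, ["--max_epochs", "5"])
print(args.experiment_name, args.max_epochs)
```

Whether the actual code lets CLI flags override a config file is not documented here, so check experiment.py before relying on this ordering.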

Running on a Cluster

Once you are happy with your config file, running a complete GPU-accelerated experiment is fully automated. If you have access to a computing cluster that uses the bsub queue system, specify the correct config file in the run.sh executable and submit it. The verify_run.py script lets you check the experiment name and run names specified in a given config file. It is recommended to run it before submitting jobs to confirm that the config file is valid.
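
As a rough idea of what a pre-submission check can look like, the sketch below validates a list of run entries. It is not the code of verify_run.py: the entry layout and the keys experiment_name and name are hypothetical, and treating duplicate runs as an error is an assumption.

```python
def verify_runs(experiments: list) -> list:
    """Return a list of problems found; an empty list means the entries look valid."""
    problems = []
    seen = set()
    for index, run in enumerate(experiments):
        # Hypothetical keys; adapt them to the actual config layout.
        experiment_name = run.get("experiment_name")
        run_name = run.get("name")
        if not experiment_name:
            problems.append(f"entry {index}: missing experiment_name")
        if not run_name:
            problems.append(f"entry {index}: missing name")
        elif (experiment_name, run_name) in seen:
            problems.append(f"entry {index}: duplicate run {experiment_name}/{run_name}")
        seen.add((experiment_name, run_name))
    return problems


runs = [
    {"experiment_name": "uci_example", "name": "baseline"},
    {"experiment_name": "uci_example", "name": "baseline"},
    {"name": "orphan"},
]
for problem in verify_runs(runs):
    print(problem)
```

Running such a check locally is cheap compared with a job that fails after waiting in the queue.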

Analysis

The analysis/ folder holds a variety of analysis scripts, which are meant to be adapted freely to the task at hand.

  • analysis/run.py: Main entry point to extract the analysis metrics and generate the plots specific to each dataset.
  • analysis/run_ood.py: Run analysis plots on inputs coming from a different dataset than the one used for training.
  • analysis/run_uci.py: Shortcut to run the analysis of all UCI experiments at once.
  • analysis/run_clustering.py: Run clustering on the latent encodings of the test dataset in a VAE setting.
  • analysis/compare.py: Generate a comparison CSV of the analysis metrics for several experiments at once.
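
To illustrate the kind of output a comparison script can produce, the sketch below lays out per-experiment metrics as one CSV row per experiment. It is not the code of analysis/compare.py: the function name, the metric names and the values are placeholders chosen for the example.

```python
import csv
import io


def comparison_csv(metrics_by_experiment: dict) -> str:
    # Use the union of metric names so every experiment gets the same columns.
    metric_names = sorted(
        {name for metrics in metrics_by_experiment.values() for name in metrics}
    )
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["experiment"] + metric_names,
        restval="",  # leave the cell empty when a metric is missing
        lineterminator="\n",
    )
    writer.writeheader()
    for experiment, metrics in metrics_by_experiment.items():
        writer.writerow({"experiment": experiment, **metrics})
    return buffer.getvalue()


# Placeholder metrics, not real results.
example = {"run_a": {"rmse": 1.0}, "run_b": {"rmse": 2.0, "nll": 0.5}}
print(comparison_csv(example))
```

Taking the union of metric names keeps the table rectangular even when experiments report different metrics.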

Refitting the Encoder

When the encoder is refitted, run_name becomes run_name_refit_encoder_$other_experiment_name.
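
The renaming rule amounts to simple string formatting. The helper below is a hypothetical illustration (refit_run_name is not a function from the package) and the example values are made up.

```python
def refit_run_name(run_name: str, other_experiment_name: str) -> str:
    # Mirrors the pattern run_name_refit_encoder_$other_experiment_name.
    return f"{run_name}_refit_encoder_{other_experiment_name}"


print(refit_run_name("baseline", "other_experiment"))
```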

Citation

The paper is currently a work in progress. A full reference will be provided soon.