## Pull Request Checklist

Thank you for your contribution! Before submitting this PR, please make sure you have completed the following steps:

### 1. Unit Tests / Normal PR Workflow

- [ ] Ensure all existing unit tests pass.
- [ ] Add new unit tests to cover the changes.
- [ ] Verify that your code follows the project's coding standards.
- [ ] Add documentation for your code if necessary.
- [ ] Check below whether your changes require you to run benchmarks.

#### When Do I Need To Run Benchmarks?

Depending on your changes, we ask you to run some benchmarks:

1. Style changes.

    If your changes only consist of style modifications, such as renaming or adding docstrings, and do not interfere with DEHB's interface, functionality, or algorithm, it is sufficient for all test cases to pass.

2. Changes to DEHB's interface, functionality, or the algorithm itself.

    If your changes affect the interface, functionality, or algorithm of DEHB, please also run the synthetic benchmarks (MFH3 and MFH6 from MFPBench, and the CountingOnes benchmark). This will help determine whether any changes introduced bugs or significantly altered DEHB's performance. At the reviewer's discretion, you may also be asked to run your changes on real-world benchmarks. For instructions on how to install and run the benchmarks, please have a look at our [benchmarking instructions](../benchmarking/BENCHMARKING.md). Please use the same budget for your benchmark runs as specified in the instructions.
# Benchmarking DEHB

Benchmarking DEHB is crucial for ensuring consistent performance across different setups and configurations. We aim to benchmark DEHB on multiple HPOBench and MFPBench benchmarks with three different run setups:

1. Using `dehb.run`,
2. Using the Ask & Tell interface, and
3. Restarting the optimization run after half the budget.

In the end, the results of the three execution setups should be identical. With this setup guide, we encourage the developers of DEHB to continually benchmark their changes in order to ensure that

- the inner workings of DEHB are not corrupted, by checking that the different execution setups produce the same results, and
- overall performance either remains the same if no algorithmic changes have been made, or is still comparable or better if algorithmic changes have been made.
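To make these setups concrete, here is a minimal, hedged sketch of the first two against DEHB's Python API (the restart setup additionally resumes an optimizer from its saved state). The toy objective is made up for illustration, and exact argument and key names vary across DEHB versions (e.g. `min_budget`/`max_budget` vs. `min_fidelity`/`max_fidelity`), so please check the documentation of the version you are benchmarking.

```python
import ConfigSpace as CS
from dehb import DEHB

def objective(config, budget, **kwargs):
    # Toy objective, made up for illustration: DEHB expects a dict
    # containing at least "fitness" (minimized) and "cost".
    return {"fitness": config["x"] ** 2, "cost": budget}

cs = CS.ConfigurationSpace()
cs.add_hyperparameter(CS.UniformFloatHyperparameter("x", lower=-5, upper=5))

def make_dehb():
    # Argument names are version-dependent, e.g. newer releases use
    # min_fidelity/max_fidelity instead of min_budget/max_budget.
    return DEHB(f=objective, cs=cs, dimensions=1,
                min_budget=1, max_budget=100, n_workers=1)

# Setup 1: let DEHB drive the optimization loop itself.
make_dehb().run(fevals=50)

# Setup 2: Ask & Tell -- drive the loop manually.
dehb = make_dehb()
for _ in range(50):
    job_info = dehb.ask()  # suggested config plus its budget/fidelity
    result = objective(job_info["config"], job_info["budget"])
    dehb.tell(job_info, result)
```
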
Please follow the installation guide below to benchmark your changes.

## Installation Guide HPOBench

The following guide walks you through installing HPOBench and running the benchmarking script. Here, we assume that you execute the commands in your cloned DEHB repository.

### Create Virtual Environment

Before starting, please make sure you have a clean virtual environment with Python 3.8 ready. The following commands walk you through setting one up with conda:

```shell
conda create --name dehb_hpo python=3.8
conda activate dehb_hpo
```

### Installing HPOBench

```shell
git clone https://github.com/automl/HPOBench.git
cd HPOBench
git checkout 47bf141 # Checkout specific commit
pip install .[ml_tabular_benchmarks]
cd ..
```

### Installing DEHB

Some additional dependencies are needed for plotting and table generation, therefore please install DEHB with the benchmarking options:

```shell
pip install -e .[benchmarking,hpobench_benchmark]
```

### Running the Benchmarking Script

The benchmarking script is highly configurable and lets you choose the budget type (`fevals`, `brackets`, or `total_cost`), the execution setup (`run` (default), `ask_tell`, or `restart`), the benchmarks used (`tab_nn`, `tab_rf`, `tab_svm`, `tab_lr`, `surrogate`, `nasbench201`), and the seeds used for each benchmark run (default: `[0]`):

```shell
python3.8 benchmarking/hpobench_benchmark.py --fevals 300 --benchmarks tab_nn tab_rf tab_svm tab_lr surrogate nasbench201 --seed 0 --n_seeds 5 --output_path logs/hpobench_benchmarking
```
## Installation Guide MFPBench

The following guide walks you through installing mfpbench and running the benchmarking script. Here, we assume that you execute the commands in your cloned DEHB repository.

## PD1 Benchmark and MFHartmann

### Create Virtual Environment

Before starting, please make sure you have a clean virtual environment with Python 3.8 ready. The following commands walk you through setting one up with conda:

```shell
conda create --name dehb_pd1 python=3.8
conda activate dehb_pd1
```

### Installing DEHB with MFPBench

Some additional dependencies are needed for plotting and table generation, therefore please install DEHB with the benchmarking options:

```shell
pip install -e .[benchmarking,pd1_benchmark]
```

### Downloading Benchmark Data

In order to run the benchmark, we first need to download the benchmark data:

```shell
python -m mfpbench download --benchmark pd1
```
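
Optionally, you can sanity-check the downloaded data by querying a benchmark directly through mfpbench's Python API. The following is a hedged sketch: the calls (`mfpbench.get`, `sample`, `query`, the `end` fidelity attribute, `result.error`) follow mfpbench's documented interface at the time of writing and may differ in your installed version.

```python
import mfpbench

# Load a PD1 benchmark; this assumes the data downloaded above
# is in the location mfpbench expects by default.
bench = mfpbench.get("lm1b_transformer_2048")

# Sample one random configuration and query it at the maximum fidelity.
config = bench.sample()
result = bench.query(config, at=bench.end)
print(result.error)
```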
### Running the Benchmarking Script

We currently support and use the PD1 benchmarks `cifar100_wideresnet_2048`, `imagenet_resnet_512`, `lm1b_transformer_2048`, and `translatewmt_xformer_64`. Moreover, the synthetic `mfh3` and `mfh6` benchmarks are available.

```shell
python3.8 benchmarking/mfpbench_benchmark.py --fevals 300 --benchmarks mfh3 mfh6 cifar100_wideresnet_2048 imagenet_resnet_512 lm1b_transformer_2048 translatewmt_xformer_64 --seed 0 --n_seeds 5 --output_path logs/pd1_benchmarks
```

## CountingOnes Benchmark

The CountingOnes benchmark is a synthetic benchmark and only depends on numpy, so it can be used directly without any special setup.
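
For intuition: CountingOnes simply counts the number of ones across the parameters. Categorical parameters in {0, 1} contribute their value directly, while each continuous parameter in [0, 1] is treated as the mean of a Bernoulli distribution whose contribution is estimated from a number of samples that grows with the fidelity. Below is a hedged numpy sketch of that objective, not the repo's implementation; all names are illustrative.

```python
import numpy as np

def counting_ones(cat_values, cont_values, n_samples, rng=None):
    """Illustrative CountingOnes objective (to be minimized).

    cat_values:  array of categorical parameters in {0, 1}.
    cont_values: array of continuous parameters in [0, 1].
    n_samples:   fidelity; number of Bernoulli samples used to
                 estimate each continuous contribution.
    """
    rng = rng or np.random.default_rng()
    cat_sum = np.sum(cat_values)
    # Estimate each continuous value from Bernoulli draws; more samples
    # (higher fidelity) give a less noisy estimate.
    cont_sum = sum(rng.binomial(n_samples, p) / n_samples for p in cont_values)
    n = len(cat_values) + len(cont_values)
    return -(cat_sum + cont_sum) / n  # optimum is -1 when all parameters are 1


# Example: 50 categorical + 50 continuous dimensions at fidelity 729.
rng = np.random.default_rng(0)
print(counting_ones(rng.integers(0, 2, 50), rng.random(50), n_samples=729, rng=rng))
```
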
### Running the Benchmarking Script

```shell
python benchmarking/countingones_benchmark.py --seed 0 --n_seeds 5 --fevals 300 --output_path logs/countingones --n_continuous 50 --n_categorical 50
```
#!/bin/bash
#SBATCH -p bosch_cpu-cascadelake
#SBATCH -o logs/%A[%a].%N.out # STDOUT (the folder logs has to exist); %A will be replaced by the SLURM_ARRAY_JOB_ID value
#SBATCH -e logs/%A[%a].%N.err # STDERR (the folder logs has to exist); %A will be replaced by the SLURM_ARRAY_JOB_ID value
#SBATCH -J DEHB_benchmarking # sets the job name
#SBATCH -a 1-3 # array size
#SBATCH -t 0-00:30:00
#SBATCH --mem 16GB

BUDGET=300

# Print some information about the job to STDOUT
echo "Working directory: $(pwd)";
echo "Started at $(date)";
echo "Benchmarking DEHB on multiple benchmarks";
echo "Running job $SLURM_JOB_NAME using $SLURM_JOB_CPUS_PER_NODE cpus per node with given JID $SLURM_JOB_ID on queue $SLURM_JOB_PARTITION";

source ~/.bashrc

if [ "$SLURM_ARRAY_TASK_ID" -eq 1 ]
then
    conda activate dehb_pd1
    pip install .

    python benchmarking/mfpbench_benchmark.py --seed 0 --n_seeds 5 --fevals $BUDGET --benchmarks mfh3 mfh6 cifar100_wideresnet_2048 imagenet_resnet_512 lm1b_transformer_2048 --output_path logs/pd1
    # Run translatewmt_xformer_64 separately due to memory problems
    python benchmarking/mfpbench_benchmark.py --seed 0 --n_seeds 5 --fevals $BUDGET --benchmarks translatewmt_xformer_64 --output_path logs/pd1

    python benchmarking/generate_summary.py
elif [ "$SLURM_ARRAY_TASK_ID" -eq 2 ]
then
    conda activate dehb_hpo
    pip install .

    python benchmarking/hpobench_benchmark.py --seed 0 --n_seeds 5 --fevals $BUDGET --benchmarks tab_nn tab_rf tab_svm tab_lr surrogate nasbench201 --output_path logs/hpob

    python benchmarking/generate_summary.py
elif [ "$SLURM_ARRAY_TASK_ID" -eq 3 ]
then
    sleep 60 # Wait for array task 1 to finish installing dehb into the dehb_pd1 env
    conda activate dehb_pd1 # CountingOnes works in any of the environments, since it only depends on numpy

    python benchmarking/countingones_benchmark.py --seed 0 --n_seeds 5 --fevals $BUDGET --output_path logs/countingones --n_continuous 50 --n_categorical 50

    python benchmarking/generate_summary.py
fi

# Print some information about the end time to STDOUT
echo "DONE";
echo "Finished at $(date)";