Add manual benchmark workflow and S3 result persistence (#429)
Hi everyone,

This pull request introduces a class and a workflow designed to store
the results of a benchmark run in an S3 bucket.

The key used for storage consists of the identifier of the benchmark
itself, the branch used, the release version, the current date, and the
commit hash, in that order. Furthermore, the boto3 package is added to
interact with AWS components.

I look forward to your feedback.
fabianliebig authored Nov 27, 2024
2 parents db5513a + d260235 commit f4669a2
Showing 14 changed files with 471 additions and 12 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/manual_benchmark.yml
@@ -0,0 +1,54 @@
name: Run Benchmark

on:
  workflow_dispatch:

permissions:
  contents: read
  id-token: write

jobs:
  add-runner:
    runs-on: ubuntu-latest
    steps:
      - name: Generate a token
        id: generate-token
        uses: actions/create-github-app-token@v1
        with:
          app-id: ${{ vars.APP_ID }}
          private-key: ${{ secrets.APP_PRIVATE_KEY }}
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
          role-session-name: Github_Add_Runner
          aws-region: eu-central-1
      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Execute Lambda function
        run: |
          aws lambda invoke --function-name jit_runner_register_and_create_runner_container --cli-binary-format raw-in-base64-out --payload '{"github_api_secret": "${{ steps.generate-token.outputs.token }}", "count_container": 1, "container_compute": "XL", "repository": "${{ github.repository }}" }' response.json
          cat response.json
          if ! grep -q '"statusCode": 200' response.json; then
            echo "Lambda function failed. statusCode is not 200."
            exit 1
          fi
  benchmark-test:
    needs: add-runner
    runs-on: self-hosted
    env:
      BAYBE_BENCHMARKING_PERSISTENCE_PATH: ${{ secrets.TEST_RESULT_S3_BUCKET }}
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v5
        id: setup-python
        with:
          python-version: "3.10"
      - name: Benchmark
        run: |
          pip install '.[benchmarking]'
          python -m benchmarks
17 changes: 16 additions & 1 deletion .lockfiles/py310-dev.lock
@@ -47,6 +47,12 @@ blinker==1.8.2
    # via streamlit
boolean-py==4.0
    # via license-expression
boto3==1.35.68
    # via baybe (pyproject.toml)
botocore==1.35.68
    # via
    #   boto3
    #   s3transfer
botorch==0.11.3
    # via baybe (pyproject.toml)
cachecontrol==0.14.0
@@ -264,6 +270,10 @@ jinja2==3.1.4
    #   pydeck
    #   sphinx
    #   torch
jmespath==1.0.1
    # via
    #   boto3
    #   botocore
joblib==1.4.2
    # via
    #   baybe (pyproject.toml)
@@ -700,6 +710,7 @@ pytest-cov==5.0.0
python-dateutil==2.9.0.post0
    # via
    #   arrow
    #   botocore
    #   jupyter-client
    #   matplotlib
    #   pandas
@@ -768,6 +779,8 @@ rpds-py==0.19.0
    #   referencing
ruff==0.5.2
    # via baybe (pyproject.toml)
s3transfer==0.10.4
    # via boto3
scikit-fingerprints==1.9.0
    # via baybe (pyproject.toml)
scikit-learn==1.5.1
@@ -985,7 +998,9 @@ tzdata==2024.1
uri-template==1.3.0
    # via jsonschema
urllib3==2.2.2
    # via requests
    # via
    #   botocore
    #   requests
uv==0.3.0
    # via
    #   baybe (pyproject.toml)
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `benchmarks` subpackage for defining and running performance tests
- `Campaign.toggle_discrete_candidates` to dynamically in-/exclude discrete candidates
- `DiscreteConstraint.get_valid` to conveniently access valid candidates
- Functionality for persisting benchmarking results on S3 from a manual pipeline run

### Changed
- `SubstanceParameter` encodings are now computed exclusively with the
2 changes: 1 addition & 1 deletion CONTRIBUTORS.md
@@ -29,4 +29,4 @@
- Karin Hrovatin (Merck KGaA, Darmstadt, Germany):\
  `scikit-fingerprints` support
- Fabian Liebig (Merck KGaA, Darmstadt, Germany):\
  Benchmarking structure
  Benchmarking structure and persistence capabilities for benchmarking results
85 changes: 85 additions & 0 deletions benchmarks/README.md
@@ -0,0 +1,85 @@
This module contains benchmarks meant to test the performance of BayBE on
pre-defined tasks. All benchmarks can be executed at once with
the following command:

```bash
python -m benchmarks
```

# `Benchmark`

The `Benchmark` object bundles all benchmark-related data.
At its heart is the callable `function`, which wraps the code to be benchmarked.
The `name` serves as the unique identifier of the benchmark. Note that
this identifier is also used when storing a `Result`, so any change to it is
treated as a new benchmark. The `function`'s `__doc__` is used to
automatically set the `description`. A full code example can be found in the
`domains/synthetic_2C1D_1C.py` file.
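
The following sketch outlines roughly what such a definition might look like. The
constructor arguments (`function`, `settings`) and the settings values shown here are
assumptions for illustration only; `domains/synthetic_2C1D_1C.py` remains the
authoritative example.

```python
from pandas import DataFrame

from benchmarks.definition.config import (
    Benchmark,
    ConvergenceExperimentSettings,
)


def my_benchmark(settings: ConvergenceExperimentSettings) -> DataFrame:
    """Compare two hypothetical campaign setups on a toy task."""
    # Run the BayBE scenarios to be benchmarked and return the collected data.
    return DataFrame()


benchmark = Benchmark(
    function=my_benchmark,
    # `random_seed` is inherited from `BenchmarkSettings`; the concrete settings
    # class may require additional scenario-specific fields not shown here.
    settings=ConvergenceExperimentSettings(random_seed=1337),
)
```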

# `BenchmarkSettings`

The `BenchmarkSettings` object is used to parameterize the benchmark `function`.
It is an abstract base class that can be extended by the user to provide
additional information. Its only predefined attribute is
`random_seed`, which is used to seed the entire call of the benchmark `function`.
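
A custom settings class could, for instance, look like the following sketch (the
extra field and its name are purely hypothetical):

```python
from attrs import define, field
from attrs.validators import instance_of

from benchmarks.definition.config import BenchmarkSettings


@define(frozen=True)
class MyCustomSettings(BenchmarkSettings):
    """Hypothetical settings carrying one additional benchmark parameter."""

    n_repetitions: int = field(validator=instance_of(int), default=3)
```
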
Currently, the following settings are available:

## `ConvergenceExperimentSettings`

The `ConvergenceExperimentSettings` object is used to parameterize the
convergence experiment benchmarks and holds information used for BayBE scenario
executions. Please refer to the BayBE documentation for more information
about the [simulations subpackage](baybe.simulation).

# `Result`

The `Result` object encapsulates the outcome of a `Benchmark` run: the data returned
by the benchmark `function`, together with state information captured at the time of
execution.

## `ResultMetadata`

The `ResultMetadata` wraps the runtime information about the executed `Benchmark`.
The combination of the benchmark identifier and the metadata is meant to identify a
`Result` uniquely, under the assumption that identical code states yield equally
representative results thanks to the fixed random seed.
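
Judging from the storage key layout described below, this runtime information covers
details such as the branch, the BayBE version, the execution date, and the commit
hash; the exact set of fields is defined by the `ResultMetadata` class itself.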

# Add your benchmark to the benchmarking module

As a last step, your benchmark object has to be added to the
`benchmarks module`. This is done by adding the object to the `BENCHMARKS`
list in the `__init__.py` file of the `domains` folder, as sketched below. The
`BENCHMARKS` list contains all objects that are called when running the
`benchmarks module`.
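
For illustration, the registration could look roughly like the excerpt below; the
imported name `benchmark` is an assumption based on the example file mentioned above.

```python
# benchmarks/domains/__init__.py (illustrative excerpt)
from benchmarks.domains.synthetic_2C1D_1C import benchmark as synthetic_2C1D_1C

BENCHMARKS = [
    synthetic_2C1D_1C,
    # my_benchmark,  # <- add your new benchmark object here
]
```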

# Persisting Results

`Result`s are stored automatically. Since multiple storage types with different
requirements and conventions are provided, the `PathConstructor` class is used to
construct the identifier of the stored file. For example, `S3ObjectStorage` stores
`Result`s in an S3 bucket, where the key is separated by `/` without creating real
folders, while the local persistence joins the path components with `_` so that no
folders have to be created. The class handling the storage receives this
`PathConstructor` and uses it to build the identifier in the form it requires.
The following types of storage are available:

## `LocalFileObjectStorage`

Stores a file on the local file system; it is chosen automatically when the
`benchmarks module` does not run in the CI/CD pipeline. A prefix folder path can be
provided when creating the object; if none is given, the file is stored in the
current working directory. With a prefix, the file is stored in the following format:
`<PREFIX_PATH>/<benchmark_name>_<branch>_<latest_baybe_tag>_<execution-date>_<commit_hash>_result.json`.
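
For example, a run with the prefix `results` might end up under a path like
`results/synthetic_2C1D_1C_main_0.11.3_2024-11-27_a1b2c3d_result.json` (all values
hypothetical).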

## `S3ObjectStorage`

Stores a file in an S3 bucket; it is chosen automatically when the
`benchmarks module` runs in the CI/CD pipeline. To locate the S3 bucket used for
persistence, the environment variable `BAYBE_BENCHMARKING_PERSISTENCE_PATH` must be
set to its name. For running the `benchmarks module` in the CI/CD pipeline,
it must also be possible to assume an AWS role from a job call.
This is done by providing the role's ARN in the secret `AWS_ROLE_TO_ASSUME`.
For creating temporary credentials, a GitHub App is used.
To generate a token, the ID of the GitHub App and its secret key must be provided in
the secrets `APP_ID` and `APP_PRIVATE_KEY`. The file will be stored in the following
format: `<benchmark_name>/<branch>/<latest_baybe_tag>/<execution-date>/<commit_hash>/result.json`.
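
The choice between the two storage types is made automatically based on the `CI`
environment variable. The following sketch is a thin wrapper around the calls made in
`benchmarks/__main__.py`; the helper name `persist` is hypothetical.

```python
import os

from benchmarks.persistence import (
    LocalFileObjectStorage,
    PathConstructor,
    S3ObjectStorage,
)


def persist(benchmark, result) -> None:
    """Persist a single benchmark result, mirroring ``benchmarks/__main__.py``."""
    path_constructor = PathConstructor.from_result(result)
    persist_dict = benchmark.to_dict() | result.to_dict()

    # S3 persistence in the CI/CD pipeline, local files otherwise.
    storage = S3ObjectStorage() if "CI" in os.environ else LocalFileObjectStorage()
    storage.write_json(persist_dict, path_constructor)
```
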
16 changes: 15 additions & 1 deletion benchmarks/__main__.py
@@ -1,13 +1,27 @@
"""Executes the benchmarking module."""
# Run this via 'python -m benchmarks' from the root directory.

import os

from benchmarks.domains import BENCHMARKS
from benchmarks.persistence import (
    LocalFileObjectStorage,
    PathConstructor,
    S3ObjectStorage,
)

RUNS_IN_CI = "CI" in os.environ


def main():
"""Run all benchmarks."""
for benchmark in BENCHMARKS:
benchmark()
result = benchmark()
path_constructor = PathConstructor.from_result(result)
persist_dict = benchmark.to_dict() | result.to_dict()

object_storage = S3ObjectStorage() if RUNS_IN_CI else LocalFileObjectStorage()
object_storage.write_json(persist_dict, path_constructor)


if __name__ == "__main__":
Expand Down
19 changes: 16 additions & 3 deletions benchmarks/definition/config.py
@@ -8,15 +8,16 @@

from attrs import define, field
from attrs.validators import instance_of
from cattr.gen import make_dict_unstructure_fn, override
from pandas import DataFrame

from baybe.serialization.mixin import SerialMixin
from baybe.utils.random import temporary_seed
from benchmarks.result import Result, ResultMetadata
from benchmarks.serialization import BenchmarkSerialization, converter


@define(frozen=True)
class BenchmarkSettings(SerialMixin, ABC):
class BenchmarkSettings(ABC, BenchmarkSerialization):
"""Benchmark configuration for recommender analyses."""

random_seed: int = field(validator=instance_of(int), kw_only=True, default=1337)
@@ -41,7 +42,7 @@ class ConvergenceExperimentSettings(BenchmarkSettings):


@define(frozen=True)
class Benchmark(Generic[BenchmarkSettingsType]):
class Benchmark(Generic[BenchmarkSettingsType], BenchmarkSerialization):
"""The base class for a benchmark executable."""

settings: BenchmarkSettingsType = field()
@@ -88,3 +89,15 @@ def __call__(self) -> Result:
        )

        return Result(self.name, result, metadata)


# Register un-/structure hooks
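# The hook below adds the derived `description` to the unstructured output and
# omits the `function` callable from the serialized payload.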
converter.register_unstructure_hook(
    Benchmark,
    lambda o: dict(
        {"description": o.description},
        **make_dict_unstructure_fn(Benchmark, converter, function=override(omit=True))(
            o
        ),
    ),
)
9 changes: 9 additions & 0 deletions benchmarks/persistence/__init__.py
@@ -0,0 +1,9 @@
"""Module for persisting benchmarking results."""

from benchmarks.persistence.persistence import (
    LocalFileObjectStorage,
    PathConstructor,
    S3ObjectStorage,
)

__all__ = ["PathConstructor", "S3ObjectStorage", "LocalFileObjectStorage"]