Add Michalewicz benchmark #464

Open
wants to merge 3 commits into
base: main
2 changes: 2 additions & 0 deletions benchmarks/domains/__init__.py
Collaborator

Can we add the global optimum as a dashed line to the simulation plot you showed? Results without it can be deceiving.

Collaborator Author

That should already be the case when the code is run in the dashboard, as you can see here for the current benchmark: https://baybe-benchmark.apps.p.uptimize.merckgroup.com/ (@fabianliebig, please confirm).
The plot I posted here was just meant to verify that the chosen DoE and MC iteration numbers are somewhat reasonable.

Collaborator

Yes, can confirm. The stored result contains that value, and it will be drawn as a horizontal dashed line.
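
To illustrate why the reference value matters, here is a minimal sketch of how a convergence curve can be read against the known optimum. The measurements are made-up illustration data, not results from this PR, and `cumulative_best` is a hypothetical helper rather than part of BayBE or the benchmark code. Only the optimum -4.687658 is taken from the diff.

```python
from itertools import accumulate

# Known global minimum of the 5-dimensional Michalewicz function (from the diff).
BEST_POSSIBLE_RESULT = -4.687658


def cumulative_best(measurements: list[float]) -> list[float]:
    """Return the running minimum, i.e. the best value found so far."""
    return list(accumulate(measurements, min))


# Made-up measurements for illustration only (minimization problem).
measurements = [-0.8, -1.9, -1.2, -2.6, -2.4, -3.1]

best_so_far = cumulative_best(measurements)
# Remaining gap to the global optimum; this is what the dashed line makes visible.
gap = [value - BEST_POSSIBLE_RESULT for value in best_so_far]

for step, (best, remaining) in enumerate(zip(best_so_far, gap), start=1):
    print(f"iteration {step}: best={best:.3f}, gap to optimum={remaining:.3f}")
```

Without the reference value, a curve that flattens out looks converged even when a substantial gap to the optimum remains.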

@@ -2,9 +2,11 @@

from benchmarks.definition.config import Benchmark
from benchmarks.domains.synthetic_2C1D_1C import synthetic_2C1D_1C_benchmark
from benchmarks.domains.synthetic_michalewicz import synthetic_michalewicz_benchmark

BENCHMARKS: list[Benchmark] = [
    synthetic_2C1D_1C_benchmark,
    synthetic_michalewicz_benchmark,
]

__all__ = ["BENCHMARKS"]
130 changes: 130 additions & 0 deletions benchmarks/domains/synthetic_michalewicz.py
@@ -0,0 +1,130 @@
"""5-dimensional Michalewicz function in a continuous space."""

from __future__ import annotations

from typing import TYPE_CHECKING

import numpy as np
import pandas as pd
from numpy import pi, sin
from pandas import DataFrame

from baybe.campaign import Campaign
from baybe.parameters import NumericalContinuousParameter
from baybe.recommenders import RandomRecommender
from baybe.searchspace import SearchSpace
from baybe.simulation import simulate_scenarios
from baybe.targets import NumericalTarget
from benchmarks.definition import (
    Benchmark,
    ConvergenceExperimentSettings,
)

if TYPE_CHECKING:
    from mpl_toolkits.mplot3d import Axes3D


def _lookup(arr: np.ndarray, /) -> np.ndarray:
    """Numpy-based lookup callable defining the objective function."""
    # Validate explicitly: an `assert` would be stripped when Python runs
    # with the -O flag, silently disabling the range check.
    if not np.all((arr >= 0) & (arr <= pi)):
        raise ValueError("Inputs are not in the valid ranges.")
    x1, x2, x3, x4, x5 = np.array_split(arr, 5, axis=1)

    return -(
        sin(x1) * sin(1 * x1**2 / pi) ** (2 * 10)
        + sin(x2) * sin(2 * x2**2 / pi) ** (2 * 10)
        + sin(x3) * sin(3 * x3**2 / pi) ** (2 * 10)
        + sin(x4) * sin(4 * x4**2 / pi) ** (2 * 10)
        + sin(x5) * sin(5 * x5**2 / pi) ** (2 * 10)
    )
Comment on lines +35 to +41
Collaborator

Just as a suggestion: BoTorch also offers many commonly used benchmark functions in the submodule botorch.test_functions.synthetic. Maybe that's interesting to save some time :)
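
Whether or not a library implementation is adopted, the five hand-written terms follow a single pattern, so a compact reference can serve as an independent cross-check. The sketch below is a dependency-free stand-in written for illustration. The name `michalewicz` and its signature are assumptions, not BoTorch's or BayBE's API. It evaluates the function at the rounded optimal input from the benchmark docstring, which should land close to the stated optimum of -4.687658.

```python
import math


def michalewicz(x: list[float], m: int = 10) -> float:
    """Evaluate the d-dimensional Michalewicz function at a single point.

    The dimension d is inferred from the length of ``x``; every coordinate
    is expected to lie in [0, pi].
    """
    if not all(0.0 <= xi <= math.pi for xi in x):
        raise ValueError("Inputs are not in the valid ranges.")
    return -sum(
        math.sin(xi) * math.sin(i * xi**2 / math.pi) ** (2 * m)
        for i, xi in enumerate(x, start=1)
    )


# Optimal input quoted in the benchmark docstring (rounded to three decimals).
x_opt = [2.203, 1.571, 1.285, 1.923, 1.720]
value = michalewicz(x_opt)
print(f"f(x_opt) = {value:.4f}")
```

Because the dimension is taken from the input length, the same function covers the 2-dimensional case used for plotting.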



def lookup(df: pd.DataFrame, /) -> pd.DataFrame:
    """Dataframe-based lookup callable used as the loop-closing element."""
    return pd.DataFrame(
        _lookup(df[["x1", "x2", "x3", "x4", "x5"]].to_numpy()),
        columns=["target"],
        index=df.index,
    )


def synthetic_michalewicz(settings: ConvergenceExperimentSettings) -> DataFrame:
Collaborator

Add a 5d to the name to make the dimensionality clear?

Collaborator Author

Good idea, will do.

    """5-dimensional Michalewicz function.

    Details of the function can be found at https://www.sfu.ca/~ssurjano/michal.html

Comment on lines +56 to +57
Collaborator

I generally like the idea of linking further information and making the description reasonably interactive. As an idea: maybe we could consider using Markdown for this at some point. It's uncommon for the docstring itself but might add some glamor to the dashboard :D

    Inputs:
        x1,...,x5 continuous [0, pi]
    Output: continuous
    Objective: Minimization
    Optimal Input:
        {x1: 2.203, x2: 1.571, x3: 1.285, x4: 1.923, x5: 1.720}
    Optimal Output: -4.687658
    """
    parameters = [
        NumericalContinuousParameter(name=f"x{i}", bounds=(0, pi)) for i in range(1, 6)
    ]

    target = NumericalTarget(name="target", mode="MIN")
    searchspace = SearchSpace.from_product(parameters=parameters)
    objective = target.to_objective()

    scenarios: dict[str, Campaign] = {
        "Random Recommender": Campaign(
            searchspace=searchspace,
            recommender=RandomRecommender(),
            objective=objective,
        ),
        "Default Recommender": Campaign(
            searchspace=searchspace,
            objective=objective,
        ),
    }

    return simulate_scenarios(
        scenarios,
        lookup,
        batch_size=settings.batch_size,
        n_doe_iterations=settings.n_doe_iterations,
        n_mc_iterations=settings.n_mc_iterations,
        impute_mode="error",
    )


benchmark_config = ConvergenceExperimentSettings(
    batch_size=5,
    n_doe_iterations=25,
    n_mc_iterations=20,
)

synthetic_michalewicz_benchmark = Benchmark(
    function=synthetic_michalewicz,
    best_possible_result=-4.687658,
    settings=benchmark_config,
)

if __name__ == "__main__":
    # Visualization of the 2-dimensional variant

    import matplotlib.pyplot as plt

    X1 = np.linspace(0, pi, 50)
    X2 = np.linspace(0, pi, 50)
    X1, X2 = np.meshgrid(X1, X2)

    # Michalewicz function
    Z = -1 * (
Collaborator

It feels to me like you should reuse the 5d variant shown above and just plot a 2d slice (or several).

Reason: if there's a problem in the function above, this plot wouldn't reveal it.

Collaborator Author

I will try my best :D
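
One way the suggestion could be realized is to vary x1 and x2 on a grid while holding x3, x4 and x5 fixed, so that the plotted surface comes from the same 5-dimensional code path as the benchmark. The sketch below is a dependency-free illustration: `michalewicz_5d` is a hypothetical stand-in for the objective in the diff, `slice_2d` is an assumed helper name, and fixing the remaining coordinates at their optimal values is only one possible choice of slice.

```python
import math

# Optimal input quoted in the benchmark docstring (rounded to three decimals).
X_OPT = [2.203, 1.571, 1.285, 1.923, 1.720]


def michalewicz_5d(x: list[float]) -> float:
    """Hypothetical stand-in for the 5-dimensional objective in the diff."""
    return -sum(
        math.sin(xi) * math.sin(i * xi**2 / math.pi) ** 20
        for i, xi in enumerate(x, start=1)
    )


def slice_2d(n: int = 50) -> list[list[float]]:
    """Evaluate the 5d function on an n-by-n grid over (x1, x2).

    The remaining coordinates x3, x4, x5 stay fixed at their optimal values,
    so the surface is produced by the same function as the benchmark target.
    """
    grid = [k * math.pi / (n - 1) for k in range(n)]
    # Rows iterate over x2 and columns over x1.
    return [[michalewicz_5d([x1, x2, *X_OPT[2:]]) for x1 in grid] for x2 in grid]


Z = slice_2d()
print(f"slice minimum: {min(map(min, Z)):.3f}, slice maximum: {max(map(max, Z)):.3f}")
```

The nested list `Z` has the same row and column layout as the arrays produced by `np.meshgrid(X1, X2)`, so after conversion with `np.asarray` it could be passed to `plot_surface` in place of the separately written 2-dimensional formula.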

        (np.sin(X1) * np.sin((1 * X1**2) / np.pi) ** 20)
        + (np.sin(X2) * np.sin((2 * X2**2) / np.pi) ** 20)
    )

    ax: Axes3D = plt.figure().add_subplot(projection="3d")
    surf = ax.plot_surface(X1, X2, Z)

    ax.set_xlabel("x1", fontsize=10)
    ax.set_ylabel("x2", fontsize=10)
    ax.tick_params(axis="both", which="major", labelsize=6)

    plt.show()