Add ForbiddenCallableRelation #280
base: main
Hi @nchristensen,

While this is quite cool, and definitely something that's desirable, it makes some things impossible that we definitely want to keep, namely serializability. We can't save configspaces with callables to JSON. Another example, close to ConfigSpace is

In light of this, I don't want to outright reject the PR. It's still nice to have, but I imagine it will require some explicit testing with respect to serialization, and we need explicit documentation that use of this feature would prevent non-binary serialization to disk. (You could still pickle them of course, but pickles are a bit volatile for long-term storage.)

I'll review this later today and drop some pointers as to where this could be done!
Codecov Report
Attention:
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #280      +/-   ##
==========================================
+ Coverage   67.97%   68.05%   +0.08%
==========================================
  Files          25       25
  Lines        1786     1800      +14
==========================================
+ Hits         1214     1225      +11
- Misses        572      575       +3

☔ View full report in Codecov by Sentry.
So functionally, it all seems to be there, but as mentioned in the PR comment, we definitely need tests to check its behaviour when considering serialization.
My guess is that serialization will fail with some JSON-based error. If this is the case, I think we should counteract it by raising an explicit error stating that using a ForbiddenCallable means you cannot serialize the space to JSON. This should also be stated directly in the documentation.
Thanks for the PR though, and sorry for my critical comments. We really do appreciate it. I'm sorry we don't dedicate more time to directly improving this library; user contributions are a nice surprise!
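The explicit-error suggestion above could be sketched roughly as follows, using only the standard library. The names `ForbiddenCallableSketch` and `to_json` are hypothetical stand-ins and not ConfigSpace's API; the sketch only shows that `json.dumps` raises a generic `TypeError` on a callable and that a guard can replace it with a clearer message.

```python
import json


class ForbiddenCallableSketch:
    """Hypothetical stand-in for a forbidden clause that holds a callable."""

    def __init__(self, left_name, right_name, f):
        self.left_name = left_name
        self.right_name = right_name
        self.f = f


def to_json(clause):
    """Serialize a clause to JSON, failing loudly if it holds a callable."""
    if callable(getattr(clause, "f", None)):
        raise TypeError(
            "Clauses that hold a callable cannot be serialized to JSON."
        )
    return json.dumps(vars(clause))


clause = ForbiddenCallableSketch("a", "b", lambda a, b: a > b)

# Without the guard, json raises its own, less specific TypeError.
try:
    json.dumps(vars(clause))
    generic_error = None
except TypeError as exc:
    generic_error = str(exc)

# With the guard, the message names the actual cause.
try:
    to_json(clause)
    explicit_error = None
except TypeError as exc:
    explicit_error = str(exc)
```

A test for the real feature would presumably assert on the explicit message rather than on whatever the JSON encoder happens to say.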
ConfigSpace/forbidden.pyx
Outdated
left : :ref:`Hyperparameters`
    first argument of callable

right : :ref:`Hyperparameters`
    second argument of callable

f : A callable that relates the two hyperparameters
Unfortunately, Sphinx will complain about the documentation having a leading space on this line; it's very particular :/
My knowledge of Sphinx is limited, but I revised it, modeling it on the ForbiddenEqualsRelation. Hopefully that resolves this.
def __eq__(self, other: Any) -> bool:
    if not isinstance(other, self.__class__):
        return False
    return super().__eq__(other) and self.f == other.f
Nice use of the super class to keep consistency in equality checking.
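One thing worth keeping in mind with `self.f == other.f`: Python functions compare by identity, so two separately defined callables with identical bodies are not equal. A quick illustration (the function names here are made up for the example):

```python
def less_than(a, b):
    return a < b


def also_less_than(a, b):
    return a < b


# The same function object compares equal to itself.
same_object = less_than == less_than

# Two definitions with identical bodies are still different objects.
different_defs = less_than == also_less_than

# The same holds for two textually identical lambdas.
lambdas_equal = (lambda a, b: a < b) == (lambda a, b: a < b)
```

So two clauses compare equal under this `__eq__` only when they share the very same callable object.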
ConfigSpace/forbidden.pyx
Outdated
def __repr__(self):
    from inspect import getsource
    f_source = getsource(self.f)
    return f"Forbidden: {f_source} | Arguments: {self.left.name}, {self.right.name}"
This getsource call could be quite large and rather verbose. For example:

    from inspect import getsource

    def f():
        print("hello")
        x = 1 + 2
        y = "hi" + "mars"
        return b"no"

    getsource(f)
    # 'def f():\n    print("hello")\n    x = 1 + 2\n    y = "hi" + "mars"\n    return b"no"\n'

I think we could get around this by using the function's f.__qualname__ instead.
As a follow-up, you could accommodate lambdas as:
qualname = f.__qualname__
f_repr = getsource(f) if qualname == "<lambda>" else qualname
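A slightly more defensive sketch of the same idea, with hypothetical helper names (`is_lambda`, `describe_callable`). It checks `__name__` rather than `__qualname__`, because a lambda defined inside another function has a qualname such as `outer.<locals>.<lambda>`, so an exact comparison against `"<lambda>"` only matches lambdas defined at module level.

```python
from inspect import getsource


def is_lambda(f):
    # A lambda's __name__ is always "<lambda>", wherever it is defined.
    return getattr(f, "__name__", "") == "<lambda>"


def describe_callable(f):
    """Short label: qualname for named functions, source text for lambdas."""
    if not is_lambda(f):
        return f.__qualname__
    try:
        # Note: this returns the whole source line(s) containing the lambda.
        return getsource(f).strip()
    except (OSError, TypeError):
        # Source is unavailable (for example in a REPL); fall back.
        return f.__qualname__


def make_nested_lambda():
    return lambda a, b: a > b


def named(a, b):
    return a > b


nested = make_nested_lambda()
named_label = describe_callable(named)
```

The `strip()` call also removes the leading whitespace that `getsource` keeps for indented definitions.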
That's a good idea. I was noticing formatting issues using getsource as well, since it keeps all of the leading whitespace.
Obtaining only the source code of the lambda (and discarding any other code on the same line) is apparently non-trivial. See for example https://stackoverflow.com/questions/59498679/how-can-i-get-exactly-the-code-of-a-lambda-function-in-python. I have revised it to use the qualname though.
    f_source = getsource(self.f)
    return f"Forbidden: {f_source} | Arguments: {self.left.name}, {self.right.name}"

cdef int _is_forbidden(self, left, right) except -1:
What does this except -1 do? I wish I knew more Cython, but unfortunately I don't. I looked at the class above and it didn't seem to have this.
I modeled this off of the ForbiddenLessThanRelation directly below, which uses this. I'm not very familiar with Cython either, but it seems any cdef function that might raise a Python exception needs to be declared with an except value. https://docs.cython.org/en/latest/src/userguide/language_basics.html#error-return-values
sigma = self.sigma
if sigma == 0:
    return self.mu
elif self.lower == None:
This is a nice change in and of itself. Depending on how this PR goes, I still think we'd like to pull this in.
ConfigSpace/hyperparameters.pyx
Outdated
if self.sigma == 0:
    assert isinstance(self.mu, int)
Why this assertion? I feel like they should both just be int. The init signature seems off here to suggest otherwise.
This (or some form of rounding) is needed if a non-integer mean is allowed. But I'm also fine with restoring the existing behavior if allowing a non-integer mean is undesirable.
@@ -1663,8 +1668,7 @@ cdef class NormalIntegerHyperparameter(IntegerHyperparameter):
     cdef public nfhp
     cdef normalization_constant

-    def __init__(self, name: str, mu: int, sigma: Union[int, float],
+    def __init__(self, name: str, mu: Union[int, float], sigma: Union[int, float],
It seems you explicitly allow float in the type annotation? How come?
My thinking is that even if we're drawing integers from a normal distribution, there is no requirement that the mean of the distribution be an integer. For instance, a user might want values <= 4 and values >= 5 to be chosen with equal probability and so place the mean at 4.5.
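The 4.5 example can be checked numerically. This is only a sketch under a stated assumption: integers are obtained by rounding a draw from a continuous normal distribution, which may not match ConfigSpace's actual discretization. The helper `normal_cdf` and the choice `sigma = 2.0` are illustrative.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


sigma = 2.0

# Assumption: a draw x is rounded to the nearest integer, so the result
# is <= 4 exactly when x < 4.5 (ties have probability zero).
p_at_most_4 = normal_cdf(4.5, 4.5, sigma)    # mean placed at 4.5
p_at_least_5 = 1.0 - p_at_most_4

# With an integer mean of 4, the split is no longer even.
p_at_most_4_int_mean = normal_cdf(4.5, 4.0, sigma)
```

With the mean at 4.5 both sides get probability 0.5, whereas an integer mean of 4 puts more than half of the mass on values <= 4.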
Yeah, serializability could be difficult with this. I think it would be useful to optionally allow pickling of the Callable during serialization, but to disallow this by default.
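A rough sketch of what opt-in pickling might look like. The function names, the `allow_pickle` flag, and the dictionary layout are all hypothetical and not part of ConfigSpace. Note that the standard `pickle` module stores functions by reference (module plus qualified name), so this works for importable functions such as `operator.lt` but not for lambdas.

```python
import base64
import operator
import pickle


def serialize_callable(f, allow_pickle=False):
    """Return a JSON-friendly dict for f, pickling only on explicit request."""
    if not allow_pickle:
        raise ValueError(
            "Callables are not serialized by default; "
            "pass allow_pickle=True to embed a pickle."
        )
    payload = base64.b64encode(pickle.dumps(f)).decode("ascii")
    return {"type": "pickled_callable", "data": payload}


def deserialize_callable(obj):
    """Inverse of serialize_callable. Only unpickle data you trust."""
    return pickle.loads(base64.b64decode(obj["data"]))


# Round trip with an importable function.
encoded = serialize_callable(operator.lt, allow_pickle=True)
restored = deserialize_callable(encoded)

# The default refuses to pickle.
try:
    serialize_callable(operator.lt)
    default_refused = False
except ValueError:
    default_refused = True
```

Keeping the flag off by default means a plain JSON export never silently embeds binary data that could execute code when loaded.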
Hey @nchristensen thank you very much for the PR. I'll discuss with @eddiebergman whether we will extend the ConfigSpace package to allow callables (no timeline, though). However, your changes to the hyperparameter file appear to be useful by themselves (as mentioned by @eddiebergman). Would you like to create a single PR so we can merge these anyway?
Sure,
… and round the result to the nearest integer
…ization bug fixes, another fix for float mu in NormalInteger space
* test: Add reproducing test
* fix: Make sampling neighbors form uniform Int stable
* fix: Memory leak with UniformIntegerHyperparameter

  When querying a large range for a UniformIntegerHyperparameter with a small std. deviation and log scale, this could cause an infinite loop as the reachable neighbors would be quickly exhausted, yet rejection sampling will continue sampling until some arbitrary termination criterion. Why this was causing a memory leak, I'm not entirely sure. The solution now is that if we have seen a sampled value before, we simply take the one "next to it".
* fix: Memory issues with Normal and Beta dists

  Replaced usages of arange with a chunked version to prevent memory blowup. However, this is still incredibly slow and needs a more refined solution, as a huge number of values must be computed for what can possibly be analytically derived.
* chore: Update flake8
* fix: flake8 version compatible with Python 3.7
* fix: Name generators properly
* fix: Test numbers
* doc: typo fixes
* perf: Generate all possible neighbors at once
* test: Add test for center_range and arange_chunked
* perf: Call transform on np vector from rvs
* perf: Use numpy `.astype(int)` instead of `int`
* doc: Document how to get flamegraphs for optimizing
* fix: Allow for negatives in arange_chunked again
* fix: Change build back to raw Extensions
* build: Properly set compiler_directives
* ci: Update makefile with helpful commands
* ci: Fix docs to install build
* perf: cython optimizations
* perf: Fix possible memory leak with UniformIntegerHyperparam
* fix: Duplicates as `list` instead of set
* fix: Convert to `long long` vector
* perf: Revert clip to truncnorm

  This truncnorm has some slight overhead due to how scipy generates its truncnorm distribution; however, this overhead is considered worth it for the sake of readability and understanding.
* test: Test values not match implementation
* Intermediate commit
* INtermediate commit 2
* Update neighborhood generation for UniformIntegerHyperparameter
* Update tests
* Make the benchmark sampling script more robust
* Revert small change in util function
* Improve readability

Co-authored-by: Matthias Feurer <[email protected]>
This builds wheels for Python 3.11 and disables wheel builds for the win32 and i686 architectures, because scipy does not distribute wheels for these in its latest versions.

* feat: python 3.11 wheels
* ci: trigger workflow
* ci: update cibuildwheel for python 3.11
* ci: update other cibuildwheels
* ci: disable >=3.8 win32 wheels
* ci: Remove debug trigger
2c499de to f4e8758
Allows for more complicated forbidden relationships using Callables. Related issues #254 #272 #277.
Also adds support for sigma=0 in NormalFloatHyperparameter and NormalIntegerHyperparameter and non-integer mu (mean) values in NormalIntegerHyperparameter, as well as fixes some JSON serialization bugs.