Duplicate forbiddens in ConfigurationSpace #354

Open

aron-bram opened this issue Apr 9, 2024 · 1 comment

@aron-bram

I noticed that it is possible to add the same forbidden clause multiple times to the same space. I'm not sure whether this is intended, so I thought I would raise a quick issue for it.

Here is how to replicate it:

import ConfigSpace as CS
import ConfigSpace.hyperparameters as CSH

cs = CS.ConfigurationSpace()

lower_bound = CSH.UniformIntegerHyperparameter('lower_bound', lower=0, upper=10)
upper_bound = CSH.UniformIntegerHyperparameter('upper_bound', lower=0, upper=10)

cs.add_hyperparameter(lower_bound)
cs.add_hyperparameter(upper_bound)

# add duplicate forbiddens to the same space
fgt1 = CS.ForbiddenGreaterThanRelation(lower_bound, upper_bound)
fgt2 = CS.ForbiddenGreaterThanRelation(lower_bound, upper_bound)
fgt3 = CS.ForbiddenGreaterThanRelation(lower_bound, upper_bound)

cs.add_forbidden_clause(fgt1)
cs.add_forbidden_clause(fgt2)
cs.add_forbidden_clause(fgt3)

print(cs)

What it outputs:

Configuration space object:
  Hyperparameters:
    lower_bound, Type: UniformInteger, Range: [0, 10], Default: 5
    upper_bound, Type: UniformInteger, Range: [0, 10], Default: 5
  Forbidden Clauses:
    Forbidden: lower_bound > upper_bound
    Forbidden: lower_bound > upper_bound
    Forbidden: lower_bound > upper_bound

Notice how the same forbidden clause appears three times.
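Until this is addressed upstream, a user-side guard can keep the space clean. A minimal sketch, assuming the 0.x API used above (get_forbiddens() returning the registered clauses); add_forbidden_unique is a hypothetical helper, not part of ConfigSpace, and detecting duplicates via repr() is an assumption:

def add_forbidden_unique(cs, clause):
    # Compare by repr() so the check does not depend on clause
    # equality semantics; crude, but enough for a guard.
    if repr(clause) not in {repr(f) for f in cs.get_forbiddens()}:
        cs.add_forbidden_clause(clause)

# On a fresh copy of the space above, only the first call registers
# the clause; the second is silently skipped as a duplicate.
add_forbidden_unique(cs, fgt1)
add_forbidden_unique(cs, fgt2)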

@eddiebergman
Contributor

Hmmm, good find. It's not really problematic in the sense of causing logical issues, but it could make sampling a bit slower if you need to generate a large number of samples quickly.

What would you expect the behavior to be? I have 3 different ideas:

  • Raise an error: you cannot have duplicates.
  • Raise a warning that there are duplicates.
  • Do nothing, but optimize this away under the hood. We currently do this for AND conjunctions that share individual parts, as this comes up a lot in some existing systems:

(classifier == 'adaboost' && preprocessor == 'densifier'),
(classifier == 'adaboost' && preprocessor == 'kitchen_sinks'),
(classifier == 'adaboost' && preprocessor == 'nystroem_sampler')

become:

(classifier == 'adaboost'
 && preprocessor in ('nystroem_sampler', 'kitchen_sinks', 'densifier'))
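For illustration, a sketch of what options 1 and 2 could look like; check_duplicate is hypothetical, duplicate detection via repr() is an assumption, and the real check would live inside add_forbidden_clause rather than in a standalone helper:

import warnings

def check_duplicate(existing, new_clause, mode="warn"):
    # Detect a repeated clause and either raise (option 1)
    # or warn (option 2) before it is added to the space.
    if any(repr(f) == repr(new_clause) for f in existing):
        if mode == "error":
            raise ValueError(f"duplicate forbidden clause: {new_clause}")
        warnings.warn(f"duplicate forbidden clause: {new_clause}")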
