
Implement sampling function for hyperparameters and held-out workloads #540

Closed
fsschneider opened this issue Oct 10, 2023 · 3 comments
Labels
P1 Launch 2023 High priority issues for October 2023 AlgoPerf Launch

Comments


fsschneider commented Oct 10, 2023

Description

We need two functions:

  • One that, given a seed provided by a "trusted third party", samples the held-out workloads (one for each dataset) to use in the competition.
  • One that, given a seed and a submission ID, provides hyperparameter seeds grouped by study, i.e. 5 groups of 5 hyperparameter configurations.
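A minimal sketch of what these two functions might look like, assuming a simple seeded-RNG design; the function names, the workload registry, and the study/trial counts below are illustrative, not the actual AlgoPerf code:

```python
import random

# Hypothetical registry: each dataset maps to its held-out workload
# variants. Names are illustrative, not the real AlgoPerf variant list.
HELDOUT_VARIANTS = {
    "criteo1tb": ["criteo1tb_layernorm", "criteo1tb_resnet"],
    "fastmri": ["fastmri_model_size", "fastmri_tanh"],
    "imagenet": ["imagenet_resnet_silu", "imagenet_vit_glu"],
}

NUM_STUDIES = 5
NUM_TRIALS_PER_STUDY = 5


def sample_heldout_workloads(seed: int) -> dict:
    """Given a seed from a trusted third party, pick one held-out
    workload variant per dataset."""
    rng = random.Random(seed)
    # Sort datasets so the result is independent of dict insertion order.
    return {dataset: rng.choice(variants)
            for dataset, variants in sorted(HELDOUT_VARIANTS.items())}


def sample_hyperparameter_seeds(seed: int, submission_id: str) -> list:
    """Given a seed and a submission ID, derive per-trial RNG seeds,
    grouped by study: NUM_STUDIES groups of NUM_TRIALS_PER_STUDY seeds."""
    # Mixing the submission ID into the seed gives each submission its
    # own reproducible seed sequence.
    rng = random.Random(f"{seed}_{submission_id}")
    return [[rng.randrange(2**31) for _ in range(NUM_TRIALS_PER_STUDY)]
            for _ in range(NUM_STUDIES)]
```

Because both functions are pure functions of their seed inputs, anyone holding the published seed can re-run them and verify the sampled workloads and trial seeds.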
@fsschneider fsschneider added the P1 Launch 2023 High priority issues for October 2023 AlgoPerf Launch label Oct 10, 2023

priyakasimbeg commented Jan 19, 2024

@fsschneider I'm not sure if I understand the second sampling function. Please correct me if this is wrong; assuming the hyperparameter configurations correspond to tuning trials (which should be 5 trials now), what do the studies correspond to?

@priyakasimbeg priyakasimbeg mentioned this issue Jan 25, 2024

priyakasimbeg commented Jan 30, 2024

@fsschneider, @georgedahl just noticed our technical documentation says: "After the submission deadline, one held-out workload will be sampled for each fixed workload".
Can you clarify whether we will sample 1 held-out workload per dataset (6 total) or 1 held-out workload per base workload (8 total)?
Upon further reflection, it seems like we may want 1 per base workload. If we only have 6, then we'd have to either

  1. apply the held-out workload scoring criteria to the two base workloads for which no variants were sampled, by using the variants that were sampled for the other 2 workloads that use the same dataset.
  2. not apply the held-out workload scoring criteria to the two base workloads for which no variants were sampled.
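To make the two counts concrete, here is a sketch assuming the 8 base workloads span 6 datasets (the mapping below is illustrative): sampling one variant per dataset leaves two base workloads without a sampled variant, which under option 2 simply receive no held-out scoring.

```python
import random

# Illustrative mapping (not the authoritative AlgoPerf list): 8 base
# workloads over 6 datasets. Two datasets each back two base workloads,
# so sampling one variant per dataset yields 6 variants, not 8.
BASE_WORKLOADS_BY_DATASET = {
    "criteo1tb": ["criteo1tb_dlrm"],
    "fastmri": ["fastmri_unet"],
    "imagenet": ["imagenet_resnet", "imagenet_vit"],
    "librispeech": ["librispeech_conformer", "librispeech_deepspeech"],
    "ogbg": ["ogbg_gnn"],
    "wmt": ["wmt_transformer"],
}


def sample_one_variant_per_dataset(seed: int):
    """Pick, per dataset, the base workload whose held-out variant is
    used. Base workloads sharing that dataset but not picked get no
    held-out variant (option 2: no held-out scoring for them)."""
    rng = random.Random(seed)
    sampled = {ds: rng.choice(bases)
               for ds, bases in sorted(BASE_WORKLOADS_BY_DATASET.items())}
    all_bases = [b for bases in BASE_WORKLOADS_BY_DATASET.values()
                 for b in bases]
    unscored = [b for b in all_bases if b not in sampled.values()]
    return sampled, unscored
```

With this mapping, `sampled` always has 6 entries and `unscored` has 2, regardless of the seed.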

@priyakasimbeg

From the offline thread, it looks like we want to sample 6 total and score with option 2.
