Add optimizer configuration to eval #125
base: main
Conversation
Co-authored-by: Benjamin Warner <[email protected]>
Ok I think this is good to go ...
There's one error which needs to be fixed, and I'm curious to hear if you have any thoughts on potentially avoiding the second set of hardcoded variables.
Otherwise looks good.
mlmmlu_amateur_semipro["optimizer"] = OrderedDict([
    ('name', 'decoupled_adamw'),
    ('lr', 2.0e-5),
    ('betas', [0.9, 0.98]),
    ('eps', 1e-06),
    ('weight_decay', 5.0e-06),
])
I'm not sure that adding a second set of hardcoded defaults is the best idea, but I'm also not sure there's an easy way to support a dynamic input in typer.
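For discussion, here's a minimal sketch of one way to sidestep a second set of hardcoded defaults: expose an optional YAML file through a typer option and merge it onto a single set of defaults. The `--optimizer-config` option name, the helper structure, and the shallow-merge behavior are all assumptions for illustration, not code from this PR.

# Sketch only: one set of defaults, optionally overridden from a YAML file.
from collections import OrderedDict
from pathlib import Path
from typing import Optional

import typer
import yaml

DEFAULT_OPTIMIZER = OrderedDict([
    ("name", "decoupled_adamw"),
    ("lr", 2.0e-5),
    ("betas", [0.9, 0.98]),
    ("eps", 1e-06),
    ("weight_decay", 5.0e-06),
])

app = typer.Typer()

@app.command()
def main(
    optimizer_config: Optional[Path] = typer.Option(
        None, help="Optional YAML file overriding the optimizer defaults."
    ),
):
    optimizer = OrderedDict(DEFAULT_OPTIMIZER)
    if optimizer_config is not None:
        # Shallow-merge user overrides on top of the defaults.
        optimizer.update(yaml.safe_load(optimizer_config.read_text()))
    typer.echo(f"optimizer: {dict(optimizer)}")

if __name__ == "__main__":
    app()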
One option is to remove the hardcoded optimizer defaults from the ClassificationJob
classes. I think we want this driven by the config for two reasons (see the sketch after this list):
- It's clear to the user how the optimizer is set up without them having to go look in the classes for each task.
- It will make logging these things in wandb a bit easier (not 100% on this)
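A rough sketch of what the config-driven alternative could look like: the job class carries no per-task optimizer defaults and simply receives whatever the run config specifies. The simplified `ClassificationJob` signature here is hypothetical, not the real class.

# Hypothetical sketch: no hardcoded optimizer inside the job class.
from collections import OrderedDict

class ClassificationJob:  # simplified stand-in for the real class
    def __init__(self, optimizer_config: OrderedDict):
        # Whatever the config says is what runs; nothing hidden in the class.
        self.optimizer_config = optimizer_config

run_config = {
    "optimizer": OrderedDict([
        ("name", "decoupled_adamw"),
        ("lr", 2.0e-5),
        ("betas", [0.9, 0.98]),
        ("eps", 1e-06),
        ("weight_decay", 5.0e-06),
    ]),
}

job = ClassificationJob(optimizer_config=run_config["optimizer"])
print(dict(job.optimizer_config))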
Changes
Instead of the hardcoded optimizers used in task-specific train/eval runs,
run_evals_from_checkpoints.py
will now create a default optimizer configuration as well as default task-specific overrides, sketched below.
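A minimal sketch of the described behavior: build one default optimizer config, then apply per-task overrides on top of it. The variable names, the default values, and the `optimizer_for` helper are illustrative assumptions, not the actual code in run_evals_from_checkpoints.py.

# Sketch only: defaults plus task-specific overrides.
from collections import OrderedDict

default_optimizer = OrderedDict([
    ("name", "decoupled_adamw"),
    ("lr", 1.0e-5),          # assumed default; the real value lives in the script
    ("betas", [0.9, 0.98]),
    ("eps", 1e-06),
    ("weight_decay", 1.0e-06),
])

task_overrides = {
    "mlmmlu_amateur_semipro": {"lr": 2.0e-5, "weight_decay": 5.0e-06},
}

def optimizer_for(task: str) -> OrderedDict:
    config = OrderedDict(default_optimizer)
    config.update(task_overrides.get(task, {}))
    return config

print(dict(optimizer_for("mlmmlu_amateur_semipro")))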
Discussions
Discussed with @warner-benjamin to bring the finetuning configs more in line with the pretraining configs.
Tests
No tests added