
Add optimizer configuration to eval #125

Open

ohmeow wants to merge 7 commits into main
Conversation

@ohmeow (Collaborator) commented Oct 18, 2024

Changes

Instead of using hardcoded optimizers in task-specific train/eval runs, run_evals_from_checkpoints.py will now create a default optimizer configuration as well as default task-specific overrides to be used.
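
A minimal sketch of the default-plus-override pattern described above (variable names and the default values here are illustrative, not the exact ones in run_evals_from_checkpoints.py; the override values match the mlmmlu_amateur_semipro snippet discussed further down):

from collections import OrderedDict

# Shared default optimizer configuration used by every eval task
# (placeholder values).
default_optimizer = OrderedDict([
    ('name', 'decoupled_adamw'),
    ('lr', 1.0e-5),
    ('betas', [0.9, 0.98]),
    ('eps', 1e-06),
    ('weight_decay', 1.0e-06),
])

# Task-specific override: start from the default and replace only the
# fields that differ for this task.
mlmmlu_amateur_semipro_optimizer = OrderedDict(default_optimizer)
mlmmlu_amateur_semipro_optimizer.update({'lr': 2.0e-5, 'weight_decay': 5.0e-06})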

Discussions

Discussed with @warner-benjamin to bring the finetuning configs more in line with the pretraining configs.

Tests
No tests added

  • Is the new feature tested? (Not always necessary for all changes -- just adding to the checklist to keep track)
  • Have you run all the tests?
  • Do the tests all pass?
  • If not, have you included an explanation of which tests this PR breaks and/or why (below this checklist)?

ablation_eval.py (outdated review comment, resolved)
@ohmeow (Collaborator, Author) commented Oct 23, 2024

Ok I think this is good to go ...

@warner-benjamin (Contributor) left a comment


There's one error which needs to be fixed, and I'm curious to hear if you have any thoughts on potentially avoiding the second set of hardcoded variables.

Otherwise looks good.

Comment on lines +271 to +277
mlmmlu_amateur_semipro["optimizer"] = OrderedDict([
('name', 'decoupled_adamw'),
('lr', 2.0e-5),
('betas', [0.9, 0.98]),
('eps', 1e-06),
('weight_decay', 5.0e-06),
])
@warner-benjamin (Contributor) commented:

I'm not sure that adding a second set of hardcoded defaults is the best idea, but I'm not sure there's an easy way to support a dynamic input in typer.
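
One possible shape for such a dynamic input, sketched here with a hypothetical --optimizer-overrides option that takes a JSON string and is merged over a shared default (not part of this PR; option names and default values are assumptions):

import json
from collections import OrderedDict

import typer

app = typer.Typer()

# Shared default; individual fields can be overridden from the CLI
# (placeholder values).
DEFAULT_OPTIMIZER = OrderedDict([
    ('name', 'decoupled_adamw'),
    ('lr', 1.0e-5),
    ('betas', [0.9, 0.98]),
    ('eps', 1e-06),
    ('weight_decay', 1.0e-06),
])

@app.command()
def run(
    optimizer_overrides: str = typer.Option(
        '{}', help='JSON object of optimizer fields to override, e.g. \'{"lr": 2e-5}\''
    ),
):
    # Merge user-supplied overrides over the shared default.
    optimizer = OrderedDict(DEFAULT_OPTIMIZER)
    optimizer.update(json.loads(optimizer_overrides))
    typer.echo(f'optimizer config: {dict(optimizer)}')

if __name__ == '__main__':
    app()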

@ohmeow (Collaborator, Author) replied:

One option is to remove the hardcoded optimizer defaults from the ClassificationJob classes. I think we want this driven by the config for two reasons (see the sketch after this list):

  1. It's clear to the user how the optimizer is set up without them having to go look in the classes for each task.
  2. It will make logging these things in wandb a bit easier (not 100% on this).
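
A rough sketch of the config-driven direction described in this list; build_optimizer_config and the job constructor shown here are hypothetical, not the actual ClassificationJob API:

from collections import OrderedDict

def build_optimizer_config(default_cfg, task_overrides=None):
    # Merge task-specific overrides over the shared default so the final
    # config is assembled in one place (and easy to log, e.g. to wandb).
    cfg = OrderedDict(default_cfg)
    cfg.update(task_overrides or {})
    return cfg

class EvalJobSketch:
    # Hypothetical job that receives the already-merged optimizer config
    # instead of hardcoding its own defaults.
    def __init__(self, optimizer_config):
        self.optimizer_config = optimizer_config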

ablation_eval.py (review comment, resolved)