
Add optimizer configuration to eval #125

Open

ohmeow wants to merge 7 commits into main
Conversation

@ohmeow (Collaborator) commented Oct 18, 2024

Changes

Instead of using hardcoded optimizers in task-specific train/eval runs, run_evals_from_checkpoints.py will now create a default optimizer configuration as well as default task-specific overrides to be used.
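
A minimal sketch of the default-plus-override pattern described above (variable names and the default values here are illustrative, not the exact ones in run_evals_from_checkpoints.py; the override values match the mlmmlu_amateur_semipro snippet discussed further down):

from collections import OrderedDict

# Shared default optimizer configuration used by every eval task
# (placeholder values).
default_optimizer = OrderedDict([
    ('name', 'decoupled_adamw'),
    ('lr', 1.0e-5),
    ('betas', [0.9, 0.98]),
    ('eps', 1e-06),
    ('weight_decay', 1.0e-06),
])

# Task-specific override: start from the default and replace only the
# fields that differ for this task.
mlmmlu_amateur_semipro_optimizer = OrderedDict(default_optimizer)
mlmmlu_amateur_semipro_optimizer.update({'lr': 2.0e-5, 'weight_decay': 5.0e-06})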

Discussions

Discussed with @warner-benjamin to bring the finetuning configs more in line with the pretraining configs.

Tests
No tests added

  • Is the new feature tested? (Not always necessary for all changes -- just adding to the checklist to keep track)
  • Have you run all the tests?
  • Do the tests all pass?
  • If not, have you included an explanation of which tests this PR breaks and/or why (below this checklist)?

ablation_eval.py (outdated review comment, resolved)
@ohmeow (Collaborator, Author) commented Oct 23, 2024

Ok I think this is good to go ...

@warner-benjamin (Contributor) left a comment


There's one error which needs to be fixed, and I'm curious to hear if you have any thoughts on potentially avoiding the second set of hardcoded variables.

Otherwise looks good.

Comment on lines +271 to +277
mlmmlu_amateur_semipro["optimizer"] = OrderedDict([
('name', 'decoupled_adamw'),
('lr', 2.0e-5),
('betas', [0.9, 0.98]),
('eps', 1e-06),
('weight_decay', 5.0e-06),
])
@warner-benjamin (Contributor) commented:

I'm not sure that adding a second set of hardcoded defaults is the best idea, but I'm not sure there's an easy way to support a dynamic input in typer.
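
One possible shape for such a dynamic input, sketched here with a hypothetical --optimizer-overrides option that takes a JSON string and is merged over a shared default (not part of this PR; option names and default values are assumptions):

import json
from collections import OrderedDict

import typer

app = typer.Typer()

# Shared default; individual fields can be overridden from the CLI
# (placeholder values).
DEFAULT_OPTIMIZER = OrderedDict([
    ('name', 'decoupled_adamw'),
    ('lr', 1.0e-5),
    ('betas', [0.9, 0.98]),
    ('eps', 1e-06),
    ('weight_decay', 1.0e-06),
])

@app.command()
def run(
    optimizer_overrides: str = typer.Option(
        '{}', help='JSON object of optimizer fields to override, e.g. \'{"lr": 2e-5}\''
    ),
):
    # Merge user-supplied overrides over the shared default.
    optimizer = OrderedDict(DEFAULT_OPTIMIZER)
    optimizer.update(json.loads(optimizer_overrides))
    typer.echo(f'optimizer config: {dict(optimizer)}')

if __name__ == '__main__':
    app()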

@ohmeow (Collaborator, Author) replied:

One option is to remove the hardcoded optimizer defaults from the ClassificationJob classes. I think we want this driven by the config for two reasons (see the sketch after this list):

  1. It's clear to the user how the optimizer is set up without them having to go look in the classes for each task.
  2. It will make logging these things in wandb a bit easier (not 100% on this).
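
A rough sketch of the config-driven direction described in this list; build_optimizer_config and the job constructor shown here are hypothetical, not the actual ClassificationJob API:

from collections import OrderedDict

def build_optimizer_config(default_cfg, task_overrides=None):
    # Merge task-specific overrides over the shared default so the final
    # config is assembled in one place (and easy to log, e.g. to wandb).
    cfg = OrderedDict(default_cfg)
    cfg.update(task_overrides or {})
    return cfg

class EvalJobSketch:
    # Hypothetical job that receives the already-merged optimizer config
    # instead of hardcoding its own defaults.
    def __init__(self, optimizer_config):
        self.optimizer_config = optimizer_config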

ablation_eval.py (review comment, resolved)