Keyword save_metric_vars can trigger training_log.csv accessing errors when load_weights: True #96

Open
yingkaisha opened this issue Sep 7, 2024 · 0 comments

When load_weights: True, CREDIT models read previous training information from training_log.csv. If the save_metric_vars option differs between the previous epochs and the current epoch (e.g., a new variable was added to save_metric_vars), base_trainer.py raises an error when reading or writing training_log.csv:

[rank1]: Traceback (most recent call last):
[rank1]:   File "/glade/u/home/ksha/miles-credit/applications/train_multistep.py", line 590, in <module>
[rank1]:     main(world_rank, world_size, conf, backend)
[rank1]:   File "/glade/u/home/ksha/miles-credit/applications/train_multistep.py", line 423, in main
[rank1]:     result = trainer.fit(
[rank1]:              ^^^^^^^^^^^^
[rank1]:   File "/glade/u/home/ksha/.local/lib/python3.11/site-packages/credit/trainers/base_trainer.py", line 491, in fit
[rank1]:     result = {k: v[best_epoch] for k, v in results_dict.items()}
[rank1]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/glade/u/home/ksha/.local/lib/python3.11/site-packages/credit/trainers/base_trainer.py", line 491, in <dictcomp>
[rank1]:     result = {k: v[best_epoch] for k, v in results_dict.items()}
[rank1]:                  ~^^^^^^^^^^^^
[rank1]: IndexError: list index out of range
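
The sketch below is one possible reading of the traceback, not the actual base_trainer.py code. It assumes results_dict maps each metric name to a list with one entry per epoch, and that a metric added to save_metric_vars after resuming only has entries for the new epochs, so its list is shorter than the others. The metric names, the values, and the pad_results helper are all hypothetical.

```python
import math


def pad_results(results_dict, fill=math.nan):
    """Left-pad shorter metric lists so every list has one entry per epoch.

    Hypothetical helper: metrics added after a restart have no values for
    earlier epochs, so those positions are filled with NaN.
    """
    n_epochs = max((len(v) for v in results_dict.values()), default=0)
    return {
        k: [fill] * (n_epochs - len(v)) + list(v)
        for k, v in results_dict.items()
    }


# Assumed state after resuming: two epochs were logged before the restart,
# and "val_new_metric" was added to save_metric_vars for the third epoch only.
results_dict = {
    "epoch": [0, 1, 2],
    "valid_loss": [0.9, 0.5, 0.7],
    "val_new_metric": [0.3],
}
best_epoch = 1

# Same dict comprehension as in the traceback; the short list cannot be indexed.
try:
    result = {k: v[best_epoch] for k, v in results_dict.items()}
except IndexError as err:
    failure = str(err)

# With padded lists, the lookup succeeds and the missing value is NaN.
padded = pad_results(results_dict)
result = {k: v[best_epoch] for k, v in padded.items()}
```

Under these assumptions, padding on the left keeps each value aligned with the epoch in which it was recorded. Whether missing values should be NaN-filled or the lookup should skip incomplete metrics is a design choice for the maintainers.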
