Spiking loss #36

Open
davipatti opened this issue Jan 12, 2025 · 2 comments

Comments

@davipatti

Hi @theosanderson

Firstly thanks for developing chronumental, it's been really helpful!

I've noticed that the loss and error metrics spike quite noticeably during runs:

[image: plot of the loss and error metrics over the course of a run, showing the spikes]

I don't know enough about the optimizer that is being used to know whether this is expected behaviour, but it does seem a little odd.

I guess you could get unlucky if the run finished during a spike. In this run it looks like the final parameters are actually just at the end of a spike and not properly back on the plateau, which I assume would be preferable.

Anyway, I wondered whether this is something to be concerned about, whether it's pointing to me doing something wrong, and/or whether it might point you to something that could be improved.

Happy to share more details and the data.

Many thanks
David

@theosanderson
Owner

Hi David,

A couple of things:

On this very specific question:

> In this run it looks like the final parameters are actually just at the end of a spike and not properly back on the plateau, which I assume would be preferable.

My belief is that this should not be an issue.

image

Unless this is set, we should use the params that gave the lowest loss (unless there's a bug I'm not aware of).
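For readers unfamiliar with the pattern, here is a minimal sketch of "keep the params with the lowest loss" in plain Python. It is an illustration only, not Chronumental's actual code: `optimise_keeping_best`, `make_spiky_step`, and the toy loss are hypothetical names invented for this example, and the spike is simulated by hand.

```python
import copy


def optimise_keeping_best(params, step_fn, loss_fn, n_steps):
    """Run n_steps updates and return (best_params, best_loss, final_params).

    Hypothetical helper: step_fn maps params -> new params, and
    loss_fn maps params -> float. Only post-update params are evaluated.
    """
    best_loss = float("inf")
    best_params = copy.deepcopy(params)
    for _ in range(n_steps):
        params = step_fn(params)
        loss = loss_fn(params)
        # Snapshot the params whenever the loss improves on the best so far.
        if loss < best_loss:
            best_loss = loss
            best_params = copy.deepcopy(params)
    return best_params, best_loss, params


def make_spiky_step(spike_at):
    """Toy update rule: halve x each step, but jump away at step `spike_at`."""
    state = {"i": 0}

    def step(x):
        state["i"] += 1
        if state["i"] == spike_at:
            return x + 5.0  # simulated spike
        return x * 0.5

    return step


def loss_fn(x):
    """Toy quadratic loss with its minimum at x = 0."""
    return x * x


# The run ends one step after the spike, before the loss has recovered.
best_params, best_loss, final_params = optimise_keeping_best(
    8.0, make_spiky_step(spike_at=5), loss_fn, n_steps=6
)
print(best_params, best_loss, final_params)
```

In this toy run the final params are still recovering from the spike, but the returned best params come from the step just before it, so finishing mid-spike does not affect the result.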

More generally, though, I have a fair bit of paranoia about whether I may have broken some bits of Chronumental with changes made over the last couple of years, and I have some overdue benchmarking to perform. It is certainly good to be cautious about its behaviour.

All the best

Theo

@davipatti
Author

Thanks for the response.

It's reassuring that by default the params with the lowest loss (and not the final set) get used.
