Implement Adam optimizer #115

Closed
milancurcic opened this issue Jan 17, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@milancurcic
Member

Proposed by @rweed in #114.

Paper: https://arxiv.org/abs/1412.6980
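
For reference, the update rule from the paper (Algorithm 1), for parameters $\theta$ with gradient $g_t$, step size $\alpha$, decay rates $\beta_1, \beta_2$, and a small $\epsilon$ for numerical stability:

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\,g_t \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2 \\
\hat m_t &= m_t/(1-\beta_1^t), \qquad \hat v_t = v_t/(1-\beta_2^t) \\
\theta_t &= \theta_{t-1} - \alpha\,\hat m_t/\left(\sqrt{\hat v_t} + \epsilon\right)
\end{aligned}
$$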

Currently, the optimizers module is only a stub, and the only available optimizer (SGD) is hardcoded in the network % train method, with the weight updates propagating all the way down to the individual concrete layer implementations. Some refactoring is needed to decouple the weight updates from the concrete layer implementations and to allow defining optimizer algorithms in their own concrete types.
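
One possible shape for that refactor, as a rough sketch only (the module, type, and procedure names below are hypothetical, not the library's actual API): optimizers become concrete types extending an abstract base, and layers hand their parameters and gradients to the optimizer instead of updating the weights themselves.

```fortran
module optimizer_base_m
  implicit none

  ! Hypothetical abstract optimizer: concrete optimizers (SGD, Adam, ...)
  ! extend this type and implement minimize(); layers pass their parameters
  ! and gradients in rather than applying the update themselves.
  type, abstract :: optimizer_base
    real :: learning_rate = 0.01
  contains
    procedure(minimize_interface), deferred :: minimize
  end type optimizer_base

  abstract interface
    subroutine minimize_interface(self, param, gradient)
      import :: optimizer_base
      class(optimizer_base), intent(inout) :: self
      real, intent(inout) :: param(:)
      real, intent(in) :: gradient(:)
    end subroutine minimize_interface
  end interface

  ! Plain SGD expressed as one such concrete type.
  type, extends(optimizer_base) :: sgd
  contains
    procedure :: minimize => sgd_minimize
  end type sgd

contains

  subroutine sgd_minimize(self, param, gradient)
    ! SGD step: param <- param - learning_rate * gradient
    class(sgd), intent(inout) :: self
    real, intent(inout) :: param(:)
    real, intent(in) :: gradient(:)
    param = param - self % learning_rate * gradient
  end subroutine sgd_minimize

end module optimizer_base_m
```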

@milancurcic added the enhancement (New feature or request) label on Jan 17, 2023
@rweed
Contributor

rweed commented Jan 18, 2023

Milan

FYI, I found one Fortran implementation of Adam at
https://github.com/thchang/NN_MOD
Unfortunately, the comments in the code appear to suggest it was implemented but never tested. I'm still looking for a batch normalization implementation (outside of Keras).

Also, one of my other "wants", linear layers, is trivial to implement. It took me all of about 5 minutes to do that in your existing code.

@milancurcic
Member Author

Thanks for the link to NN_MOD. I'd like to work on Adam first. I think it's easier to implement than batch norm, and it will drive the much-needed refactor for optimizers in general (rather than having them hardcoded in the network % train subroutine).
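
As a rough sketch of how Adam could slot into that kind of abstraction (hypothetical module and type names, continuing the sketch in the issue description; not the eventual API), following Algorithm 1 of the paper:

```fortran
module adam_m
  use optimizer_base_m, only: optimizer_base   ! hypothetical module from the sketch above
  implicit none

  type, extends(optimizer_base) :: adam
    real :: beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8
    real, allocatable :: m(:), v(:)   ! first and second moment estimates
    integer :: t = 0                  ! time step, for bias correction
  contains
    procedure :: minimize => adam_minimize
  end type adam

contains

  subroutine adam_minimize(self, param, gradient)
    ! One Adam step following Algorithm 1 of Kingma & Ba (2014).
    class(adam), intent(inout) :: self
    real, intent(inout) :: param(:)
    real, intent(in) :: gradient(:)
    real, allocatable :: m_hat(:), v_hat(:)

    ! Lazily allocate the moment vectors to match the parameter size.
    if (.not. allocated(self % m)) then
      allocate(self % m(size(param)))
      allocate(self % v(size(param)))
      self % m = 0
      self % v = 0
    end if

    self % t = self % t + 1
    self % m = self % beta1 * self % m + (1 - self % beta1) * gradient
    self % v = self % beta2 * self % v + (1 - self % beta2) * gradient**2

    ! Bias-corrected moment estimates
    m_hat = self % m / (1 - self % beta1**self % t)
    v_hat = self % v / (1 - self % beta2**self % t)

    param = param - self % learning_rate * m_hat / (sqrt(v_hat) + self % epsilon)
  end subroutine adam_minimize

end module adam_m
```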

Would you like to contribute the linear layer here as a PR? As I understand it, it's just a dense layer but without an activation. Are you just using a dense layer but with a "no-op" activation function (i.e. y = x)?
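
For illustration, such a no-op ("linear") activation and its derivative could be as simple as the following sketch (hypothetical names, not necessarily how it would plug into the existing activation interface):

```fortran
module linear_activation_m
  implicit none
contains

  ! Identity ("no-op") activation: y = x. With this as the activation
  ! function, a dense layer reduces to a linear layer.
  pure function linear(x) result(res)
    real, intent(in) :: x(:)
    real :: res(size(x))
    res = x
  end function linear

  ! Its derivative is 1 everywhere.
  pure function linear_prime(x) result(res)
    real, intent(in) :: x(:)
    real :: res(size(x))
    res = 1
  end function linear_prime

end module linear_activation_m
```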

@rweed
Contributor

rweed commented Jan 19, 2023 via email

@milancurcic
Member Author

@Spnetic-5, would you like to tackle this one next? I forget whether you already have a WIP implementation of Adam or AdaGrad.

@Spnetic-5
Collaborator

Yes, the Adam optimizer implementation is in progress; I'll make a PR soon.

@milancurcic
Member Author

Done by #150.
