All-In-One Adam Optimizer in PyTorch

An all-in-one Adam optimizer that combines ideas from the following papers:

Decoupled Weight Decay Regularization for Adam https://arxiv.org/abs/1711.05101

The authors showed that a main reason SGD with momentum often generalizes better than Adam is that Adam does not perform well under L2 regularization, and they developed decoupled weight decay as a solution.
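To make the difference concrete, here is a minimal scalar sketch of one Adam step with both kinds of decay. It is plain Python rather than the tensor code in `AdamAIO.py`, and the function name `adam_step` and its arguments `l2` and `decoupled_wd` are hypothetical, chosen only for illustration.

```python
import math

def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
              l2=0.0, decoupled_wd=0.0):
    """One scalar Adam step (illustrative sketch, not the repository's code).

    l2           -- L2 penalty folded into the gradient (classic Adam + L2)
    decoupled_wd -- decay applied directly to the weight (decoupled style)
    """
    g = g + l2 * w                           # L2 term goes through the adaptive scaling
    m = beta1 * m + (1 - beta1) * g          # first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    adaptive = lr * m_hat / (math.sqrt(v_hat) + eps)
    decay = lr * decoupled_wd * w            # decay bypasses the adaptive scaling
    return w - adaptive - decay, m, v

# First step with a zero loss gradient, so only the decay term acts on w.
w_l2, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, l2=0.01)
w_dec, _, _ = adam_step(1.0, 0.0, 0.0, 0.0, t=1, decoupled_wd=0.01)
print(w_l2, w_dec)
```

In this setup the L2 variant moves the weight by roughly `lr`, because the decay term is normalized by its own magnitude, while the decoupled variant shrinks the weight by `lr * decoupled_wd * w` as intended.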

Online Learning Rate Adaptation with Hypergradient Descent https://arxiv.org/abs/1703.04782

This is enabled by setting the "hypergrad" parameter to any non-zero value. The optimizer then updates the learning rate itself using the technique proposed in the paper, instead of relying on an external schedule that would require many additional hyperparameters. It is especially useful when there is no opportunity to tune a schedule.
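The idea can be sketched on a scalar problem. The example below uses the simpler SGD variant of hypergradient descent from the paper, not the Adam variant this optimizer would need, and the names `sgd_hd` and `hyper_lr` are hypothetical. The learning rate is nudged by the product of the current and previous gradients, so it grows while successive gradients agree in sign.

```python
def sgd_hd(grad_fn, w, lr=0.01, hyper_lr=1e-3, steps=50):
    """Scalar SGD with hypergradient learning-rate adaptation (illustrative sketch)."""
    prev_g = 0.0
    lrs = []
    for _ in range(steps):
        g = grad_fn(w)
        lr = lr + hyper_lr * g * prev_g   # hypergradient step on the learning rate
        w = w - lr * g                    # ordinary SGD step with the adapted rate
        prev_g = g
        lrs.append(lr)
    return w, lrs

# Minimize f(w) = w^2, whose gradient is 2w, starting from a small learning rate.
w_final, lrs = sgd_hd(lambda w: 2.0 * w, w=5.0)
print(w_final, lrs[0], lrs[-1])
```

Here the learning rate starts at 0.01 and is raised automatically during the first few steps, without any hand-written schedule.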

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks https://arxiv.org/abs/1806.06763

This is set via the "partial" parameter, which controls how closely the optimizer behaves like Adam (1.0) or SGD (0.0); it is very useful when tuned. The parameter can also be updated (decayed) online to switch between Adam and SGD in an easy way, which previous research has recommended for better generalization.
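A minimal sketch of how such an exponent can interpolate between the two optimizers is shown below. It assumes the exponent is applied to Adam's usual denominator, which matches the 1.0 and 0.0 endpoints described above; the paper itself applies an exponent p in (0, 1/2] to the second-moment estimate, and the actual code in `AdamAIO.py` may differ. The function name `partial_update` is hypothetical.

```python
import math

def partial_update(m_hat, v_hat, lr, partial, eps=1e-8):
    """Size of one partially adaptive step (illustrative sketch).

    partial = 1.0 divides by sqrt(v_hat) + eps, as Adam does.
    partial = 0.0 divides by 1, leaving a momentum-SGD style step.
    """
    denom = (math.sqrt(v_hat) + eps) ** partial
    return lr * m_hat / denom

# Same moment estimates, three settings of the exponent.
for p in (0.0, 0.5, 1.0):
    print(p, partial_update(m_hat=0.5, v_hat=4.0, lr=0.1, partial=p))
```

Decaying `partial` from 1.0 toward 0.0 over training would move the update smoothly from the Adam form to the SGD form.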

AdaBound with Decoupled Weight Decay

Adaptive Gradient Methods with Dynamic Bound of Learning Rate https://github.com/Luolc/AdaBound

Exploiting Uncertainty of Loss Landscape for Stochastic Optimization https://github.com/bsvineethiitg/adams
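For the AdaBound entry above, the dynamic bounds can be sketched as follows. The formulas follow the reference AdaBound implementation as far as I know; the function names and the scalar form are assumptions for illustration, and how `AdamAIO.py` wires them in is not shown here.

```python
def adabound_bounds(step, final_lr=0.1, gamma=1e-3):
    """Lower and upper bounds on the per-parameter step size at a given step (step >= 1)."""
    lower = final_lr * (1 - 1 / (gamma * step + 1))
    upper = final_lr * (1 + 1 / (gamma * step))
    return lower, upper

def clip_step_size(step_size, step, final_lr=0.1, gamma=1e-3):
    """Clamp an Adam-style step size into the current bounds."""
    lower, upper = adabound_bounds(step, final_lr, gamma)
    return min(max(step_size, lower), upper)

# Early on the bounds are loose (Adam-like); later both approach final_lr (SGD-like).
for step in (1, 1_000, 1_000_000):
    print(step, adabound_bounds(step))
```

Because both bounds converge to `final_lr`, the optimizer behaves like Adam at the start of training and like SGD with a fixed learning rate toward the end.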

kayuksel/pytorch-adamaio
