Fix FusedAdam.zero_grad(set_to_none=True) #1579
base: master
apex/optimizers/fused_adam.py
Outdated
@@ -94,8 +94,8 @@ def __init__(self, params, lr=1e-3, bias_correction=True,
         else:
             raise RuntimeError('apex.optimizers.FusedAdam requires cuda extensions')
 
-    def zero_grad(self):
-        if self.set_grad_none:
+    def zero_grad(self, set_to_none=None):
QQ: why is the default `None`? Given that the defaults of both `set_grad_none` and `torch.optim.Optimizer.zero_grad`'s `set_to_none` are `True`, it would make sense to me to default it to `True`.
Also, would it make sense to check that `set_to_none` and `self.set_grad_none` are the same, and raise a `RuntimeError` if they are not?
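For illustration, the check suggested above could look roughly like the sketch below. The helper name `check_flags_agree` is hypothetical and not part of apex; it only shows the idea of rejecting a call whose `set_to_none` disagrees with the constructor's `set_grad_none`.

```python
def check_flags_agree(set_grad_none: bool, set_to_none: bool) -> bool:
    """Hypothetical helper (not an apex API): return the agreed flag, or raise on conflict."""
    if set_to_none != set_grad_none:
        raise RuntimeError(
            f"set_to_none={set_to_none} conflicts with set_grad_none={set_grad_none}"
        )
    return set_to_none
```

With such a check, a mismatch fails loudly instead of one flag silently winning over the other.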
Also note that this is a breaking change in `torch`: `set_to_none` defaults to `False` in `1.13` and to `True` in `master`. I'm not sure which one would make more sense here.
+    def zero_grad(self, set_to_none=True):
+        if self.set_grad_none or set_to_none:
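To make the effect of the `or` condition concrete, here is a minimal, torch-free sketch. `FakeParam` and `SketchOptimizer` are hypothetical stand-ins, not apex or PyTorch classes, and the zeroing branch is an assumption about what happens when neither flag is set.

```python
class FakeParam:
    """Stand-in for a parameter; `grad` is a list of floats or None."""

    def __init__(self, grad):
        self.grad = grad


class SketchOptimizer:
    """Hypothetical optimizer that mirrors only the flag logic from the diff."""

    def __init__(self, params, set_grad_none=True):
        self.param_groups = [{"params": list(params)}]
        self.set_grad_none = set_grad_none

    def zero_grad(self, set_to_none=True):
        # Either flag being true is enough to drop the gradients.
        drop = self.set_grad_none or set_to_none
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                if drop:
                    p.grad = None
                else:
                    # Assumed fallback: keep the buffer and fill it with zeros.
                    p.grad = [0.0] * len(p.grad)
```

Two consequences follow from the boolean logic alone: `zero_grad(set_to_none=False)` cannot override a constructor-level `set_grad_none=True`, and an optimizer built with `set_grad_none=False` still sets gradients to `None` unless the caller passes `set_to_none=False` explicitly.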
gently pinging @crcrpar
Does this look good to you? :)
This PR adds support for `set_to_none` to `FusedAdam.zero_grad`.
Following pytorch/pytorch#92731, `set_to_none=True` is the default, in accordance with current PyTorch master; see #1579 (comment).