
Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time. #5

Open
shaoeric opened this issue Sep 13, 2020 · 3 comments

Comments

@shaoeric

When updating the sub-network, is there any need to retain the graph, e.g. loss.backward(retain_graph=True)? When I reproduce the procedure the code raises an error, but I don't know whether retaining the graph is the correct fix.

@chxy95
Owner

chxy95 commented Jan 25, 2021

@shaoeric When I ran this code, this error did not appear, but your fix looks reasonable if your run does report it. There may indeed be some bugs in this code, since it is no longer maintained. The core of the repository is the implementation of the KL divergence loss, and I have verified that that part is correct.

@shaoeric
Author

@chxy95 Sorry, I solved the problem but forgot to close the issue. For this problem, I found a few solutions:

  1. First, check your PyTorch version; a newer version may be the cause of this confusing error.
  2. Check whether the teacher model's output is detached from the graph: teacher_out = teacher_out.detach().
  3. If there is more than one teacher model, putting their outputs into a list is better than stacking them into a single tensor, even if that tensor is detached (see the sketch below).
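A minimal sketch of points 2 and 3, assuming a standard knowledge-distillation loop; the models, shapes, and variable names here are hypothetical and not taken from this repository:

```python
import torch
import torch.nn.functional as F

# Hypothetical student/teacher models and data; only the detach/list pattern matters.
student = torch.nn.Linear(10, 5)
teachers = [torch.nn.Linear(10, 5), torch.nn.Linear(10, 5)]
optimizer = torch.optim.SGD(student.parameters(), lr=0.1)

x = torch.randn(8, 10)

# Point 2: detach teacher outputs so backward never reaches the teacher graph.
# Point 3: keep the outputs of multiple teachers in a list, not a stacked tensor.
teacher_outs = [t(x).detach() for t in teachers]

student_out = student(x)
loss = sum(
    F.kl_div(
        F.log_softmax(student_out, dim=1),
        F.softmax(t_out, dim=1),
        reduction="batchmean",
    )
    for t_out in teacher_outs
)

optimizer.zero_grad()
loss.backward()  # no retain_graph=True needed once teacher outputs are detached
optimizer.step()
```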

@wcyjerry

Yeah, you should use detach(), because the graph buffers from the teacher's forward pass are freed the first time you compute the KL loss and call backward.
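A minimal sketch of that failure mode, with hypothetical names (not code from this repository): if a teacher forward pass is reused across iterations without detach(), the second backward has to traverse the teacher's subgraph again, whose buffers were freed by the first backward.

```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Linear(10, 5)
student = torch.nn.Linear(10, 5)
x = torch.randn(8, 10)

# Teacher forward pass done once and reused across iterations.
teacher_out = teacher(x)            # still attached to the teacher's graph
# teacher_out = teacher(x).detach() # fix: cut it out of the graph

for _ in range(2):
    loss = F.kl_div(
        F.log_softmax(student(x), dim=1),
        F.softmax(teacher_out, dim=1),
        reduction="batchmean",
    )
    # Second iteration raises: "Trying to backward through the graph a second time,
    # but the buffers have already been freed."
    loss.backward()
```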
