At line 31 of elastic_weight_consolidation.py, the code computes the mean of the log-likelihoods, so grad_log_liklihood contains the mean of the gradients of the log-likelihoods. Line 35 then squares this mean of gradients.
This is WRONG: a diagonal element of the Fisher matrix is the sum of the squared gradients of the log-likelihoods, not the square of the sum of the gradients.
So for each input, the gradient of its log-likelihood must be calculated separately, then each gradient must be squared, and then the mean of these squares must be calculated.
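The distinction can be illustrated with a minimal, self-contained sketch. It is not the repository's code: it assumes a toy logistic-regression model with an analytic gradient, uses the given labels rather than labels sampled from the model, and the function names (`grad_log_likelihood`, `fisher_diag_per_sample`, `squared_mean_gradient`) are hypothetical names chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_log_likelihood(w, x, y):
    """Gradient of log p(y | x, w) for logistic regression: (y - sigmoid(w.x)) * x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(y - p) * xi for xi in x]


def fisher_diag_per_sample(w, data):
    """Mean over samples of the squared per-sample gradients (what the issue asks for)."""
    acc = [0.0] * len(w)
    for x, y in data:
        g = grad_log_likelihood(w, x, y)
        acc = [a + gi * gi for a, gi in zip(acc, g)]
    return [a / len(data) for a in acc]


def squared_mean_gradient(w, data):
    """Square of the mean gradient (what differentiating the mean log-likelihood, then squaring, gives)."""
    acc = [0.0] * len(w)
    for x, y in data:
        g = grad_log_likelihood(w, x, y)
        acc = [a + gi for a, gi in zip(acc, g)]
    return [(a / len(data)) ** 2 for a in acc]


# Toy parameters and data, chosen arbitrarily for the illustration.
w = [0.5, -0.25]
data = [([1.0, 2.0], 1), ([1.0, -1.0], 0), ([1.0, 0.5], 1), ([1.0, -2.0], 0)]

fisher = fisher_diag_per_sample(w, data)
sq_mean = squared_mean_gradient(w, data)
print("mean of squared gradients:", fisher)
print("squared mean gradient:    ", sq_mean)
```

In an autograd framework the same idea would mean one backward pass per input (or a per-sample-gradient utility) instead of a single backward pass on the averaged log-likelihood.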
aakutalev changed the title from "this ewc implementation CODE has ERROR which prevent ewc to work properly" to "this ewc implementation CODE has theoretical ERROR which prevent ewc to work properly" on Mar 13, 2021
Totally agree. But I think the difference between these two is minor.
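How large the difference is can be written down per parameter. As a sketch, with notation assumed here rather than taken from the thread: let g_i be the gradient of the log-likelihood of sample i with respect to one parameter, and N the number of samples. Then the gap between the two quantities is the variance of the per-sample gradients:

```latex
\frac{1}{N}\sum_{i=1}^{N} g_i^2 \;-\; \left(\frac{1}{N}\sum_{i=1}^{N} g_i\right)^{2}
  \;=\; \frac{1}{N}\sum_{i=1}^{N}\left(g_i - \bar{g}\right)^{2} \;\ge\; 0,
\qquad \bar{g} = \frac{1}{N}\sum_{i=1}^{N} g_i
```

So the two agree only when the per-sample gradients are nearly identical. Near a converged optimum the mean gradient is typically close to zero, in which case the squared mean is close to zero even when the mean of squares is not.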