Major changes:
Spinning Up now has PyTorch implementations of VPG, PPO, DDPG, TD3, and SAC, in addition to the existing TensorFlow versions.
Examples and exercises have been updated to include PyTorch versions as well.
The reward shift bug in the TensorFlow versions of VPG, TRPO, and PPO has been fixed.
The TensorFlow versions of DDPG, TD3, and SAC now update every N steps instead of at the end of each trajectory. The PyTorch versions of these algorithms behave the same way.
Spinning Up's SAC has been updated to the more recent variant of SAC that does not use a V-function. The tutorial page on SAC has been updated to describe this variant.
The benchmark page has been updated with reruns for all algorithms on all environments, using the latest version of the code.
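
To illustrate the "update every N steps" change to DDPG, TD3, and SAC noted above, here is a minimal sketch of such a schedule. It is not the Spinning Up training loop: the function name update_schedule and the parameters update_after and update_every are chosen for illustration, and the sketch assumes that each update point runs update_every gradient steps so that the ratio of gradient steps to environment steps stays at one.

```python
def update_schedule(total_steps, update_after, update_every):
    """Return (env_step, num_gradient_steps) pairs for a fixed-interval schedule.

    All names are illustrative assumptions, not confirmed Spinning Up API.
    """
    schedule = []
    for t in range(total_steps):
        # In a real agent: act in the environment and store the
        # transition in the replay buffer here.

        # Update at a fixed interval of environment steps, regardless of
        # whether a trajectory has just ended.
        if t >= update_after and t % update_every == 0:
            # Assumption: run update_every gradient steps per update point.
            schedule.append((t, update_every))
    return schedule


if __name__ == "__main__":
    for step, n_updates in update_schedule(10, 4, 3):
        print(f"step {step}: run {n_updates} gradient updates")
```

With this kind of schedule the amount of learning per environment step does not depend on episode length, which is not the case when updates happen only at the end of each trajectory.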