Practical_RL/week09_policy_II at master · yandexdataschool/Practical_RL

This section covers some steroids for policy gradient methods, along with a cool general trick called

While you already know algorithms that work with continuous action spaces, it can't hurt to learn something more specialized.