This section covers some steroids for policy gradient methods, along with a cool general trick called
- Lecture on NPG and TRPO by J. Schulman - video
- Alternative lecture on TRPO and open problems by... J. Schulman - video
- Our videos: lecture, seminar (PyTorch) (in Russian)
- Original articles - TRPO, NPG
While you already know algorithms that work with continuous action spaces, it can't hurt to learn something more specialized.