- Slides
- Video lecture by D. Silver - video
- Our lecture, seminar (PyTorch)
- Alternative lecture by J. Schulman part 1 - video
- Alternative lecture by J. Schulman part 2 - video
- Andrej Karpathy's post on policy gradients
- Actually proving the policy gradient for discounted rewards - article
- On variance of policy gradient and optimal baselines: article, another article
- Learn Advantage Actor Critic with a comic
- Generalizing log-derivative trick - url
- Combining policy gradient and q-learning - arxiv
- Variational perspective on reinforcement learning (from DeepBayes) - pdf
- Adversarial review of policy gradient - blog