week06_policy_based

Materials

More materials

  • Actually proving the policy gradient for discounted rewards - article

  • On variance of policy gradient and optimal baselines: article, another article

  • Learn Advantage Actor-Critic with a comic

  • Generalizing the log-derivative trick - url

  • Combining policy gradient and q-learning - arxiv

  • Variational perspective on reinforcement learning (from DeepBayes) - pdf

  • Adversarial review of policy gradient - blog

Run seminar notebook in Colab: Open In Colab

Run optional homework notebook in Colab: Open In Colab