From e7b9203cedf689946ea4fe0e314aa223b8692612 Mon Sep 17 00:00:00 2001
From: Prajyot Jadhav <92448515+Arcane-01@users.noreply.github.com>
Date: Sat, 30 Dec 2023 22:05:10 +0530
Subject: [PATCH] Added notes on Learning high-speed flight in Reinforcement Learning

---
 reinforcement_learning/README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/reinforcement_learning/README.md b/reinforcement_learning/README.md
index 0ce8e21..ddc4a80 100644
--- a/reinforcement_learning/README.md
+++ b/reinforcement_learning/README.md
@@ -2,6 +2,7 @@
 | Paper | Notes | Author | Summary |
 |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|:---------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
+| [Learning high-speed flight in the wild](https://www.science.org/doi/full/10.1126/scirobotics.abg5810) | [HackMD](https://hackmd.io/@Arcane-01/H1cvyQMwT) | [Prajyot](https://github.com/Arcane-01) | This paper presents an end-to-end approach using privileged learning to enable high-speed autonomous flight for quadrotors in complex, real-world environments by directly mapping noisy sensory observations to collision-free trajectories. |
 | [DREAM TO CONTROL: LEARNING BEHAVIORS BY LATENT IMAGINATION](https://arxiv.org/pdf/1912.01603.pdf) (ICLR '20) | [HackMD](https://hackmd.io/@iGBkTz2JQ2eBRM83nuhCuA/Hk9dpK0vd) | [Raj](https://github.com/RajGhugare19) | This paper focuses on learning long-horizon behaviors by propagating analytic value gradients through imagined trajectories using a recurrent state-space model (PlaNet, Hafner et al.). |
 | [The Value Equivalence Principle for Model-Based Reinforcement Learning](https://arxiv.org/abs/2011.03506) (NeurIPS '20) | [HackMD](https://hackmd.io/@Raj-Ghugare/HkEY6o9MP) | [Raj](https://github.com/RajGhugare19) | This paper introduces and studies the concept of value equivalence for reinforcement learning models with respect to a set of policies and value functions. It further shows that this principle can be leveraged to find models, constrained in representational capacity, that are better than their maximum-likelihood counterparts. |
 | [Stackelberg Actor-critic: A game theoretic perspective](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/rJFUQA1QO) | [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/rJFUQA1QO) | [Sharath](https://sharathraparthy.github.io/) | This paper formulates the interaction between the actor and critic as a Stackelberg game and leverages the implicit function theorem to compute accurate gradient updates for the actor and critic. |
@@ -17,4 +18,4 @@
 |[Rainbow: Combining Improvements in Deep Reinforcement Learning](https://arxiv.org/pdf/1710.02298.pdf)|[HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/BkYl3IkaK)|[Om](https://github.com/DigZator)| The paper discusses add-ons to DQN and A3C that can improve their performance, namely Double DQN, Prioritized Experience Replay, Dueling Network Architecture, Distributional Q-Learning, and Noisy DQN. |
 | [The Option-Critic Architecture](https://arxiv.org/abs/1609.05140) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/SyI7nv7_q) | [Om](https://github.com/DigZator) | The paper discusses a hierarchical reinforcement learning method based on temporal abstraction. |
 | [Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets](https://offline-rl-neurips.github.io/pdf/13.pdf) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/rkxHo6LL5) | [Om](https://github.com/DigZator) | The paper suggests, and provides experimental justification for, methods to tackle distribution shift. |
-| [FeUdal Networks for Hierarchical Reinforcement Learning](https://arxiv.org/abs/1703.01161) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/HJoIiDw_c) | [Om](https://github.com/DigZator) | This paper describes the FeUdal Network model, which employs a manager-worker hierarchy. |
\ No newline at end of file
+| [FeUdal Networks for Hierarchical Reinforcement Learning](https://arxiv.org/abs/1703.01161) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/HJoIiDw_c) | [Om](https://github.com/DigZator) | This paper describes the FeUdal Network model, which employs a manager-worker hierarchy. |