Reinforcement learning (RL) algorithms are used to discover policies that maximize reward. They have been used heavily in robotics so that an agent can learn the best possible set of actions based on the state it is in. However, current reinforcement learning algorithms do not guarantee safety during the learning or execution phases. For example, training an autonomous drone to avoid obstacles could involve physical collisions between the drone and an obstacle so that the drone receives a negative reward for its action. Such a process can damage the onboard components, which is neither sustainable nor economical for the researcher.
To help improve this situation, I am researching RL algorithms that guarantee safety for the agent (i.e., Safe Reinforcement Learning). I work as a Machine Learning Researcher in the Resilient Cyber-Physical Systems (RCPS) lab under Yasser Shoukry. Since the summer of 2020, I have been studying RL algorithms with an emphasis on policy gradient methods with added constraints. My responsibility is to apply policy gradient algorithms (e.g., Proximal Policy Optimization) to prove safety guarantees for the agent during the training and execution phases. The main tools I use to facilitate the research are OpenAI's Gym and Safety Gym simulators.
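Since my research centers on Proximal Policy Optimization, here is a minimal numeric sketch of PPO's clipped surrogate objective, the core of that algorithm. The function name and the sample numbers are my own illustration, not from any library:

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate objective (Schulman et al., 2017).

    ratio:     pi_new(a|s) / pi_old(a|s), the probability ratio
    advantage: estimated advantage A(s, a)
    eps:       clipping range, PPO's main hyperparameter
    """
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1 + eps), 1 - eps) * advantage
    # PPO maximizes the pessimistic (smaller) of the two terms, which
    # removes the incentive to push the ratio outside [1-eps, 1+eps].
    return min(unclipped, clipped)

print(ppo_clip_objective(1.5, 1.0))   # ratio clipped to 1.2 -> 1.2
print(ppo_clip_objective(0.5, -1.0))  # clipped term -0.8 is smaller -> -0.8
```

In practice this objective is averaged over a batch of sampled state-action pairs and maximized with a stochastic gradient method such as Adam.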
Aside from my research goal, I will be sharing resources and paper notes that could help students, hobbyists, and researchers develop a better understanding of Safe Reinforcement Learning.
Lil'Log is a great resource for understanding RL. The first article helped me grasp the key concepts and terminology in RL, along with the common approaches to solving RL problems. The second article summarizes the different policy gradient methods in RL, which are essential for my research.
Another great resource is Spinning Up in Deep RL, developed by OpenAI. It provides an introduction to RL as well as other valuable resources, such as key papers in the field. OpenAI was nice enough to include extra material on getting the right background (e.g., math, deep learning, terminology) and on learning by doing.
To understand how simulators work in general, Tucker McClure provides a great explanation of simulators and how we can develop quality simulations.
Other Resources
Books
Sources on Policy Gradient methods
Deep Learning
Concepts on Loss Functions and Entropy
Online Journal and Research Databases
- arxiv sanity preserver [Github] - Developed by Andrej Karpathy. A web interface that allows researchers to keep track of relevant papers and store papers in their own personal library.
Before installing CUDA and other packages, we need to install development tools, image and video I/O libraries, GUI packages, optimization libraries, and other packages.
First, update your system:
$ sudo apt-get update
$ sudo apt-get upgrade
Then, install the tools, libraries, and other packages:
$ sudo apt-get install build-essential cmake unzip pkg-config
$ sudo apt-get install libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
$ sudo apt-get install libjpeg-dev libpng-dev libtiff-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libxvidcore-dev libx264-dev
$ sudo apt-get install libgtk-3-dev
$ sudo apt-get install libopenblas-dev libatlas-base-dev liblapack-dev gfortran
$ sudo apt-get install libhdf5-serial-dev
$ sudo apt-get install python3-dev python3-tk python-imaging-tk
I've come across many tutorials for installing CUDA and cuDNN on Ubuntu 18.04. The tutorial I followed for installing CUDA 10.1 is from TensorFlow, and their procedure for installing CUDA, cuDNN, and TensorRT is great. Here is the link: https://www.tensorflow.org/install/gpu.
For Ubuntu 20.04, I simply installed the CUDA Toolkit through NVIDIA's documentation: CUDA Toolkit 11.4 Download. I installed the CUDA Toolkit through deb (local). Make sure to also install the cuDNN runtime and developer libraries. You can download them here (into your Downloads folder): https://developer.nvidia.com/rdp/cudnn-download.
$ cd <path/to/Downloads>
$ sudo dpkg -i libcudnn8_x.x.x-1+cudax.x_amd64.deb  # Installs the runtime library. x.x.x is the version
$ sudo dpkg -i libcudnn8-dev_8.x.x.x-1+cudax.x_amd64.deb  # Installs the developer library. x.x.x is the version
After installing CUDA and cuDNN, append the following lines to your ~/.bashrc file:
export PATH=/usr/local/cuda-<VERSION>/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-<VERSION>/lib64
Replace <VERSION> with the CUDA version you installed (for me, 11.0). TensorFlow has a nice piece of code to check whether it is using the GPU. First, reboot your system, then run the code below in a Python interpreter:
import tensorflow as tf
tf.config.list_physical_devices('GPU')
Python virtual environments are great for creating an isolated environment for Python projects. If you have two separate Python projects that require the same Python library, but different versions, then creating an isolated environment would solve this issue without having to change the version in the system.
First, install pip:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
Install virtualenv and virtualenvwrapper:
$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/get-pip.py ~/.cache/pip
Copy and paste the following lines into your ~/.bashrc file. You can open the file with your favorite text editor; here, I use vim:
$ sudo vim ~/.bashrc
#virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh
Then, source your ~/.bashrc:
$ source ~/.bashrc
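To make the isolation idea concrete, here is a small sketch using Python's standard-library venv module, which implements the same concept that virtualenvwrapper wraps with convenience commands (the directory name is purely illustrative):

```python
import tempfile
import venv
from pathlib import Path

# Create an isolated environment in a throwaway directory (path is illustrative).
env_dir = Path(tempfile.mkdtemp()) / "demo-env"
venv.EnvBuilder(with_pip=False).create(env_dir)

# The environment gets its own interpreter and site-packages, independent of
# the system Python (on Linux; Windows uses Scripts/ instead of bin/).
env_python = env_dir / "bin" / "python"
print(env_python.exists())  # -> True on Linux
```

Packages installed with that interpreter's pip land inside demo-env only, so two projects can pin different versions of the same library without touching the system Python.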
Gym and Safety Gym both run on the MuJoCo physics engine, which helps facilitate research and development in robotics and other areas that require researchers to scale up computationally intensive techniques. MuJoCo is free (woot woot). OpenAI recently passed Gym on to a new maintainer, who is planning to substitute PyBullet for MuJoCo. See this post for more information.
Website: MuJoCo
Source: mujoco-py
$ git clone https://github.com/openai/mujoco-py.git
$ cd mujoco-py
$ python3 setup.py install
Before importing mujoco_py, make sure to paste the line below into your ~/.bashrc file:
export LD_LIBRARY_PATH=$HOME/.mujoco/mujoco200/bin
Source: Safety Gym
$ git clone https://github.com/openai/safety-gym.git
$ cd safety-gym
$ pip install -e .
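Gym and Safety Gym environments expose the same reset/step interface. As a sketch of that interaction loop without requiring either package, here is a toy stand-in environment (a 1-D random walk; the class and its dynamics are purely illustrative, not the real API objects):

```python
import random

class ToyEnv:
    """Toy stand-in for a Gym-style environment: a 1-D random walk."""

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 or +1; the episode ends when |state| reaches 5.
        self.state += action
        reward = -abs(self.state)       # penalize distance from the origin
        done = abs(self.state) >= 5
        return self.state, reward, done, {}

env = ToyEnv()
obs = env.reset()
total_reward = 0.0
done = False
while not done:
    action = random.choice([-1, 1])     # a random policy, for illustration
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(done)  # -> True
```

The real environments follow the same loop: an agent calls reset once, then alternates choosing an action and calling step until done is returned; Safety Gym additionally reports constraint-violation costs in the info dictionary.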
Website: CoppeliaSim. To install CoppeliaSim, download the software from this source and extract the files to your desired folder. Then go to the folder containing CoppeliaSim and run the .sh file. In Linux:
$ cd <path/to/CoppeliaSim_Edu_V4_2_0_Ubuntu20_04>
$ ./coppeliaSim.sh
To install robosuite, I followed the documentation from the research group's docs. It is best to install from source: https://robosuite.ai/docs/installation.html
When you run a demo or project with robosuite, you may get the following error:
ERROR: GLEW initalization error: Missing GL version
Before running any project or demos with robosuite, add the export line below to your ~/.bashrc file:
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
2021-02
- Proximal Policy Optimization Algorithms [Notes]
- Adam: A Method for Stochastic Optimization
- Revisiting Design Choices in Proximal Policy Optimization
- Policy Gradient Methods for Reinforcement Learning with Function Approximation
- Benchmarking Deep Reinforcement Learning for Continuous Control
- High-Dimensional Continuous Control Using Generalized Advantage Estimation
- Trust Region Policy Optimization
- Safe Reinforcement Learning via Shielding
- ART: Abstraction Refinement-Guided Training for Provably Correct Neural Networks
- Uncertainty-Aware Reinforcement Learning for Collision Avoidance
- Safety-Guided Deep Reinforcement Learning via Online Gaussian Process Estimation
- Reinforcement Learning in Robotics: A Survey