Safe Reinforcement Learning -> Constraint algorithms to train agents in Safety Gym, notes on research papers regarding RL with constraints, optimizers, and neural networks, and PyTorch implementations of policy gradient algorithms


Nathan-Bernardo/RCPS-Safety-Guidance


Introduction

Reinforcement learning (RL) algorithms have been used to discover policies that maximize reward. They have been heavily used in robotics as a way to mimic the human brain, such that the agent can learn to find the best possible set of actions based on the state it is in. However, current reinforcement learning algorithms do not guarantee safety during the learning or execution phases. For example, training an autonomous drone to avoid obstacles could involve physical collisions between the drone and an object, so that the drone receives a negative reward for its action. Such a process could damage the onboard components, which is neither sustainable nor economical for the researcher.

To help improve this situation, I am researching RL algorithms that guarantee safety for the agent (i.e., Safe Reinforcement Learning). I am working as a Machine Learning Researcher in the Resilient Cyber-Physical Systems (RCPS) lab under Yasser Shoukry. Since the summer of 2020, I have been studying RL algorithms with an emphasis on policy gradient methods with added constraints. My responsibility is to apply policy gradient algorithms (e.g., Proximal Policy Optimization) to prove safety guarantees for the agent during the training and execution phases. The main tools I am using to facilitate the research are OpenAI's Gym and Safety Gym simulators.
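Since the research centers on policy gradient methods such as PPO, here is a minimal, illustrative sketch (plain Python, not this repository's code) of PPO's clipped surrogate objective for a single state-action sample; the ratio and advantage values are made up for demonstration:

```python
# Sketch of PPO's clipped surrogate objective for one (state, action) sample.
# ratio = pi_new(a|s) / pi_old(a|s); advantage estimates how much better the
# action was than the baseline. Clipping caps the incentive to change the policy.

def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Return min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)

# A large policy change (ratio = 1.5) with positive advantage is clipped,
# so the objective is capped at (1 + eps) * A = 1.2 * 2.0 = 2.4.
print(clipped_surrogate(1.5, 2.0))
# A ratio inside the clip range passes through unchanged.
print(clipped_surrogate(0.9, -1.0))
```

Constrained variants of PPO (e.g., PPO-Lagrangian, one of the Safety Gym baselines) add a cost term on top of this objective, which is where the safety constraints enter.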

Aside from my research goal, I will be sharing resources and paper notes that could help students, hobbyists, and researchers develop a better understanding of Safe Reinforcement Learning.

Resources

Lil'Log is a great resource for understanding RL. The first article helped me grasp the key concepts and terminology in RL, as well as the common approaches to solving RL problems. The second article summarizes the different policy gradient methods in RL, which are essential for my research.

Another great resource is Spinning Up in Deep RL, developed by OpenAI. It provides an introduction to RL as well as other valuable resources, such as key papers in RL. OpenAI was also kind enough to provide extra material on building the right background (e.g. math, deep learning, terminology) and on learning by doing.

To understand how simulators work in general, Tucker McClure provides a great explanation of simulators and how we can develop quality simulations.

Other Resources

• Books
• Sources on Policy Gradient methods
• Deep Learning
• Concepts on Loss Functions and Entropy
• Online Journal and Research Databases
  • arxiv sanity preserver [Github] - Developed by Andrej Karpathy. A web interface that allows researchers to keep track of relevant papers and store papers in their own personal library.

Getting Set Up

Installing Ubuntu dependencies

Before installing CUDA and other packages, we need to install development tools, image and video I/O libraries, GUI packages, optimization libraries, and a few other dependencies.

First, update your system:

$ sudo apt-get update 
$ sudo apt-get upgrade

Then, install the tools, libraries, and other packages:

$ sudo apt-get install build-essential cmake unzip pkg-config
$ sudo apt-get install libxmu-dev libxi-dev libglu1-mesa libglu1-mesa-dev
$ sudo apt-get install libjpeg-dev libpng-dev libtiff-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libxvidcore-dev libx264-dev
$ sudo apt-get install libgtk-3-dev
$ sudo apt-get install libopenblas-dev libatlas-base-dev liblapack-dev gfortran
$ sudo apt-get install libhdf5-serial-dev
$ sudo apt-get install python3-dev python3-tk python-imaging-tk

Installing CUDA

I've come across many tutorials for installing CUDA and cuDNN on Ubuntu 18.04. The tutorial I followed for installing CUDA 10.1 is from TensorFlow, and their procedure for installing CUDA, cuDNN, and TensorRT is great. Here is the link: https://www.tensorflow.org/install/gpu.

For Ubuntu 20.04, I simply installed the CUDA Toolkit through NVIDIA's documentation: CUDA Toolkit 11.4 Download. I installed the CUDA Toolkit through deb (local). Make sure to also install the cuDNN runtime and developer libraries. You can download them here (into your Downloads folder): https://developer.nvidia.com/rdp/cudnn-download.

$ cd <path/to/Downloads>
$ sudo dpkg -i libcudnn8_x.x.x-1+cudax.x_amd64.deb  # Installs the runtime library; x.x.x is the version
$ sudo dpkg -i libcudnn8-dev_8.x.x.x-1+cudax.x_amd64.deb  # Installs the developer library; x.x.x is the version

After installing CUDA and cuDNN, paste the following text into your ~/.bashrc file:

export PATH=/usr/local/cuda-<VERSION>/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-<VERSION>/lib64 

Replace <VERSION> with the CUDA version you installed. For me, it would be 11.0. TensorFlow has a nice piece of code to check whether it is using the GPU or not. First, reboot your system, then run the code below in a Python shell:

import tensorflow as tf
tf.config.list_physical_devices('GPU')

Installing Virtual Environment

Python virtual environments are great for creating isolated environments for Python projects. If two separate Python projects require the same Python library but different versions, an isolated environment solves the issue without having to change the version installed system-wide.

First, install pip:

$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py

Install virtualenv and virtualenvwrapper:

$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/get-pip.py ~/.cache/pip

Copy and paste the text below into your ~/.bashrc file. You can open ~/.bashrc with your favorite text editor; here, I use vim:

$ sudo vim ~/.bashrc

#virtualenv and virtualenvwrapper 
export WORKON_HOME=$HOME/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
source /usr/local/bin/virtualenvwrapper.sh

Then, source your ~/.bashrc:
$ source ~/.bashrc
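With virtualenvwrapper sourced, creating and using an isolated environment looks like the following; the environment name safety-rl is just an example:

```shell
$ mkvirtualenv -p python3 safety-rl   # create the environment under ~/.virtualenvs
$ workon safety-rl                    # activate it; the prompt gains a (safety-rl) prefix
$ pip install torch                   # packages now install into this environment only
$ deactivate                          # leave the environment
```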

Installing Safety Gym

Safety Gym runs under the MuJoCo physics engine, which helps facilitate research and development in robotics and other areas that require researchers to scale up computationally intensive techniques. MuJoCo is free (woot woot). OpenAI recently passed Gym on to a new maintainer, who is planning to replace MuJoCo with PyBullet. See this post for more information.

Website: MuJoCo
Source: mujoco-py

Installing mujoco_py

$ git clone https://github.com/openai/mujoco-py.git
$ cd mujoco-py
$ python3 setup.py install

Before importing mujoco_py, make sure to paste the text below into your ~/.bashrc file:
export LD_LIBRARY_PATH=$HOME/.mujoco/mujoco200/bin
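After sourcing ~/.bashrc, a quick sanity check is to load one of the example models and step the simulator. This is only a sketch: it assumes you are still inside the mujoco-py source checkout, which ships example XMLs such as xmls/claw.xml.

```python
# Sanity check: load an example model shipped with the mujoco-py repository
# and advance the simulation by one step.
import mujoco_py

model = mujoco_py.load_model_from_path("xmls/claw.xml")  # path relative to the mujoco-py checkout
sim = mujoco_py.MjSim(model)
sim.step()
print(sim.data.qpos)  # joint positions after one simulation step
```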

Installing Safety Gym

Source: Safety Gym

$ git clone https://github.com/openai/safety-gym.git
$ cd safety-gym
$ pip install -e .
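Once installed, you can verify Safety Gym by building one of its benchmark environments and taking a random step. This is a sketch assuming the standard benchmark ids of the form Safexp-{Robot}{Task}{Level}-v0; note that Safety Gym reports constraint violations through info['cost'] rather than folding them into the reward.

```python
# Sanity check: build a Safety Gym benchmark environment and take one random step.
import gym
import safety_gym  # importing registers the Safexp-* environments with gym

env = gym.make('Safexp-PointGoal1-v0')
obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
print(reward, info.get('cost'))  # cost > 0 signals a constraint violation this step
```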

Other Simulators

CoppeliaSim

Website: CoppeliaSim

To install CoppeliaSim, download the software from this source and extract the files to your desired folder. Then go to the folder containing CoppeliaSim and run the .sh file. On Linux:

$ cd <path/to/CoppeliaSim_Edu_V4_2_0_Ubuntu20_04>
$ ./coppeliaSim.sh

Installing robosuite from source

To install robosuite, I followed the documentation from the research group's docs; it is best to install from source: https://robosuite.ai/docs/installation.html

When you run a demo or project with robosuite, you may get the following error:
ERROR: GLEW initalization error: Missing GL version

Before running any project or demos with robosuite, add the line below to your ~/.bashrc file:
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so

Paper Notes

2021-03
2021-02
