GitHub - bnurbekov/DeepRL-A3C-LSTM: Work in progress (unfinished) implementation of A3C

Dependencies

ViZDoom: https://github.com/Marqt/ViZDoom.

Environment

I was not able to make ppaquette_gym_doom work on my machine. In the interest of time, I therefore implemented a lightweight Gym-like interface for ViZDoom, which differs slightly from OpenAI Gym's interface.
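As an illustration of what a lightweight Gym-like wrapper around ViZDoom can look like, here is a minimal sketch. It is not the code in environment.py: the class name `DoomEnv`, the `actions` and `frame_skip` parameters, and the three-value return of `step` (no `info` dictionary) are assumptions made for this example. The wrapper only relies on the `DoomGame` methods `new_episode`, `get_state`, `make_action`, `is_episode_finished`, and `close`.

```python
class DoomEnv:
    """Hypothetical Gym-like wrapper around an initialised ViZDoom game."""

    def __init__(self, game, actions, frame_skip=4):
        self.game = game              # an initialised vizdoom.DoomGame (or a stand-in)
        self.actions = actions        # list of button vectors, e.g. [[1, 0, 0], [0, 1, 0]]
        self.frame_skip = frame_skip  # number of tics each action is repeated for

    def reset(self):
        """Start a new episode and return the first observation."""
        self.game.new_episode()
        return self.game.get_state().screen_buffer

    def step(self, action_index):
        """Apply one discrete action; return (observation, reward, done)."""
        reward = self.game.make_action(self.actions[action_index], self.frame_skip)
        done = self.game.is_episode_finished()
        # Once the episode has finished there is no valid state to read.
        observation = None if done else self.game.get_state().screen_buffer
        return observation, reward, done

    def close(self):
        self.game.close()
```

Because the game object is passed in rather than created inside the class, the same wrapper can be exercised with a stub object when ViZDoom is not installed.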

Model

A2C (Advantage Actor-Critic) + LSTM.

Main point of reference: https://arxiv.org/pdf/1609.05521v1.pdf.
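For readers unfamiliar with advantage actor-critic, the core of the update can be sketched in a few lines. This is a generic, framework-free illustration and not the code in a3c.py: the function names, the default `gamma`, and the use of plain Python lists in place of tensors are assumptions for the example, and the LSTM is left out entirely. In a real implementation the advantages would be treated as constants in the policy term and the two losses would be combined, often together with an entropy bonus.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """n-step returns R_t = r_t + gamma * R_{t+1}, seeded with V(s_T)."""
    returns = []
    running = bootstrap_value  # use 0.0 if the rollout ended in a terminal state
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_losses(log_probs, values, returns):
    """Policy and value losses averaged over one rollout.

    log_probs: log pi(a_t | s_t) of the actions that were taken
    values:    critic estimates V(s_t)
    returns:   targets from discounted_returns
    """
    advantages = [ret - val for ret, val in zip(returns, values)]
    n = len(advantages)
    # Actor: raise the log-probability of actions with positive advantage.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Critic: regress V(s_t) towards the n-step return.
    value_loss = sum(adv * adv for adv in advantages) / n
    return policy_loss, value_loss
```

For example, `discounted_returns([1.0, 0.0, 1.0], 0.0, gamma=0.5)` gives `[1.25, 0.5, 1.0]`.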

Further improvements

Implement unsupervised auxiliary tasks to mitigate reward sparsity in the environment (https://arxiv.org/abs/1611.05397).
Apply generalized advantage estimation.
Try out TRPO and natural gradient techniques.
Use learning from demonstration for pre-training.
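
Of the items above, generalized advantage estimation is the most self-contained, so a minimal sketch may help show what it would involve. It is not part of this repository: the function name `gae`, its parameters, and the defaults for `gamma` and `lam` are assumptions for illustration. It computes the TD residual `delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)` and accumulates it backwards with weight `gamma * lam`, assuming the rollout does not cross an episode boundary.

```python
def gae(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for a single rollout.

    rewards:         r_0 .. r_{T-1}
    values:          V(s_0) .. V(s_{T-1})
    bootstrap_value: V(s_T), or 0.0 if s_T is terminal
    """
    advantages = [0.0] * len(rewards)
    next_value = bootstrap_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD residual at time t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of the current and later residuals.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

With `lam=0` this reduces to the one-step TD residual, and with `lam=1` it equals the discounted return minus the value estimate, so `lam` trades bias against variance between those two extremes.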
