This repository contains an implementation of a Proximal Policy Optimization (PPO) agent for the LunarLander-v2 environment from OpenAI Gym, built with the Stable Baselines3 library.
Proximal Policy Optimization (PPO) is a policy-gradient reinforcement learning algorithm that keeps training stable by clipping how far each policy update can move from the previous policy. In this project, we apply PPO to the LunarLander-v2 environment provided by OpenAI Gym, where the goal is to land a lunar lander safely on the landing pad while minimizing fuel consumption.
This repository includes the code for training the PPO agent with Stable Baselines3 so that it learns to control the lander's engines and touch down safely on the lunar surface.
The LunarLander-v2 environment is part of the OpenAI Gym library. It simulates the landing of a lunar module on the moon's surface: each observation is an 8-dimensional state vector (position, velocity, angle, angular velocity, and two leg-contact flags), and the agent chooses among four discrete actions that fire the module's engines. The agent must learn a policy that navigates the module to a safe landing.
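As a quick sanity check, a minimal sketch like the one below creates the environment and inspects its observation and action spaces. It assumes the classic Gym API (where `step` returns a 4-tuple) and that Gym's Box2D extra is installed (e.g. `pip install gym[box2d]`); it is not part of the repository's training code:

```python
import gym

# Create the environment (LunarLander-v2 needs Gym's Box2D extra installed)
env = gym.make("LunarLander-v2")

print(env.observation_space)  # Box(8,): x, y, vx, vy, angle, angular velocity, leg contacts
print(env.action_space)       # Discrete(4): no-op, fire left, fire main, fire right engine

# Run one episode with random actions, just to exercise the environment
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)  # classic Gym 4-tuple API
env.close()
```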
To run the code in this repository, you need to have the following dependencies installed:
- Python (>=3.6)
- Gym
- Stable Baselines3
- NumPy
- Any other dependencies specified in the code
You can install the required Python packages using pip:
```bash
pip install gym stable-baselines3 numpy
```
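Once the dependencies are installed, a minimal training loop with Stable Baselines3 looks roughly like the sketch below. It uses the library's default PPO hyperparameters; the timestep budget and the saved-model file name are illustrative placeholders rather than values taken from this repository:

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Create the environment and a PPO agent with a simple feed-forward policy network
env = gym.make("LunarLander-v2")
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent (timestep budget is illustrative; adjust as needed)
model.learn(total_timesteps=200_000)

# Evaluate the trained policy over a few episodes
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"Mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")

# Save the model for later reuse (file name is a placeholder)
model.save("ppo_lunarlander")
env.close()
```

LunarLander-v2 is conventionally considered solved once the average episode reward reaches about 200.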