
Stable-Baselines3 v2.0.0: Gymnasium Support

@araffin released this 23 Jun 13:00
472ff8e

Warning

Stable-Baselines3 (SB3) v2.0 will be the last release supporting Python 3.7 (end of life in June 2023).
We highly recommend upgrading to Python >= 3.8.
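
As a quick way to act on the warning above, the interpreter version can be checked before upgrading. This is a minimal sketch using only the standard library; the helper name `meets_recommended_python` is hypothetical and not part of SB3.

```python
import sys

# Minimum version recommended by the warning above (Python >= 3.8).
RECOMMENDED_MIN = (3, 8)


def meets_recommended_python(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is at least 3.8."""
    return tuple(version_info[:2]) >= RECOMMENDED_MIN


if __name__ == "__main__":
    if not meets_recommended_python():
        print("Python < 3.8 detected: consider upgrading before moving past SB3 v2.0.")
```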

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

To upgrade:

pip install stable_baselines3 sb3_contrib rl_zoo3 --upgrade

or simply (rl zoo depends on SB3 and SB3 contrib):

pip install rl_zoo3 --upgrade

Breaking Changes:

  • Switched to Gymnasium as primary backend; Gym 0.21 and 0.26 are still supported via the shimmy package (@carlosluis, @arjun-kg, @tlpss)
  • The deprecated online_sampling argument of HerReplayBuffer was removed
  • Removed deprecated stack_observation_space method of StackedObservations
  • Renamed environment output observations in evaluate_policy to prevent shadowing the input observations during callbacks (@npit)
  • Upgraded wrappers and custom environment to Gymnasium
  • Refined the HumanOutputFormat file check: now it verifies if the object is an instance of io.TextIOBase instead of only checking for the presence of a write method.
  • Because of the new Gym API (0.26+), the random seed passed to vec_env.seed(seed=seed) will only be effective after the env.reset() call.
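
To illustrate the refined HumanOutputFormat file check listed above, the sketch below contrasts a check for a write method with an isinstance check against io.TextIOBase. The two helper functions and the `WriteOnly` class are hypothetical stand-ins written for this example; they are not SB3's actual implementation.

```python
import io


class WriteOnly:
    """Hypothetical object that has write() but is not a text stream."""

    def write(self, text):
        return len(text)


def has_write_method(obj):
    # Looser check: only looks for the presence of a write attribute.
    return hasattr(obj, "write")


def is_text_stream(obj):
    # Stricter check: the object must be an instance of io.TextIOBase.
    return isinstance(obj, io.TextIOBase)


text_stream = io.StringIO()    # text stream: passes both checks
binary_stream = io.BytesIO()   # binary stream: has write(), but is not TextIOBase
duck = WriteOnly()             # duck-typed object: has write() only
```

Under the stricter check, objects that merely expose write() (including binary streams) are no longer treated as text files, so code relying on duck-typed file objects may need to pass a real text stream instead.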

New Features:

  • Added Gymnasium support (Gym 0.21 and 0.26 are supported via the shimmy package)
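
For readers coming from the older Gym API, the sketch below shows the shape of the Gymnasium-style interface: reset() accepts a seed and returns an (observation, info) pair, and step() returns five values with separate terminated and truncated flags. `CountingEnv` is a hypothetical toy class that only mimics these signatures; it deliberately does not subclass gymnasium.Env so that it runs with the standard library alone.

```python
import random


class CountingEnv:
    """Hypothetical toy env that mimics the Gymnasium method signatures."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps
        self._rng = random.Random()
        self._steps = 0

    def reset(self, seed=None, options=None):
        # Seeding goes through reset(), so a seed only takes effect on reset.
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        obs = self._rng.random()
        info = {}
        return obs, info

    def step(self, action):
        self._steps += 1
        obs = self._rng.random()
        reward = float(action)
        terminated = False                         # task-defined episode end (never, in this toy)
        truncated = self._steps >= self.max_steps  # time limit reached
        info = {}
        return obs, reward, terminated, truncated, info


env = CountingEnv()
obs, info = env.reset(seed=0)
done = False
n_steps = 0
while not done:
    obs, reward, terminated, truncated, info = env.step(1)
    done = terminated or truncated
    n_steps += 1
```

Note that SB3's VecEnv keeps its own API, which differs from the single-environment interface sketched here.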

SB3-Contrib

  • Fixed QRDQN update interval for multi envs

RL Zoo

  • Gym 0.26+ patches to continue working with pybullet and TimeLimit wrapper
  • Renamed CarRacing-v1 to CarRacing-v2 in hyperparameters
  • Huggingface push to hub now accepts a --n-timesteps argument to adjust the length of the video
  • Fixed record_video steps (before it was stepping in a closed env)
  • Dropped Gym 0.21 support

Bug Fixes:

  • Fixed VecExtractDictObs not handling terminal observation (@WeberSamuel)
  • Set NumPy version to >=1.20 due to use of numpy.typing (@troiganto)
  • Fixed loading a DQN model changing target_update_interval (@tobirohrer)
  • Fixed env checker to properly reset the env before calling step() when checking
    for Inf and NaN (@lutogniew)
  • Fixed HER truncate_last_trajectory() (@lbergmann1)
  • Fixed HER desired and achieved goal order in reward computation (@JonathanKuelz)
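
The goal-order fix above matters whenever the reward is not symmetric in its two goal arguments. The sketch below assumes the GoalEnv-style convention compute_reward(achieved_goal, desired_goal, info); the reward function itself is a hypothetical example chosen to be asymmetric, and it is not taken from SB3 or any particular environment.

```python
def compute_reward(achieved_goal, desired_goal, info=None):
    """Hypothetical asymmetric reward: only falling short of the goal is penalized."""
    shortfall = max(0.0, desired_goal - achieved_goal)
    return 0.0 - shortfall


achieved, desired = 1.0, 3.0

# Correct order: the agent is 2.0 short of the goal and is penalized.
correct = compute_reward(achieved, desired)

# Swapped order: the same transition wrongly looks like the goal was exceeded.
swapped = compute_reward(desired, achieved)
```

With a symmetric reward such as a negative distance, swapping the arguments would go unnoticed, which is why this kind of mix-up tends to show up only with asymmetric reward functions.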

Others:

  • Fixed stable_baselines3/a2c/*.py type hints
  • Fixed stable_baselines3/ppo/*.py type hints
  • Fixed stable_baselines3/sac/*.py type hints
  • Fixed stable_baselines3/td3/*.py type hints
  • Fixed stable_baselines3/common/base_class.py type hints
  • Fixed stable_baselines3/common/logger.py type hints
  • Fixed stable_baselines3/common/envs/*.py type hints
  • Fixed stable_baselines3/common/vec_env/vec_monitor|vec_extract_dict_obs|util.py type hints
  • Fixed stable_baselines3/common/vec_env/base_vec_env.py type hints
  • Fixed stable_baselines3/common/vec_env/vec_frame_stack.py type hints
  • Fixed stable_baselines3/common/vec_env/dummy_vec_env.py type hints
  • Fixed stable_baselines3/common/vec_env/subproc_vec_env.py type hints
  • Upgraded docker images to use mamba/micromamba and CUDA 11.7
  • Updated env checker to reflect what subset of Gymnasium is supported and improved GoalEnv checks
  • Improved type annotation of wrappers
  • Test envs are now checked too
  • Added render test for VecEnv and VecEnvWrapper
  • Updated issue templates and env info saved with the model
  • Changed seed() method return type from List to Sequence
  • Updated env checker doc and requirements for tuple spaces/goal envs

Documentation:

  • Added Deep RL Course link to the Deep RL Resources page
  • Added documentation about VecEnv API vs Gym API
  • Upgraded tutorials to Gymnasium API
  • Made it more explicit when to use a VecEnv vs a Gym env
  • Added UAV_Navigation_DRL_AirSim to the project page (@heleidsn)
  • Added EvalCallback example (@sidney-tio)
  • Updated custom env documentation
  • Added pink-noise-rl to projects page
  • Fixed custom policy example, ortho_init was ignored
  • Added SBX page

Full Changelog: v1.8.0...v2.0.0