Enhance SAC with Mixture-of-Expert and BEE Operator for Improved Stability and Performance #788

XuGW-Kevin · 2025-01-04T07:20:21Z

This PR introduces two plug-and-play enhancements to the Soft Actor-Critic (SAC) algorithm, inspired by recent advancements in reinforcement learning research.

The integration is based on insights from several papers:
https://arxiv.org/abs/2306.02865
https://arxiv.org/abs/2402.08609
https://arxiv.org/abs/2410.14972

To maintain robustness and practicality, I excluded experimental techniques (e.g., dormant ratio, perturbing network weights, and dynamic hyperparameter tuning) that have not yet stood the test of time. The retained methods are relatively well-validated and provide substantial improvements in both stability and performance.

Key Features:
(1) Mixture-of-Experts Network
(2) Blended Exploitation and Exploration (BEE) Operator
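
For readers unfamiliar with the two ideas, here is a minimal, framework-free sketch of each. It is not the code in this PR. The function names (`moe_forward`, `bee_target`), the dense softmax gating, and the default hyperparameters are all assumptions made for illustration. In particular, `q_exploit_next` is assumed to be supplied by some separate estimate of the best in-buffer action value at the next state; how that estimate is learned is outside this sketch.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, experts, gate_weights):
    """Dense (soft) mixture of experts, illustrative only.

    x            : input feature vector (list of floats)
    experts      : list of callables, each mapping x -> list of floats
    gate_weights : one weight vector per expert; gate logit = dot(w, x)
    Returns the gate-weighted sum of the expert outputs.
    """
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    outputs = [expert(x) for expert in experts]
    dim = len(outputs[0])
    return [sum(p * out[j] for p, out in zip(probs, outputs)) for j in range(dim)]


def bee_target(reward, done, q_exploit_next, q_policy_next, log_prob_next,
               gamma=0.99, alpha=0.2, lam=0.5):
    """Blended Q-target, illustrative only.

    q_exploit_next : assumed estimate of the best in-buffer action value at s'
    q_policy_next  : Q(s', a') with a' sampled from the current policy
    log_prob_next  : log pi(a' | s') for that sampled action
    lam            : blend weight; lam = 0 recovers the usual SAC soft target
    """
    exploit = q_exploit_next
    explore = q_policy_next - alpha * log_prob_next  # entropy-regularized term
    blended = lam * exploit + (1.0 - lam) * explore
    return reward + gamma * (1.0 - done) * blended
```

Setting `lam=0.0` in `bee_target` reduces it to the standard SAC target, which makes it easy to compare against the unmodified baseline.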

Together, these enhancements aim to provide a stronger SAC-based RL baseline.
Feedback and suggestions are welcome!

StoneT2000 · 2025-01-05T05:36:31Z

nice work! I'll review this next week probably (busy with icml/rss)

XuGW-Kevin · 2025-01-05T07:12:07Z

Thank you so much! Good luck with ICML/RSS!

sac_moe

8f86399
