Implementation of TD3 and SAC algorithms #10

MickyasTA · 2024-07-16T20:04:36Z

Hello,

I have implemented the TD3 and SAC algorithms based on your simulator. You can find them in the repository, alongside your PPO implementation. It would be great if you could review the code and correct any mistakes you find. Both algorithms currently run, but since I introduced a replay buffer, the simulation has become slow. To mitigate this, I reduced the number of environments to 50 and set the replay buffer size to 1e4.
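For readers unfamiliar with the component under discussion, here is a minimal sketch of a fixed-capacity replay buffer that stores one step from every parallel environment per call and returns samples as a five-field tuple that unpacks in a fixed order. It is not the code from this pull request: `Transition`, `ReplayBuffer`, `add_batch`, and `sample` are hypothetical names, and pure-Python lists are used only to keep the example self-contained. An implementation for a batched simulator would more likely keep preallocated tensors.

```python
import random
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One environment step (hypothetical field layout)."""
    obs: Sequence[float]
    action: Sequence[float]
    reward: float
    next_obs: Sequence[float]
    done: bool


class ReplayBuffer:
    """Fixed-capacity ring buffer; the oldest transitions are overwritten first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self.capacity = capacity
        self.storage: List[Transition] = []
        self.pos = 0  # index of the next slot to write
        self.rng = random.Random(seed)

    def add_batch(self, obs, actions, rewards, next_obs, dones) -> None:
        # Each argument holds one entry per parallel environment.
        for row in zip(obs, actions, rewards, next_obs, dones):
            item = Transition(*row)
            if len(self.storage) < self.capacity:
                self.storage.append(item)
            else:
                self.storage[self.pos] = item
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size: int) -> Transition:
        batch = self.rng.sample(self.storage, batch_size)
        # zip(*batch) transposes rows into five columns, always in field order.
        return Transition(*(list(col) for col in zip(*batch)))

    def __len__(self) -> int:
        return len(self.storage)
```

With this layout, `obs, action, reward, next_obs, done = buffer.sample(batch_size)` always unpacks into exactly five values, which is one way to avoid mismatches between what the buffer stores and what the update step expects.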

It would be greatly appreciated if you could correct any bugs and help build the repository.

Thank you!

MickyasTA · 2024-07-30T09:06:13Z

Hello, what is the progress on my request? I see that you have modified the repository a bit.

mihirk284 · 2024-07-31T11:49:29Z

Hello,

Thank you for opening this pull request with the new algorithms. I missed this PR before I updated the codebase and upgraded the simulator. Perhaps I can try in the coming weeks to check the contribution and see whether it can be made compatible with the updated version.

On checking the files, I see that some files from your local installation have been included, such as the VS Code config file and the PDFs. These may not be immediately relevant to the repository and can be removed. Similarly, the configuration options that enable wandb logging by default may not be needed by end users. Could you please push changes to address these?
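As an illustration of the wandb point, one possible way to make logging opt-in is a command-line flag that defaults to off. This is only a sketch under assumptions: the flag names `--use-wandb` and `--wandb-project`, the default project name, and the helper `maybe_init_wandb` are hypothetical and do not come from the repository. Only `wandb.init(project=..., config=...)` is an actual wandb call.

```python
import argparse
from typing import Any, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Training options (illustrative)")
    # Hypothetical flag: logging stays disabled unless the user asks for it.
    parser.add_argument("--use-wandb", action="store_true",
                        help="enable Weights & Biases logging")
    # Hypothetical default project name.
    parser.add_argument("--wandb-project", default="rl-training",
                        help="wandb project to log to")
    return parser


def maybe_init_wandb(args: argparse.Namespace) -> Optional[Any]:
    """Start a wandb run only when requested; otherwise return None."""
    if not args.use_wandb:
        return None
    import wandb  # imported lazily so wandb is only required when enabled
    return wandb.init(project=args.wandb_project, config=vars(args))
```

With this shape, end users who run the training script without extra flags never need wandb installed, while contributors can still pass `--use-wandb` to record runs.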

I can check the code related to the new RL algorithms and get back to you in some time.

MickyasTA · 2024-08-09T16:17:00Z

Hello,

I wanted to inform you that I have made the necessary modifications based on your feedback and have pushed the updated version. Please take a moment to review it at your convenience to ensure it aligns with your expectations.

With kind regards,
Mickyas Tamiru Asfaw

MickyasTA added 5 commits May 28, 2024 02:14

FIRST COMMIT Correcting Bugs on the gymisac import

5cd1a49

Implementation of TD3 and SAC algorithms

036965e

This is the revised implementation of the SAC algorithm

641a9c8

It is good at this stage, but the replay buffer has an unpacking problem that will be solved in the next commit.

Now the SAC algorithm is running well and it is working

b9557cf

The TD3 implementation is done as a first version, but it is so slow in training the environment.

0bf666f

Small modifications to be compatible with the pull request.

105b6bf
