Hi all,
I just followed the example from the documentation website (https://skrl.readthedocs.io/en/develop/_downloads/118e328d79ea6d72a3c818e23e77e7ee/torch_gymnasium_pendulum_ppo.py).
I ran that file (torch_gymnasium_pendulum_ppo.py) for 100,000 epochs and then wrote a script for testing. The top of the testing script is the same as the training script (a sketch of that shared top part is included after the script for reference); the bottom part of the testing script is shown below.
agent = PPO(models=models,
            memory=memory,
            cfg=cfg,
            observation_space=env.observation_space,
            action_space=env.action_space,
            device=device)

# Load the checkpoint
agent.load("./runs/torch/Pendulum/25-01-24_17-11-38-827106_PPO/checkpoints/best_agent.pt")
model = models["policy"].to(device=device)

# Create the environment
env = gym.make('Pendulum-v1', render_mode='human')
observations, info = env.reset()

# Simulation loop
num_steps = 100000  # Total number of steps for the simulation
for step in range(num_steps):
    # Prepare the input for the model
    inputs = {"states": torch.from_numpy(observations).to(device=device)}
    # Get actions from the model
    actions = model(inputs)[0].cpu().detach().numpy()
    # Gymnasium's step() returns (obs, reward, terminated, truncated, info)
    observations, rewards, terminated, truncated, info = env.step(actions)
    print("Rewards:", rewards)
    # If the episode is over, reset the environment
    if terminated or truncated:
        observations, info = env.reset()

# Close the environment when done
env.close()
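In case it is useful, here is roughly what the "top" part shared with the training script looks like. This is only a sketch based on the linked skrl Pendulum PPO example: the network sizes, memory size, and PPO configuration values below are placeholders, and the real ones are whatever that example file defines.

import gymnasium as gym
import torch
import torch.nn as nn

from skrl.agents.torch.ppo import PPO, PPO_DEFAULT_CONFIG
from skrl.envs.wrappers.torch import wrap_env
from skrl.memories.torch import RandomMemory
from skrl.models.torch import DeterministicMixin, GaussianMixin, Model


# Policy: stochastic Gaussian model; Value: deterministic model
class Policy(GaussianMixin, Model):
    def __init__(self, observation_space, action_space, device,
                 clip_actions=False, clip_log_std=True, min_log_std=-20, max_log_std=2):
        Model.__init__(self, observation_space, action_space, device)
        GaussianMixin.__init__(self, clip_actions, clip_log_std, min_log_std, max_log_std)

        self.net = nn.Sequential(nn.Linear(self.num_observations, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, self.num_actions))
        self.log_std_parameter = nn.Parameter(torch.zeros(self.num_actions))

    def compute(self, inputs, role):
        # Scale the mean to Pendulum-v1's action range [-2, 2]
        return 2 * torch.tanh(self.net(inputs["states"])), self.log_std_parameter, {}


class Value(DeterministicMixin, Model):
    def __init__(self, observation_space, action_space, device, clip_actions=False):
        Model.__init__(self, observation_space, action_space, device)
        DeterministicMixin.__init__(self, clip_actions)

        self.net = nn.Sequential(nn.Linear(self.num_observations, 64), nn.ReLU(),
                                 nn.Linear(64, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def compute(self, inputs, role):
        return self.net(inputs["states"]), {}


# Environment, device, models, memory and PPO configuration
env = wrap_env(gym.make("Pendulum-v1"))
device = env.device

models = {"policy": Policy(env.observation_space, env.action_space, device),
          "value": Value(env.observation_space, env.action_space, device)}

memory = RandomMemory(memory_size=1024, num_envs=env.num_envs, device=device)

cfg = PPO_DEFAULT_CONFIG.copy()
cfg["rollouts"] = 1024  # placeholder; the real values are in the linked example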
=============================================
The result shows that the reward never reaches 0 as I anticipated (in Pendulum-v1 the per-step reward is always at most 0, with 0 corresponding to the pendulum being upright and motionless with no torque applied). After 100,000 epochs of training, can you tell me what is happening here, and why the agent never reaches the goal?
Thank you,