Expert vs Ours? #2

Open
1826949 opened this issue Feb 28, 2021 · 2 comments

Comments

@1826949

1826949 commented Feb 28, 2021

Hi Vincent,

In tensorboard, under the "Images" tab, we are able to visualize our results on the Ant or the Humanoid. We see a split screen (e.g. 2 ants), one on the left and one on the right.

How do we know which is the expert and which is ours (DAgger)?

(PS: Where were you able to find this info? I tried googling for hours and could not find it.)

Thanks!

@vincentkslim
Owner

It's been a while since I've looked at it, but I don't think the videos show the expert; they only show the policy trained through DAgger. I think one is a training rollout and the other is an evaluation rollout. You can see what they log in the code here: one is labeled train_rollouts and the other is labeled eval_rollouts. Both are necessary because an imitation learning agent that reaches a state not covered by the expert training data is much more likely to fail.
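
To make the train/eval distinction concrete, here is a minimal toy sketch of a DAgger loop. It is not the course code: the 1-D environment, the table policy, and the names `expert_action`, `rollout`, and `dagger` are all assumptions made up for illustration; only the tags train_rollouts and eval_rollouts come from the thread. The point it tries to show is that both logged rollouts are produced by the learned policy (the expert only supplies labels), and that the agent gets stuck in states the expert data never covered until DAgger relabels them.

```python
def expert_action(state):
    # Hypothetical expert: always step one unit toward the origin.
    return (state < 0) - (state > 0)


def rollout(policy, start, horizon=8):
    """Run the learned policy (a dict: state -> action); return visited states."""
    state, visited = start, []
    for _ in range(horizon):
        visited.append(state)
        # Toy failure mode: in a state with no training label the agent freezes.
        state += policy.get(state, 0)
    return visited


def dagger(train_starts, eval_starts, n_iters):
    # Iteration 0 data: a single expert demonstration starting at state 3.
    dataset = {s: expert_action(s) for s in (3, 2, 1, 0)}
    logs = []
    for _ in range(n_iters):
        policy = dict(dataset)  # "training" is just memorizing the labels
        # Training rollouts: the LEARNED policy acts, the expert only relabels.
        train = [rollout(policy, s) for s in train_starts]
        for visited in train:
            for s in visited:
                dataset[s] = expert_action(s)
        # Evaluation rollouts: same policy, used only to measure performance.
        evals = [rollout(policy, s) for s in eval_starts]
        logs.append({"train_rollouts": train, "eval_rollouts": evals})
    return logs


if __name__ == "__main__":
    for i, log in enumerate(dagger([5], [5, -2], n_iters=3)):
        print(i, [r[-1] for r in log["eval_rollouts"]])
```

In this toy setup the agent starting at 5 only reaches the origin after the states it visits have been relabeled by the expert, while the start at -2, which never appears in the aggregated data, keeps failing. Neither logged rollout ever contains the expert acting on its own.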

It's a good idea to look through the other parts of the code and understand them. If you ever apply what you learn here in another area, you will likely need infrastructure similar to what they have provided for you here (e.g. replay buffers, logging, etc.) in addition to the algorithms themselves. I usually find the RL algorithms themselves to be quite short in terms of lines of code; the infrastructure and data pipelines end up needing many more. The provided code also sometimes contains decent starter hyperparameters that you can use in your own projects.

Hope this helps.

Vincent

@vincentkslim
Owner

I've taken a look at the TensorBoard output and I see what you are talking about now. Unfortunately, I'm not sure, but I'll look into it and let you know.
