Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Docs] Fix multi-agent tutorial #1599

Merged
merged 5 commits into from
Oct 4, 2023
Merged

Conversation

matteobettini
Copy link
Contributor

Fix the multi-agent tutorial after the recent "termianted" introduction

Signed-off-by: Matteo Bettini <[email protected]>
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 3, 2023
Signed-off-by: Matteo Bettini <[email protected]>
Signed-off-by: Matteo Bettini <[email protected]>
Signed-off-by: Matteo Bettini <[email protected]>
@matteobettini matteobettini marked this pull request as ready for review October 3, 2023 21:33
Copy link
Contributor

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, see my comment and feel free to merge if you feel it's the right solution
I don't really have a preference

Comment on lines 637 to +650
for tensordict_data in collector:
tensordict_data.set(
("next", "done"),
("next", "agents", "done"),
tensordict_data.get(("next", "done"))
.unsqueeze(-1)
.expand(tensordict_data.get(("next", env.reward_key)).shape),
) # We need to expand the done to match the reward shape (this is expected by the value estimator)
.expand(tensordict_data.get_item_shape(("next", env.reward_key))),
)
tensordict_data.set(
("next", "agents", "terminated"),
tensordict_data.get(("next", "terminated"))
.unsqueeze(-1)
.expand(tensordict_data.get_item_shape(("next", env.reward_key))),
)
# We need to expand the done and terminated to match the reward shape (this is expected by the value estimator)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the best way to approach this?
A common criticism of our tutorials is that they are too complex, I wonder what will be judged as the most complex thing here: (1) boilerplate code such as this, which IMO has little instructive value or (2) an ad-hoc transform that may be a bit too custom to be part of the core lib

Copy link
Contributor Author

@matteobettini matteobettini Oct 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah i agree, what i can do is:

  1. leave as is
  2. make the transform in the tutorial
  3. make the transform in core
  4. We could also consider allowing rewards and dones with different shapes and expanding in the value functionals

What is your preference ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. seems like a stretch, these classes are already very complex and hard to maintain

I think hese operations are actually 2fold: (1) rename without deleting and (2) unsqueeze + expand
We have a transform that does rename, one that does unsqueeze. We just need one that does expand (or expand_as).

The quickest way forward is this:
Merge as is, open an issue with a feature request for the expand_as transform. Wdyt?

Signed-off-by: Matteo Bettini <[email protected]>
@vmoens vmoens added the documentation Improvements or additions to documentation label Oct 4, 2023
@vmoens vmoens merged commit 22fd5ba into pytorch:main Oct 4, 2023
54 of 59 checks passed
vmoens pushed a commit to hyerra/rl that referenced this pull request Oct 10, 2023
@matteobettini matteobettini deleted the tutorial branch April 4, 2024 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants