Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix 1D PP tracer test #362

Merged
merged 10 commits into from
Jun 11, 2024
Merged

Fix 1D PP tracer test #362

merged 10 commits into from
Jun 11, 2024

Conversation

wconstab
Copy link
Contributor

@wconstab wconstab commented May 24, 2024

Stack from ghstack (oldest at bottom):

forgot to enable tracer for tracer test in the last PR

[ghstack-poisoned]
wconstab added a commit that referenced this pull request May 24, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 4a8ad78dd63a3c6cda80e34c6f1ad0d6cb955f9d
Pull Request resolved: #362
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 24, 2024
forgot to enable tracer for tracer test in the last PR

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request May 29, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 6eff83d7fe5af576dc6da0dcdae5bc51b4ac8ec4
Pull Request resolved: #362
kwen2501 added a commit that referenced this pull request May 29, 2024
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at
bottom):
* #362
* __->__ #371

This PR fixes the issue mentioned
[here](pytorch/pytorch#126653 (comment)):
"Module object has no attributed items."

The reason is, a split `ModuleDict` is no longer a `ModuleDict`.

It would be more generally applicable if we use `named_children()` and
`register_module()` to access and update submodules.
forgot to enable tracer for tracer test in the last PR

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request Jun 1, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 94c84cf90aa77f7620b32988e389dcb05a8098f3
Pull Request resolved: #362
Copy link
Contributor

@kwen2501 kwen2501 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a small change:
In tracer mode, we don't require users to provide manual split points, because that is being taken care of by the pipeline_llama_tracer function today:

layers_per_rank = len(model.layers) // parallel_dims.pp
split_spec = {
f"layers.{i * layers_per_rank}": SplitPoint.BEGINNING
for i in range(1, parallel_dims.pp)
}

@kwen2501
Copy link
Contributor

kwen2501 commented Jun 1, 2024

CI should pass after pytorch/pytorch#127607 is landed.

forgot to enable tracer for tracer test in the last PR

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request Jun 1, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 57c680cb3615acba902984ba969a021f02477d38
Pull Request resolved: #362
@kwen2501 kwen2501 changed the title Fix 1D PP tracer test, add 2D test Fix 1D PP tracer test Jun 1, 2024
forgot to enable tracer for tracer test in the last PR

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request Jun 3, 2024
Separate the addition of 2D test from original PR #362 for easier review and landing.

Also changed `.items()` to `named_children()` for more general submodule access. See similar changes in #371. 

[ghstack-poisoned]
kwen2501 added a commit that referenced this pull request Jun 3, 2024
Separate the addition of 2D test from original PR #362 for easier review and landing.

Also changed `.items()` to `named_children()` for more general submodule access. See similar changes in #371. 

[ghstack-poisoned]
@wconstab
Copy link
Contributor Author

wconstab commented Jun 7, 2024

Made a small change:
In tracer mode, we don't require users to provide manual split points, because that is being taken care of by the pipeline_llama_tracer function today:

I think this is not the right way to do this. If we want the layer-split to be 'automatic', i think it should be automatic for both frontends, and we can delete the _split_points cmdline arg.

or if we want to have the cmdline arg, we should keep it behaving the same for both frontends.

I'd propose to first keep the arg for both frontends, and then do a PR that makes the cmdline arg optional and uses automation for both PRs.

@kwen2501
Copy link
Contributor

kwen2501 commented Jun 7, 2024

Sounds good to me

[ghstack-poisoned]
wconstab pushed a commit that referenced this pull request Jun 10, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 63f458eeb0e225e6dd43fd53249c37a5099245a8
Pull Request resolved: #362
[ghstack-poisoned]
wconstab pushed a commit that referenced this pull request Jun 10, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 64664d32e2a7414cd0133dd7082740304d06a347
Pull Request resolved: #362
[ghstack-poisoned]
wconstab pushed a commit that referenced this pull request Jun 10, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: fc9da54f5c105cc14927db85e97ab8c9de6b04e4
Pull Request resolved: #362
[ghstack-poisoned]
wconstab pushed a commit that referenced this pull request Jun 10, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: f31ecca90bab6745e7fc2802c56b5c9796a857ae
Pull Request resolved: #362
[ghstack-poisoned]
wconstab pushed a commit that referenced this pull request Jun 10, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 1cb137911f88daa47b57757346dad55aa736429e
Pull Request resolved: #362
Copy link
Contributor

@kwen2501 kwen2501 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approve

@wconstab wconstab merged commit 56496bb into gh/wconstab/30/base Jun 11, 2024
5 checks passed
wconstab pushed a commit that referenced this pull request Jun 11, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 1cb137911f88daa47b57757346dad55aa736429e
Pull Request resolved: #362
@wconstab wconstab deleted the gh/wconstab/30/head branch June 11, 2024 16:53
tianyu-l pushed a commit to tianyu-l/torchtitan_intern24 that referenced this pull request Aug 16, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 1cb137911f88daa47b57757346dad55aa736429e
Pull Request resolved: pytorch#362
tianyu-l pushed a commit that referenced this pull request Aug 16, 2024
Separate the addition of 2D test from original PR #362 for easier review and landing.

Also changed `.items()` to `named_children()` for more general submodule access. See similar changes in #371. 

[ghstack-poisoned]
tianyu-l pushed a commit that referenced this pull request Aug 16, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 57c680cb3615acba902984ba969a021f02477d38
Pull Request resolved: #362
philippguevorguian pushed a commit to YerevaNN/YNNtitan that referenced this pull request Aug 17, 2024
forgot to enable tracer for tracer test in the last PR

ghstack-source-id: 1cb137911f88daa47b57757346dad55aa736429e
Pull Request resolved: pytorch#362
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants