Fp8 all gather hack #1136

Open
wants to merge 3 commits into
base: ngoyal_added_zero2_shard_modelparams_multiple_gpus

Commits on Sep 19, 2023

  1. fp8 allgather

    jspark1105 committed Sep 19, 2023
    0224797

Commits on Sep 22, 2023

  1. don't shard norm weights

    jspark1105 committed Sep 22, 2023
    db6a1c7

Commits on Oct 15, 2023

  1. use main_grad for higher precision gradient accumulation; update amax during post_backward_hook

    jspark1105 committed Oct 15, 2023
    c0f4b97