Release v1.44.1: Support for a3-ultragpu-8g VMs and GKE, Slurm clusters #3478
tpdownes announced in Announcements
Release notes v1.44.1
This release announces Toolkit support for the new A3 Ultra machine type (a3-ultragpu-8g) from Google Cloud. This machine type includes 8 NVIDIA H200 GPUs, each with dedicated CX-7 networking that supports RDMA via RoCE.
The release includes 4 blueprints that maximize performance for the machine type.
Example solutions using NCCL are provided for blueprints running under a scheduler.
This discussion was created from the release Release v1.44.1: Support for a3-ultragpu-8g VMs and GKE, Slurm clusters.