SageMaker HyperPod CLI v1.0.0

@jswudi jswudi released this 10 Sep 00:05
· 15 commits to main since this release
f365f57

SageMaker HyperPod CLI is a command-line tool for creating and managing training jobs on SageMaker HyperPod clusters orchestrated by Amazon EKS.

Data scientists can train foundation models on a SageMaker HyperPod cluster that uses an Amazon EKS cluster as its orchestrator. With the SageMaker HyperPod CLI, they can find available SageMaker HyperPod clusters, submit training jobs (Pods), and manage their workloads. The CLI accepts job submissions through a training job schema file and supports listing, describing, cancelling, and executing jobs. Scientists can also use Kubeflow Training Operator, Kueue (a Kubernetes tool for job queuing), and SageMaker-managed MLflow to manage ML experiments and training runs.
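As a rough illustration of that workflow, the Python sketch below assembles the command lines for finding a cluster, submitting a job from a schema file, and then listing, describing, and cancelling it. The `hyperpod` executable name, the subcommands, and the flags are assumptions based on the project README rather than on this release note, and the cluster name, config path, and job name are hypothetical placeholders. Confirm the exact spelling with `hyperpod --help` before relying on it. The sketch only builds and prints the commands; it does not run them.

```python
import shlex

# Assumed executable name; confirm against your installation.
CLI = "hyperpod"


def build_workflow(cluster_name: str, config_file: str, job_name: str) -> list[list[str]]:
    """Return argv lists for a find, submit, inspect, cancel workflow.

    Subcommand and flag names are assumptions taken from the project
    README, not from this release note; verify them with `hyperpod --help`.
    """
    return [
        # Discover the HyperPod clusters available to the caller.
        [CLI, "list-clusters"],
        # Point the CLI at one cluster's EKS orchestrator.
        [CLI, "connect-cluster", "--cluster-name", cluster_name],
        # Submit a training job described by a schema (config) file.
        [CLI, "start-job", "--config-file", config_file],
        # List jobs, describe one job, then cancel it.
        [CLI, "list-jobs"],
        [CLI, "get-job", "--job-name", job_name],
        [CLI, "cancel-job", "--job-name", job_name],
    ]


if __name__ == "__main__":
    # Hypothetical placeholder values, for illustration only.
    for argv in build_workflow("my-cluster", "./config.yaml", "demo-job"):
        print(shlex.join(argv))
```

Building the argument lists separately from executing them makes it easy to review the commands first; each list could later be passed to `subprocess.run` once the names are confirmed.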