@LSC527 FP8 accelerates training significantly when the model is relatively large (> 6B parameters). MS-AMP can reduce memory usage to enable a larger batch size, and it can also work together with TransformerEngine to further improve FP8 training speed.
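As a hedged illustration of the use_te switch mentioned above (transformer_engine.pytorch is NVIDIA TransformerEngine's Python module; the opt_level choice and the rest of this snippet are illustrative assumptions, not taken from this thread):

# Sketch: only turn on the TransformerEngine path when the package is importable.
try:
    import transformer_engine.pytorch  # NVIDIA TransformerEngine
    te_available = True
except ImportError:
    te_available = False

msamp_section = {
    "enabled": True,
    "opt_level": "O2",        # illustrative choice of MS-AMP optimization level
    "use_te": te_available,   # cooperate with TransformerEngine when it is installed
}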
What's the issue, what's expected?:
An error occurs when using MS-AMP to do LLM SFT (supervised fine-tuning).
MS-AMP DeepSpeed config:
"msamp": {
"enabled": true,
"opt_level": "O1|O2|O3", # all tried
"use_te": false
}
How to reproduce it?:
Follow the DeepSpeed-Chat setup and make two small code modifications to enable MS-AMP in DeepSpeed-Chat/training/step1_supervised_finetuning/main.py (a combined sketch of both changes follows below):
line 20 modify: import deepspeed -> from msamp import deepspeed
line 230 add:
ds_config["msamp"] = {
"enabled": True,
"opt_level": "O1|O2|O3",
"use_te": False
}
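Putting the two modifications together, here is a minimal sketch of how they might look in context. Only the import swap and the msamp dictionary are from this report; the toy model, optimizer, batch-size values, and the initialize arguments are illustrative placeholders (DeepSpeed-Chat builds its model/optimizer from a pretrained causal-LM checkpoint and a much larger config):

import torch
from msamp import deepspeed  # change 1: was `import deepspeed`

# Tiny stand-in model/optimizer so the snippet is self-contained.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

ds_config = {
    "train_batch_size": 8,                # illustrative values, not from the report
    "train_micro_batch_size_per_gpu": 1,
    # Change 2: the MS-AMP section from the report (pick one opt_level per run).
    "msamp": {"enabled": True, "opt_level": "O2", "use_te": False},
}

# deepspeed.initialize returns (engine, optimizer, dataloader, lr_scheduler).
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model, optimizer=optimizer, config=ds_config
)

Run the script under the deepspeed launcher (or the DeepSpeed-Chat launch scripts) so distributed initialization is set up as usual.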
Log message or snapshot?:
Additional information:
env: ghcr.io/azure/msamp:v0.4.0-cuda12.2
gpu: 8 x H100
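For anyone reproducing the environment, a launch command along these lines should work (only the image tag is from this report; the mount path is a placeholder, and --gpus all assumes the NVIDIA Container Toolkit is installed):

docker run -it --rm --gpus all --ipc=host \
    -v /path/to/DeepSpeed-Chat:/workspace/DeepSpeed-Chat \
    ghcr.io/azure/msamp:v0.4.0-cuda12.2 bash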