Problem
Are there any specific GPU or environment requirements for running multi-GPU inference with the PaddleNLP pretrained weights of chatglm2-6b? I tried it on 4090 GPUs but hit an error. Could you please help me figure out what is causing it, e.g. the environment or the multi-GPU inference code? Thanks.
Environment: 4090 GPUs + CUDA 11.8 + paddlepaddle-gpu==3.0.0b1 + PaddleNLP 3.0.0b2.post20241105 (branch: dev_20240926_update_chatglmv2)
Command: python -u -m paddle.distributed.launch --gpus "0,1" glm2_infer.py
The contents of glm2_infer.py are as follows:
'''
import paddle
from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer
from paddle.distributed import fleet
import time

tokenizer = AutoTokenizer.from_pretrained('/code/pretrain/pd/THUDM/chatglm2-6b')

# Tensor parallelism: one model shard per launched process.
tensor_parallel_degree = paddle.distributed.get_world_size()
print(f"***tensor_parallel_degree={tensor_parallel_degree}")
tensor_parallel_rank = 0
if tensor_parallel_degree > 1:
    strategy = fleet.DistributedStrategy()
    strategy.hybrid_configs = {
        "dp_degree": 1,
        "mp_degree": tensor_parallel_degree,
        "pp_degree": 1,
        "sharding_degree": 1,
    }
    fleet.init(is_collective=True, strategy=strategy)
    hcg = fleet.get_hybrid_communicate_group()
    tensor_parallel_rank = hcg.get_model_parallel_rank()

model = AutoModelForCausalLM.from_pretrained(
    "/code/pretrain/pd/THUDM/chatglm2-6b",
    tensor_parallel_degree=tensor_parallel_degree,
    tensor_parallel_rank=tensor_parallel_rank,
    dtype='bfloat16')

question = '世界上第二高的山峰是哪座?'  # "Which is the second-highest mountain in the world?"
query = f"[Round 0]\n\n问:{question}\n\n答:"  # ChatGLM2 prompt template
input_ids = tokenizer(query, return_tensors='pd')
#print(model(**input_ids))

# Time generate() twice; the second run excludes one-off warm-up cost.
b_time = time.time()
output = model.generate(**input_ids, decode_strategy='greedy_search', max_new_tokens=150)
out = tokenizer.decode(output[0][0])
print("*" * 45)
print(f"HUMAN:{question}\nAI:{out}\ncost_time:{time.time()-b_time}")
b_time = time.time()
output = model.generate(**input_ids, decode_strategy='greedy_search', max_new_tokens=150)
out = tokenizer.decode(output[0][0])
print("*" * 45)
print(f"HUMAN:{question}\nAI:{out}\ncost_time:{time.time()-b_time}")
'''
The run log is as follows:
'''
[2024-11-05 06:10:16,275] [ INFO] topology.py:370 - Total 2 sharding comm group(s) create successfully!
I1105 06:10:16.275830 31043 process_group_nccl.cc:150] ProcessGroupNCCL pg_timeout_ 1800000
I1105 06:10:16.275833 31043 process_group_nccl.cc:151] ProcessGroupNCCL nccl_comm_init_option_ 0
I1105 06:10:16.275861 31043 process_group_nccl.cc:150] ProcessGroupNCCL pg_timeout_ 1800000
I1105 06:10:16.275863 31043 process_group_nccl.cc:151] ProcessGroupNCCL nccl_comm_init_option_ 0
[2024-11-05 06:10:16,275] [ INFO] topology.py:290 - HybridParallelInfo: rank_id: 0, mp_degree: 2, sharding_degree: 1, pp_degree: 1, dp_degree: 1, sep_degree: 1, mp_group: [0, 1], sharding_group: [0], pp_group: [0], dp_group: [0], sep:group: None, check/clip group: [0, 1]
[2024-11-05 06:10:16,276] [ INFO] - We are using <class 'paddlenlp.transformers.chatglm_v2.modeling.ChatGLMv2ForCausalLM'> to load '/code/pretrain/pd/THUDM/chatglm2-6b'.
[2024-11-05 06:10:16,277] [ INFO] - Loading configuration file /code/pretrain/pd/THUDM/chatglm2-6b/config.json
[2024-11-05 06:10:16,278] [ INFO] - Loading weights file /code/pretrain/pd/THUDM/chatglm2-6b/model_state.pdparams
[2024-11-05 06:11:34,779] [ INFO] - Starting to convert orignal state_dict to tensor parallel state_dict.
[2024-11-05 06:11:54,819] [ INFO] - Loaded weights file from disk, setting weights to model.
[2024-11-05 06:12:21,429] [ INFO] - All model checkpoint weights were used when initializing ChatGLMv2ForCausalLM.
[2024-11-05 06:12:21,429] [ INFO] - All the weights of ChatGLMv2ForCausalLM were initialized from the model checkpoint at /code/pretrain/pd/THUDM/chatglm2-6b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMv2ForCausalLM for predictions without further training.
[2024-11-05 06:12:21,494] [ INFO] - Generation config file not found, using a generation config created from the model config.
LAUNCH INFO 2024-11-05 06:12:23,289 Pod failed
LAUNCH ERROR 2024-11-05 06:12:23,289 Container failed !!!
Container rank 0 status failed cmd ['/usr/bin/python', '-u', 'glm2_infer.py'] code -7 log log/workerlog.0
LAUNCH INFO 2024-11-05 06:12:23,289 ------------------------- ERROR LOG DETAIL -------------------------
:151] ProcessGroupNCCL nccl_comm_init_option_ 0
[2024-11-05 06:10:14,520] [ INFO] topology.py:370 - Total 2 pipe comm group(s) create successfully!
W1105 06:10:14.534988 31043 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.9, Driver API Version: 12.0, Runtime API Version: 11.8
W1105 06:10:14.536003 31043 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
/usr/local/lib/python3.10/dist-packages/paddle/distributed/communication/group.py:128: UserWarning: Current global rank 0 is not in group default_pg10
warnings.warn(
[2024-11-05 06:10:16,275] [ INFO] topology.py:370 - Total 2 data comm group(s) create successfully!
I1105 06:10:16.275671 31043 process_group_nccl.cc:150] ProcessGroupNCCL pg_timeout 1800000
I1105 06:10:16.275683 31043 process_group_nccl.cc:151] ProcessGroupNCCL nccl_comm_init_option_ 0
[2024-11-05 06:10:16,275] [ INFO] topology.py:370 - Total 1 model comm group(s) create successfully!
[2024-11-05 06:10:16,275] [ INFO] topology.py:370 - Total 2 sharding comm group(s) create successfully!
I1105 06:10:16.275830 31043 process_group_nccl.cc:150] ProcessGroupNCCL pg_timeout_ 1800000
I1105 06:10:16.275833 31043 process_group_nccl.cc:151] ProcessGroupNCCL nccl_comm_init_option_ 0
I1105 06:10:16.275861 31043 process_group_nccl.cc:150] ProcessGroupNCCL pg_timeout_ 1800000
I1105 06:10:16.275863 31043 process_group_nccl.cc:151] ProcessGroupNCCL nccl_comm_init_option_ 0
[2024-11-05 06:10:16,275] [ INFO] topology.py:290 - HybridParallelInfo: rank_id: 0, mp_degree: 2, sharding_degree: 1, pp_degree: 1, dp_degree: 1, sep_degree: 1, mp_group: [0, 1], sharding_group: [0], pp_group: [0], dp_group: [0], sep:group: None, check/clip group: [0, 1]
[2024-11-05 06:10:16,276] [ INFO] - We are using <class 'paddlenlp.transformers.chatglm_v2.modeling.ChatGLMv2ForCausalLM'> to load '/code/pretrain/pd/THUDM/chatglm2-6b'.
[2024-11-05 06:10:16,277] [ INFO] - Loading configuration file /code/pretrain/pd/THUDM/chatglm2-6b/config.json
[2024-11-05 06:10:16,278] [ INFO] - Loading weights file /code/pretrain/pd/THUDM/chatglm2-6b/model_state.pdparams
[2024-11-05 06:11:34,779] [ INFO] - Starting to convert orignal state_dict to tensor parallel state_dict.
[2024-11-05 06:11:54,819] [ INFO] - Loaded weights file from disk, setting weights to model.
[2024-11-05 06:12:21,429] [ INFO] - All model checkpoint weights were used when initializing ChatGLMv2ForCausalLM.
[2024-11-05 06:12:21,429] [ INFO] - All the weights of ChatGLMv2ForCausalLM were initialized from the model checkpoint at /code/pretrain/pd/THUDM/chatglm2-6b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMv2ForCausalLM for predictions without further training.
[2024-11-05 06:12:21,494] [ INFO] - Generation config file not found, using a generation config created from the model config.
LAUNCH INFO 2024-11-05 06:12:23,290 Exit code -7
'''
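PS: one extra data point that may help with diagnosis, though I have not confirmed it is the cause here. paddle.distributed.launch reports a worker's status the way Python's subprocess module does, so a negative exit code is the number of the signal that killed the worker; on Linux, signal 7 is SIGBUS. In containerized setups, SIGBUS during NCCL/shared-memory initialization is often a symptom of an undersized /dev/shm. A quick check of the signal name:
'''
import signal

# The launcher log shows "code -7": a negative status is -(signal number),
# and signal 7 on Linux is SIGBUS.
print(signal.Signals(7).name)  # -> SIGBUS
'''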