Pretraining from scratch: loss decreases slowly #426

Open
dage0127 opened this issue Nov 6, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

dage0127 commented Nov 6, 2024

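Hi, I have a question.
I am using the same model architecture as Qwen2 with adjusted parameters; the model has about 1.1B parameters. After 10 days of training on 8 GPUs, covering 50 million rows of data, the training loss is still hovering around 2.8. Based on your previous training experience, do you have any suggestions for fixing this?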
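To put the reported numbers in context, here is a rough back-of-the-envelope sketch. It assumes the loss is the usual mean token-level cross-entropy in nats, and it assumes a vocabulary of 151,936 tokens (the size used by the small public Qwen2 configs; the poster's actual tokenizer is not stated). The function names are illustrative only.

```python
import math

def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss_nats)

def uniform_baseline_loss(vocab_size: int) -> float:
    """Loss of a model that guesses uniformly over the vocabulary."""
    return math.log(vocab_size)

# Figures taken from the post.
reported_loss = 2.8
rows_seen = 50_000_000
days = 10
gpus = 8

# Assumption: vocabulary size of the small public Qwen2 configs.
ASSUMED_VOCAB_SIZE = 151_936

ppl = perplexity(reported_loss)
baseline = uniform_baseline_loss(ASSUMED_VOCAB_SIZE)
rows_per_gpu_per_day = rows_seen / days / gpus

print(f"perplexity at loss {reported_loss}: {ppl:.2f}")
print(f"uniform-guess loss for assumed vocab: {baseline:.2f}")
print(f"rows per GPU per day: {rows_per_gpu_per_day:,.0f}")
```

Under these assumptions, a loss of 2.8 is far below the uniform-guess baseline, so the model has learned something. Whether 2.8 is a plateau or simply slow progress cannot be read from this number alone, since the post does not say how many tokens each row contains.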

shibing624 (Owner) commented

Reduce the dataset to 1,000 samples, train for 10 epochs, and watch how the loss changes and how the model performs.
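The idea behind this check is that a model and training loop that work correctly should be able to overfit a tiny dataset, so the loss should fall well below 2.8 within a few epochs. If it does not, the problem is more likely in the training setup than in the data volume. A minimal sketch for cutting the subset is shown below. It assumes the corpus is stored with one record per line (for example plain text or JSONL); `make_subset` and the file names in the usage comment are hypothetical and not part of the project.

```python
def make_subset(src: str, dst: str, n: int = 1000) -> int:
    """Copy the first n non-empty lines of src into dst.

    Assumes one training record per line. Returns the number of
    records actually written (fewer than n if src is short).
    """
    written = 0
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            if written >= n:
                break
            if not line.strip():
                continue  # skip blank lines so they do not count as records
            # Make sure every record ends with exactly one newline.
            fout.write(line if line.endswith("\n") else line + "\n")
            written += 1
    return written

# Hypothetical usage (paths are placeholders):
#   make_subset("pretrain_corpus.txt", "pretrain_subset_1k.txt", n=1000)
```

Point the existing pretraining script at the subset file, set the number of epochs to 10, and compare the loss at the end of each epoch.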
