On the composition of reward model training data #409

Open
Eren139 opened this issue Aug 26, 2024 · 5 comments
Labels
question Further information is requested

Comments

Eren139 commented Aug 26, 2024

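I'd like to ask: when this project trained the medical reward model, did it use only medical-domain preference datasets, or were they mixed with general-domain preference data? When I train a reward model on medical preference data alone, it overfits severely.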

Eren139 added the question label Aug 26, 2024
shibing624 (Owner) commented

They need to be mixed.

Eren139 (Author) commented Aug 26, 2024

> They need to be mixed.

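Thank you very much for the answer. Could I also ask about the mixing ratio for the reward model data? Roughly what share is general-domain data and what share is medical data?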

shibing624 (Owner) commented

10:1, general-domain to medical (general is the 10).
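For readers who want to try this ratio, here is a minimal sketch of one way to build a 10:1 mix. It is not the project's actual pipeline: the thread does not say whether the ratio counts preference pairs or tokens, so this assumes pairs. The function name `mix_preference_data`, the record fields (`domain`, `prompt`, `chosen`, `rejected`), and the toy data are all hypothetical.

```python
import random


def mix_preference_data(general, medical, general_per_medical=10, seed=0):
    """Keep every medical pair and subsample general pairs to the target ratio.

    Assumption: the 10:1 ratio is counted in preference pairs, not tokens.
    If there are too few general pairs, all of them are used.
    """
    rng = random.Random(seed)
    n_general = min(len(general), general_per_medical * len(medical))
    mixed = rng.sample(general, n_general) + list(medical)
    rng.shuffle(mixed)  # interleave the two domains
    return mixed


# Toy placeholder records; real preference pairs would carry actual text.
general = [
    {"domain": "general", "prompt": f"g{i}", "chosen": "a", "rejected": "b"}
    for i in range(5000)
]
medical = [
    {"domain": "medical", "prompt": f"m{i}", "chosen": "a", "rejected": "b"}
    for i in range(200)
]

mixed = mix_preference_data(general, medical)
n_medical = sum(r["domain"] == "medical" for r in mixed)
print(len(mixed), n_medical, len(mixed) - n_medical)  # 2200 200 2000
```

Subsampling the general set (rather than duplicating medical pairs) keeps every medical example exactly once; whether that is the better choice depends on how much data of each kind is available.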

litsh commented Sep 3, 2024

Hello, would it be possible to discuss reward model training with you? If convenient, could you leave your contact information?

Eren139 (Author) commented Sep 6, 2024

> Hello, would it be possible to discuss reward model training with you? If convenient, could you leave your contact information?

Hi, I've only recently started learning about this and am not very familiar with it yet. If you'd like, you can add me on WeChat: Eren_139


3 participants