DPO code #2
Hi! We refer you to Section 4.2 of our paper for details on DPO. We use the same codebase as ChatGLM-RLHF. We currently have no plans to release the code and data for DPO.
Thanks for the reply! ChatGLM-RLHF has no released code. Did you modify Megatron-LM for long-context DPO, use NeMo-Aligner, or use another implementation?
Hi, our DPO code is based on Megatron-LM.
Hi, and thanks for sharing your work! Could you elaborate on which specific parts of Megatron-LM (https://github.com/NVIDIA/Megatron-LM) you used?
Are there any plans to release the DPO code, or could you give a brief introduction to how you conducted long-context DPO?