Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
2
[question] best way to have my own reward model which is backed by rules
🏋 PPO
Related to PPO
❓ question
Seeking clarification or more information
#2518
opened Dec 24, 2024 by
yananchen1989
Soft Actor-Critic (SAC) Trainer
✨ enhancement
New feature or request
#2517
opened Dec 23, 2024 by
AMindToThink
3 tasks
RLOO trainer epochs/steps/episodes calculations seems not to be working properly
🐛 bug
Something isn't working
🏋 RLOO
Related to RLOO
#2515
opened Dec 23, 2024 by
dawidm
7 of 9 tasks
Checkpointing is failing with SFTTrainer PEFT LoRA on DeepSpeed Zero-3
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
🏋 SFT
Related to SFT
#2514
opened Dec 21, 2024 by
SwayamInSync
7 of 9 tasks
Absence of ref_model_name in the file which located in docs/source/best_of_n.mdx
#2508
opened Dec 20, 2024 by
aivolcano
7 of 9 tasks
DDPO checkpoint
🐛 bug
Something isn't working
🏋 DPPO
Related to DDPO
🙋 help from community wanted
Open invitation for community members to contribute
⏳ needs more info
Additional information or clarification is required to proceed
#2505
opened Dec 20, 2024 by
nguyenhoa-uit
5 of 9 tasks
Spectrum training support
✨ enhancement
New feature or request
🏋 SFT
Related to SFT
#2504
opened Dec 19, 2024 by
ggbetz
[bug] objective/entropy < 0 when using rlootrainer and ppotrainer
🙋 help from community wanted
Open invitation for community members to contribute
🏋 PPO
Related to PPO
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
#2496
opened Dec 17, 2024 by
macheng6
[Tracking issue] Integrate native liger-kernel losses
✨ enhancement
New feature or request
🧒 good second issue
Good for contributors with basic project familiarity
#2495
opened Dec 17, 2024 by
qgallouedec
5 tasks
DeepSpeed with trl
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 DPO
Related to DPO
⏳ needs more info
Additional information or clarification is required to proceed
#2490
opened Dec 16, 2024 by
sagie-dekel
7 of 9 tasks
Main documentation instruction image needs a description
#2489
opened Dec 16, 2024 by
Kallinteris-Andreas
RewardConfig's max_length argument docstring should indicate that it filters out dataset, rather than truncating it
📚 documentation
#2488
opened Dec 16, 2024 by
Kallinteris-Andreas
UserWarning for train dpo with lora: None of the inputs have requires_grad=True. Gradients will be None
#2486
opened Dec 16, 2024 by
xkw666
7 of 9 tasks
Trainer forces the use of a specific collator
🏋 GKD
Related to GKD
❓ question
Seeking clarification or more information
#2481
opened Dec 14, 2024 by
hteague-qti
KeyError in DPO Trainer, evaluation_loop
🐛 bug
Something isn't working
🏋 DPO
Related to DPO
#2473
opened Dec 13, 2024 by
qingjianbuyi
7 of 9 tasks
A question about rlootrainer
🙋 help from community wanted
Open invitation for community members to contribute
❓ question
Seeking clarification or more information
🏋 RLOO
Related to RLOO
#2472
opened Dec 13, 2024 by
macheng6
1 of 3 tasks
Provide Descriptions (READMEs) for trl-lib/dataset
📚 documentation
Improvements or additions to documentation
✨ enhancement
New feature or request
👶 good first issue
Good for newcomers
🙋 help from community wanted
Open invitation for community members to contribute
🗃️ data
Related to data
#2470
opened Dec 13, 2024 by
Kallinteris-Andreas
Packing in DPOTrainer
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2469
opened Dec 13, 2024 by
zhc7
Probably a more reasonable method of packing
🧒 good second issue
Good for contributors with basic project familiarity
🙋 help from community wanted
Open invitation for community members to contribute
🏋 SFT
Related to SFT
✨ enhancement
New feature or request
#2466
opened Dec 12, 2024 by
AIR-hl
Why isn't Soft-Actor Critic (SAC) Available for RLHF?
❓ question
Seeking clarification or more information
#2465
opened Dec 11, 2024 by
AMindToThink
3 tasks
Evaluation with OnlineDPO
🏋 Online DPO
Related to Online DPO
🐛 bug
Something isn't working
#2464
opened Dec 11, 2024 by
MohamedAliRashad
7 of 9 tasks
Add the possibility to skip prepare_model_for_kbit_training
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2459
opened Dec 10, 2024 by
hugoabonizio