
I added HaluEval to lm-eval-harness, can you please double check? #18

Open
pminervini opened this issue Dec 7, 2023 · 3 comments

@pminervini
Contributor

Here is my pull request: EleutherAI/lm-evaluation-harness#1076

Thanks!

@Xiaoxue-xx

Sorry for the late reply. We have double-checked the PR and see no problems with the implementation.

@pminervini
Contributor Author

@Xiaoxue-xx thank you! 🙂 You can find the latest version of the tasks here: https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/tree/main/src/backend/tasks/halueval
We used it for our Hallucinations Leaderboard: https://huggingface.co/blog/leaderboards-on-the-hub-hallucinations

@Xiaoxue-xx, would you also be happy if we switched from an open-ended generation task to a multiple-choice task, where the model is only allowed to answer "yes" or "no"?

@Xiaoxue-xx

Sure. Thank you for your attention to HaluEval!
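For readers following along: the multiple-choice variant discussed above would typically be expressed in lm-evaluation-harness (v0.4-style) as a YAML task config with `output_type: multiple_choice` and a fixed `doc_to_choice` list. The sketch below is illustrative only; the task name, dataset id, prompt wording, and label field are assumptions, not the contents of the merged PR:

```yaml
# Hypothetical sketch of a multiple-choice HaluEval task in the
# lm-evaluation-harness v0.4 YAML task format. The dataset path,
# split name, prompt, and label field are illustrative assumptions.
task: halueval_qa_mc
dataset_path: some-org/HaluEval   # assumed Hugging Face dataset id
dataset_name: qa
output_type: multiple_choice
test_split: data
doc_to_text: >-
  Does the following answer contain hallucinated content?
  Answer yes or no.
  Question: {{question}}
  Answer: {{answer}}
  Your judgement:
doc_to_choice: ["yes", "no"]
doc_to_target: "{{hallucination}}"  # assumed field holding "yes"/"no"
metric_list:
  - metric: acc
```

With `output_type: multiple_choice`, the harness scores the log-likelihood of each entry in `doc_to_choice` rather than sampling free-form text, which is what constrains the model to "yes"/"no" as proposed in the thread.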
