
I added HaluEval to lm-eval-harness, can you please double check? #18

Open
pminervini opened this issue Dec 7, 2023 · 3 comments

@pminervini
Contributor

Here is my pull request: EleutherAI/lm-evaluation-harness#1076

Thanks!

@Xiaoxue-xx

Sorry for the late reply. We have double-checked the PR and see no problems with the implementation.

@pminervini
Contributor Author

@Xiaoxue-xx thank you! 🙂 You can find the latest version of the tasks here: https://huggingface.co/spaces/hallucinations-leaderboard/leaderboard/tree/main/src/backend/tasks/halueval
We used it for our Hallucinations Leaderboard: https://huggingface.co/blog/leaderboards-on-the-hub-hallucinations

@Xiaoxue-xx, would you also be happy if we switched from an open-ended generation task to a multiple-choice task, where the model is only allowed to answer "yes" or "no"?

@Xiaoxue-xx

Sure. Thank you for your attention to HaluEval!
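For readers following along: the multiple-choice variant discussed above would typically be expressed in lm-evaluation-harness (v0.4-style) as a YAML task config with `output_type: multiple_choice` and a fixed `doc_to_choice` list. The sketch below is illustrative only; the task name, dataset id, prompt wording, and label field are assumptions, not the contents of the merged PR:

```yaml
# Hypothetical sketch of a multiple-choice HaluEval task in the
# lm-evaluation-harness v0.4 YAML task format. The dataset path,
# split name, prompt, and label field are illustrative assumptions.
task: halueval_qa_mc
dataset_path: some-org/HaluEval   # assumed Hugging Face dataset id
dataset_name: qa
output_type: multiple_choice
test_split: data
doc_to_text: >-
  Does the following answer contain hallucinated content?
  Answer yes or no.
  Question: {{question}}
  Answer: {{answer}}
  Your judgement:
doc_to_choice: ["yes", "no"]
doc_to_target: "{{hallucination}}"  # assumed field holding "yes"/"no"
metric_list:
  - metric: acc
```

With `output_type: multiple_choice`, the harness scores the log-likelihood of each entry in `doc_to_choice` rather than sampling free-form text, which is what constrains the model to "yes"/"no" as proposed in the thread.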
