Add MBPP #2247

hjlee1371 · 2024-08-23T17:06:08Z

Hi, I added the widely used MBPP benchmark. This partially resolves #1157.

Similar to #1992, the implementation relies on pass@k from the HF evaluate module, so it requires the environment variable HF_ALLOW_CODE_EVAL=1 to be set.
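For readers unfamiliar with the metric, here is a minimal sketch of the standard unbiased pass@k estimator, together with one way to set the environment variable from Python. It is illustrative only and is not the code in this PR: the function name `pass_at_k` and the sample counts are hypothetical, and the actual task delegates scoring to the HF evaluate module.

```python
import os
from math import comb

# Opt in to executing model-generated code. Set this before the metric runs.
# Only do so in a sandboxed environment, since untrusted code gets executed.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all tests
    k: sample budget being evaluated
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a passing one.
        return 1.0
    # 1 - P(all k drawn samples fail)
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem (n, c) counts; the benchmark score is the mean.
results = [(1, 1), (1, 0), (1, 1), (1, 0)]
score = 100 * sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With a single greedy sample per problem (n = 1, k = 1), the estimator reduces to the fraction of problems whose one completion passes every test.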

Below are results for several popular pretrained models, alongside the scores reported in the Llama 3 and Gemma 2 papers. Note that the prompting follows the original MBPP paper, which differs from bigcode-eval.

| Model | 3-shot MBPP pass@1 (lm-eval) | Reported in Llama 3 paper | Reported in Gemma 2 paper |
|---|---|---|---|
| Meta-Llama-3-8B | 46.0 | - | - |
| Meta-Llama-3.1-8B | 47.0 | 47.6 | - |
| gemma-7b | 44.8 | 44.4 | 44.4 |
| Mistral-7b-v0.1 | 37.8 | 47.5 | 40.2 |
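On the prompting point, the sketch below shows roughly how a few-shot prompt in the style of the original MBPP paper can be assembled. The template text and the `[BEGIN]`/`[DONE]` delimiters are my recollection of the paper's format, so treat the exact wording and whitespace as an assumption; the task config in this PR may differ. The helper names are hypothetical, and the field names (`text`, `test_list`, `code`) are those of the MBPP dataset.

```python
from typing import Mapping, Sequence

# Assumed template, recalled from the MBPP paper; not copied from this PR.
INSTRUCTION = (
    "You are an expert Python programmer, and here is your task: {text} "
    "Your code should pass these tests:\n\n{tests}\n[BEGIN]\n"
)


def format_problem(problem: Mapping, with_solution: bool) -> str:
    """Render one problem; few-shot examples also include the solution."""
    tests = "\n".join(problem["test_list"])
    prompt = INSTRUCTION.format(text=problem["text"], tests=tests)
    if with_solution:
        prompt += f"{problem['code']}\n[DONE]\n"
    return prompt


def build_few_shot_prompt(shots: Sequence[Mapping], target: Mapping) -> str:
    """Concatenate solved examples, then the unsolved target problem."""
    parts = [format_problem(shot, with_solution=True) for shot in shots]
    parts.append(format_problem(target, with_solution=False))
    return "\n".join(parts)


# Hypothetical toy problems, for illustration only.
shot = {
    "text": "Write a function to add two numbers.",
    "test_list": ["assert add(1, 2) == 3"],
    "code": "def add(a, b):\n    return a + b",
}
target = {
    "text": "Write a function to square a number.",
    "test_list": ["assert square(3) == 9"],
}
prompt = build_few_shot_prompt([shot], target)
```

Generation would then typically stop at `[DONE]`, and the text before it is executed against the problem's assertions.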

go2ready · 2025-01-13T11:58:29Z

Hello! Any blockers for adding MBPP?

baberabb · 2025-01-15T18:42:23Z

thanks for the PR!

hjlee1371 added 4 commits August 22, 2024 00:32

add mbpp

95d7408

fix some bugs

1822228

add README for mbpp

2adb154

update README

b2142a3

hjlee1371 requested review from haileyschoelkopf, lintangsutawika and baberabb as code owners August 23, 2024 17:06

baberabb added 2 commits January 15, 2025 18:38

Merge branch 'main' into mbpp

1a707ad

nits

c9a4004

baberabb approved these changes Jan 15, 2025

baberabb merged commit 5db23e2 into EleutherAI:main Jan 15, 2025
7 of 8 checks passed
