Actions: EleutherAI/lm-evaluation-harness

Tasks Modified

2,845 workflow runs

Add from dataframe
Tasks Modified #4126: Pull request #2655 synchronize by AMindToThink
January 25, 2025 08:29 Action required AMindToThink:interventionmodeldebugs
Add from dataframe
Tasks Modified #4125: Pull request #2655 opened by AMindToThink
January 25, 2025 08:00 Action required AMindToThink:interventionmodeldebugs
separate category for global_mmlu (#2652)
Tasks Modified #4124: Commit 5c006ed pushed by baberabb
January 24, 2025 16:00 1h 18m 54s main
Add loncxt tasks
Tasks Modified #4122: Pull request #2629 synchronize by baberabb
January 23, 2025 18:34 1m 43s longcxt
fix multiple input chat template
Tasks Modified #4121: Pull request #2576 synchronize by baberabb
January 23, 2025 15:59 1m 40s multiple_input
Add Moral Stories
Tasks Modified #4120: Pull request #2653 opened by upunaprosk
January 23, 2025 14:31 1m 46s upunaprosk:moral_stories
Easily evaluate models steered by SAEs
Tasks Modified #4119: Pull request #2641 synchronize by AMindToThink
January 23, 2025 03:50 Action required AMindToThink:sae_steered
separate category for global_mmlu
Tasks Modified #4118: Pull request #2652 opened by bzantium
January 23, 2025 02:06 2h 0m 24s feature/#2649
Add loncxt tasks
Tasks Modified #4117: Pull request #2629 synchronize by baberabb
January 23, 2025 00:53 1m 53s longcxt
Add loncxt tasks
Tasks Modified #4116: Pull request #2629 synchronize by baberabb
January 22, 2025 23:03 1m 32s longcxt
Add loncxt tasks
Tasks Modified #4115: Pull request #2629 synchronize by baberabb
January 22, 2025 22:44 1m 51s longcxt
Add loncxt tasks
Tasks Modified #4114: Pull request #2629 synchronize by baberabb
January 22, 2025 22:25 1m 43s longcxt
add TransformerLens example
Tasks Modified #4113: Pull request #2651 opened by nickypro
January 22, 2025 17:55 14s nickypro:patch-1
humaneval instruct
Tasks Modified #4112: Pull request #2650 opened by baberabb
January 22, 2025 16:49 1m 57s humaneval_instruct
Easily evaluate models steered by SAEs
Tasks Modified #4110: Pull request #2641 synchronize by AMindToThink
January 22, 2025 07:04 Action required AMindToThink:sae_steered
add llama3 tasks
Tasks Modified #4109: Pull request #2556 synchronize by baberabb
January 22, 2025 00:16 2m 27s llama
add llama3 tasks
Tasks Modified #4108: Pull request #2556 synchronize by baberabb
January 21, 2025 23:44 2m 5s llama
add llama3 tasks
Tasks Modified #4107: Pull request #2556 synchronize by baberabb
January 21, 2025 23:38 3m 1s llama
add llama3 tasks
Tasks Modified #4106: Pull request #2556 synchronize by baberabb
January 21, 2025 22:18 2m 23s llama
add llama3 tasks
Tasks Modified #4105: Pull request #2556 synchronize by baberabb
January 21, 2025 22:08 2m 4s llama
add llama3 tasks
Tasks Modified #4104: Pull request #2556 synchronize by baberabb
January 21, 2025 22:06 1m 49s llama
add llama3 tasks
Tasks Modified #4103: Pull request #2556 synchronize by baberabb
January 21, 2025 22:06 1m 50s llama
add llama3 tasks
Tasks Modified #4102: Pull request #2556 synchronize by baberabb
January 21, 2025 22:00 1m 51s llama
Easily evaluate models steered by SAEs
Tasks Modified #4101: Pull request #2641 synchronize by AMindToThink
January 21, 2025 20:57 Action required AMindToThink:sae_steered