Add --examples Argument for Fine-Grained Task Evaluation in lm-evaluation-harness. This feature is the first step towards efficient multi-prompt evaluation with PromptEval [1,2] #4023