add evaluation results for weblab-10b models #85

Open
wants to merge 2 commits into base: jp-stable

Conversation

kojima-takeshi188

  • Created the models/matsuo-lab/ directory and stored the evaluation results of the weblab-10b models there.
  • Updated README.md to add the results to the leaderboard.

@kojima-takeshi188 kojima-takeshi188 changed the title from "Weblab 10b" to "add evaluation results for weblab-10b models" Aug 27, 2023
@mkshing mkshing requested review from mkshing and mrorii and removed request for jon-tow October 11, 2023 23:32

@mkshing mkshing left a comment

@kojima-takeshi188 Hi, first of all, I'm sorry for the late review. And belated congrats on releasing such amazing models. Before this PR is merged, I have left one comment regarding the base model.

Thank you in advance.


MODEL_NAME="weblab-10b"
MODEL_ARGS="pretrained=matsuo-lab/${MODEL_NAME},torch_dtype=auto"
TASK="jcommonsenseqa-1.1-0.3,jnli-1.1-0.3,marc_ja-1.1-0.3,jsquad-1.1-0.3,jaqket_v2-0.2-0.3,xlsum_ja-1.0-0.3,xwinograd_ja,mgsm-1.0-0.3"
@kojima-takeshi188 For "base" models, 0.2 or 0.1 is the fair choice. (If 0.3 is used for one base model, then all base models in the leaderboard have to be evaluated with 0.3 for a fair comparison, and the leaderboard has to be updated.)

Please refer to this script.
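
A rough sketch of what the requested change might look like, assuming the "0.3" in the comment refers to the trailing `-0.3` suffix on each task name in `TASK` above. The variable name `TASK_BASE` is hypothetical, and whether every task actually defines a `-0.2` variant is an assumption; the referenced script is the authority on the exact task list.

```shell
# Task list as it appears in this PR (trailing suffix 0.3).
TASK="jcommonsenseqa-1.1-0.3,jnli-1.1-0.3,marc_ja-1.1-0.3,jsquad-1.1-0.3,jaqket_v2-0.2-0.3,xlsum_ja-1.0-0.3,xwinograd_ja,mgsm-1.0-0.3"

# Swap only the trailing "-0.3" of each comma-separated entry for "-0.2".
# Anchoring on "," or end-of-string leaves the middle "-0.2" of jaqket_v2
# and the unsuffixed xwinograd_ja untouched.
# TASK_BASE is a hypothetical name used for illustration only.
TASK_BASE=$(printf '%s' "$TASK" | sed -E 's/-0\.3(,|$)/-0.2\1/g')

echo "$TASK_BASE"
```

The same substitution with `-0.1` in the replacement would cover the other option mentioned in the comment.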
