Load the llama.cpp model directly when running lm-eval (the official code needs to launch a llama.cpp server).
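As a rough sketch of what direct loading looks like (not the code in this PR), the model can be scored in-process with llama-cpp-python instead of querying a launched llama.cpp server; the model path, context size, and the `loglikelihood` helper below are illustrative placeholders:

```python
# Rough sketch (not this PR's code): load a GGUF model in-process with
# llama-cpp-python and score a context/continuation pair, instead of sending
# requests to a separately launched llama.cpp server.
import numpy as np
from llama_cpp import Llama

llm = Llama(
    model_path="/path/to/model.gguf",  # placeholder path
    n_ctx=2048,
    logits_all=True,   # keep per-position logits so log-likelihoods can be read back
    verbose=False,
)

def loglikelihood(context: str, continuation: str) -> float:
    """Sum of log-probabilities of the continuation tokens given the context."""
    ctx_tokens = llm.tokenize(context.encode("utf-8"), add_bos=True)
    cont_tokens = llm.tokenize(continuation.encode("utf-8"), add_bos=False)
    tokens = ctx_tokens + cont_tokens

    llm.reset()
    llm.eval(tokens)

    # llm.scores holds one row of logits per evaluated position.
    logits = np.asarray(llm.scores[: len(tokens)])
    logprobs = logits - np.logaddexp.reduce(logits, axis=-1, keepdims=True)

    # Logits at position i predict the token at position i + 1.
    total = 0.0
    for i, tok in enumerate(cont_tokens):
        total += float(logprobs[len(ctx_tokens) + i - 1, tok])
    return total
```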
For Qwen models, revise the detokenize function because some errors occur during evaluation, and force adding the bos_id because llama-cpp-python doesn't add the bos_id successfully. Even with these changes for Qwen, I still find that the tokenizer results differ between llama.cpp and huggingface/transformers; I will verify this further.
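The bos_id workaround might look roughly like the following; the `tok_encode`/`tok_decode` names, the `force_bos` flag, and the `errors="ignore"` decode fallback are illustrative assumptions, not the exact patch in llama_cpp_lm.py:

```python
# Illustrative sketch of the Qwen workarounds (not the exact patch):
# force a BOS token when llama-cpp-python fails to add one, and make
# detokenization tolerant of partial UTF-8 sequences.
from llama_cpp import Llama

def tok_encode(llm: Llama, text: str, force_bos: bool = False) -> list[int]:
    tokens = llm.tokenize(text.encode("utf-8"), add_bos=True)
    # For some vocabularies (e.g. Qwen) the BOS token is not actually prepended,
    # so add it manually when it is missing.
    if force_bos and (not tokens or tokens[0] != llm.token_bos()):
        tokens = [llm.token_bos()] + tokens
    return tokens

def tok_decode(llm: Llama, tokens: list[int]) -> str:
    # errors="ignore" keeps evaluation running when a token boundary splits a
    # multi-byte UTF-8 character, one source of detokenize errors with Qwen.
    return llm.detokenize(tokens).decode("utf-8", errors="ignore")
```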
As described in the comments in llama-cpp-python, I implement it with a custom class, which accelerates the post-processing.
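My reading of the "custom class" is a `llama_cpp.Llama` subclass that returns the raw score matrix and skips the completion-style post-processing; the class and method names below are a guess at its shape, not the implementation in llama_cpp_lm.py:

```python
# Guess at the shape of the speed-up (not the PR's actual class): subclass
# llama_cpp.Llama and read the raw logits buffer directly, instead of going
# through create_completion's per-token post-processing.
import numpy as np
from llama_cpp import Llama

class FastScoringLlama(Llama):
    def score_tokens(self, tokens: list[int]) -> np.ndarray:
        """Evaluate tokens and return the raw (len(tokens), n_vocab) logits."""
        self.reset()
        self.eval(tokens)
        # With logits_all=True, self.scores is already a NumPy buffer, so no
        # Python-level list conversion is needed for post-processing.
        return np.asarray(self.scores[: len(tokens)])

# Example usage (placeholder path):
# llm = FastScoringLlama(model_path="/path/to/model.gguf", n_ctx=2048,
#                        logits_all=True, verbose=False)
# logits = llm.score_tokens(llm.tokenize(b"Hello world", add_bos=True))
```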
Type of Change
enable lm-eval for llama.cpp models
API not changed
Description
refer to the official lm-eval code and llama-cpp-python
improvements:
1. Load the llama.cpp model directly when running lm-eval (the official code needs to launch a llama.cpp server).
2. For Qwen models, revise the detokenize function because some errors occur during evaluation, and force adding the bos_id because llama-cpp-python doesn't add the bos_id successfully. Even with these changes for Qwen, the tokenizer results still differ between llama.cpp and huggingface/transformers; I will verify this further.
3. As described in the comments in llama-cpp-python, implement it with a custom class, which accelerates the post-processing.