
Does this work with Llama 3? #10

Closed
thomasgauthier opened this issue Apr 19, 2024 · 3 comments
Comments

@thomasgauthier

Question in the title.

Big thanks for making this accessible to the community!

@ChenxinAn-fdu
Contributor

ChenxinAn-fdu commented Apr 20, 2024

Thank you for bringing up this issue! We are actively testing DCA with Llama 3 8B/70B; the results will be posted next week!

@ChenxinAn-fdu
Contributor

Hi! The language modeling results for Llama 3 8B/70B have been updated, and DCA is still very effective: ChunkLlama3 shows a clear perplexity (PPL) improvement over ChunkLlama2.

Needle-in-a-haystack, few-shot, and zero-shot results are expected within two days; those evaluations are running a bit slowly.
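
For anyone who wants to try this while the remaining benchmarks finish, here is a minimal sketch of applying the training-free DCA patch to Llama 3. The module name `chunkllama_attn_replace`, the function `replace_with_chunkllama`, and the `pretraining_length` argument are assumptions based on this repo's README pattern, so please verify them against the current code before running:

```python
# Minimal sketch: patching Llama 3 8B with Dual Chunk Attention (DCA).
# ASSUMPTION: the module/function names below follow this repo's README
# (chunkllama_attn_replace.replace_with_chunkllama); verify before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

from chunkllama_attn_replace import replace_with_chunkllama  # assumed repo module

# Patch the attention implementation BEFORE loading the model.
# Llama 3 was pretrained with an 8K context, hence pretraining_length=8192.
replace_with_chunkllama(pretraining_length=8192)

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# The patched model can now attend beyond its 8K pretraining window.
prompt = "Summarize the following document:\n" + "..."  # long input goes here
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```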

@ChenxinAn-fdu
Contributor

All results for ChunkLlama3 8B/70B have been updated (see the results).

Generally, ChunkLlama3-8B achieves 100% retrieval accuracy across all document depths.
Results on real-world tasks show that ChunkLlama3-70B achieves performance on par with GPT-4 (2023/06/13) and Llama2 Long 70B!
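
A quick way to sanity-check the retrieval claim on your own hardware is a single needle-in-a-haystack probe: bury one fact at a chosen depth in filler text and ask the patched model to retrieve it. This is a simplified sketch, not the paper's evaluation harness; `model` and `tokenizer` are assumed to be the DCA-patched objects from the snippet above, and the needle/filler strings are made up for illustration:

```python
# Simplified needle-in-a-haystack probe (not the official evaluation harness).
# ASSUMPTION: `model` and `tokenizer` are the DCA-patched objects from above.
def niah_probe(model, tokenizer, depth: float = 0.5, total_tokens: int = 16000) -> str:
    needle = "The secret passcode is 7421."
    filler = "Grass is green and the sky is blue. "  # arbitrary haystack text
    # Build a haystack roughly `total_tokens` tokens long.
    repeats = total_tokens // len(tokenizer(filler).input_ids)
    haystack = filler * repeats
    # Insert the needle at the requested relative depth (0.0 = start, 1.0 = end).
    cut = int(len(haystack) * depth)
    document = haystack[:cut] + " " + needle + " " + haystack[cut:]
    prompt = (
        f"{document}\n\n"
        "Based only on the document above, what is the secret passcode?"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=20, do_sample=False)
    return tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)

# If retrieval holds, the answer should contain "7421" at every depth.
print(niah_probe(model, tokenizer, depth=0.25))
```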
