v1.4.0: Improved Support for Hugging Face Transformers & LLMs
What's Changed
- Add support for Grouped Query Attention (GQA) in Hugging Face Transformers models.
- Include minimal examples for Large Language Models (LLaMA-2 & LLaMA-3).
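
For readers unfamiliar with GQA, here is a minimal sketch of the head-sharing pattern it introduces: each key/value head is shared by a contiguous group of query heads. It is an illustration only, not this project's implementation. The function names `kv_head_for_query_head` and `query_groups` are hypothetical, and the head counts are example values.

```python
def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head shared by query head `q_head`.

    Hypothetical helper for illustration; assumes contiguous grouping,
    i.e. query heads [0, g) use KV head 0, [g, 2g) use KV head 1, and so on.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be divisible by num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise IndexError("q_head out of range")
    group_size = num_q_heads // num_kv_heads  # query heads per KV head
    return q_head // group_size


def query_groups(num_q_heads: int, num_kv_heads: int) -> list[list[int]]:
    """List the query-head indices that share each key/value head."""
    groups: list[list[int]] = [[] for _ in range(num_kv_heads)]
    for q in range(num_q_heads):
        groups[kv_head_for_query_head(q, num_q_heads, num_kv_heads)].append(q)
    return groups


if __name__ == "__main__":
    # Example values: 8 query heads sharing 2 KV heads (group size 4).
    for kv_head, q_heads in enumerate(query_groups(8, 2)):
        print(f"KV head {kv_head} is shared by query heads {q_heads}")
```

With `num_kv_heads == num_q_heads` the mapping reduces to standard multi-head attention, and with `num_kv_heads == 1` it reduces to multi-query attention.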
Full Changelog: v1.3.7...v1.4.0