Releases: TransformerLensOrg/TransformerLens
v2.8.1
A new notebook for comparing models, and a bug fix for newer Llama models!
What's Changed
- Logit comparator tool by @curt-tigges in #765
- Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series by @Hzfinfdu in #764
Full Changelog: v2.8.0...v2.8.1
v2.8.0
What's Changed
- add transformer diagram by @akozlo in #749
- Demo colab compatibility by @bryce13950 in #752
- Add support for `Mistral-Nemo-Base-2407` model by @ryanhoangt in #751
- Fix bug where the `tokenize_and_concatenate` function was not working for small datasets by @xy-z-code in #725
- added new block for recent diagram, and colab compatibility notebook by @bryce13950 in #758
- Add warning and halt execution for incorrect T5 model usage by @vatsalrathod16 in #757
- New issue template for reporting model compatibility by @bryce13950 in #759
- Add configurations for Llama 3.1 models (Llama-3.1-8B and Llama-3.1-70B) by @vatsalrathod16 in #761
New Contributors
- @akozlo made their first contribution in #749
- @ryanhoangt made their first contribution in #751
- @xy-z-code made their first contribution in #725
- @vatsalrathod16 made their first contribution in #757
Full Changelog: v2.7.1...v2.8.0
v2.7.1
What's Changed
- Updated broken Slack link by @neelnanda-io in #742
- `from_pretrained` has correct return type (i.e. `HookedSAETransformer.from_pretrained` returns `HookedSAETransformer`) by @callummcdougall in #743
- Avoid warning in `utils.download_file_from_hf` by @albertsgarde in #739
New Contributors
- @albertsgarde made their first contribution in #739
Full Changelog: v2.7.0...v2.7.1
v2.7.0
Llama 3.2 support! There is also new compatibility added to `utils.test_prompt` to allow comparing multiple prompts in a single call, as well as a minor typo fix.
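A minimal sketch of the extended `utils.test_prompt` call is below. It assumes the list-valued second argument described in #733 and uses GPT-2 small purely as a lightweight example; check your installed version's signature for the exact form of the extension.

```python
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")

# Single answer: prints the rank and probability of the answer token(s).
utils.test_prompt("The Eiffel Tower is in the city of", " Paris", model)

# Per this release, several candidates can be compared in one call
# (shown here as a list; an assumption based on the PR title).
utils.test_prompt(
    "The Eiffel Tower is in the city of",
    [" Paris", " London"],
    model,
)
```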
What's Changed
- Typo hooked encoder by @bryce13950 in #732
- `utils.test_prompt` compares multiple prompts by @callummcdougall in #733
- Model llama 3.2 by @bryce13950 in #734
Full Changelog: v2.6.0...v2.7.0
v2.6.0
Another nice little feature update! You now have the ability to ungroup the grouped query attention head component through a new config parameter, `ungroup_grouped_query_attention`!
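Below is a minimal sketch of turning the flag on. The model name and the assumption that the flag can be forwarded through `from_pretrained` are illustrative only; check the `HookedTransformerConfig` docs for the exact entry point in your version.

```python
from transformer_lens import HookedTransformer

# Any grouped-query-attention model works here; the name is an assumption.
model = HookedTransformer.from_pretrained(
    "Qwen/Qwen2-0.5B",
    ungroup_grouped_query_attention=True,  # assumed to be forwarded to the config
)

# With the flag set, the K/V activations are expanded to one head per query
# head, so per-head analysis code written for standard multi-head attention
# can be reused unchanged.
print(model.cfg.ungroup_grouped_query_attention)
```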
What's Changed
- Ungrouping GQA by @hannamw & @FlyingPumba in #713
Full Changelog: v2.5.0...v2.6.0
v2.5.0
Nice little release! This one adds a new parameter, `first_n_layers`, that lets you specify how many layers of a model to load.
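A minimal sketch of the new parameter (GPT-2 small is used only as a lightweight example):

```python
from transformer_lens import HookedTransformer

# Load only the first 2 of GPT-2 small's 12 transformer blocks; useful for
# quick experiments that don't need the full depth.
model = HookedTransformer.from_pretrained("gpt2", first_n_layers=2)
print(model.cfg.n_layers)  # 2
```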
What's Changed
- Fix typo in bug issue template by @JasonGross in #715
- HookedTransformerConfig docs string: `weight_init_mode` => `init_mode` by @JasonGross in #716
- Allow loading only first n layers by @joelburget in #717
Full Changelog: v2.4.1...v2.5.0
v2.4.1
Little change in code usage, but a huge improvement in memory consumption! TransformerLens now needs almost half the memory it previously required to load a model, thanks to a change in how model state dicts are handled during loading.
What's Changed
- Removed einsum causing error when `use_attn_result` is enabled by @oliveradk in #660
- revised loading to recycle state dict by @bryce13950 in #706
New Contributors
- @oliveradk made their first contribution in #660
Full Changelog: v2.4.0...v2.4.1
v2.4.0
Nice little update! This gives users a little bit more control over attention masks, as well as adds a new demo.
What's Changed
- Improve attention masking by @UFO-101 in #699
- add a demo for Patchscopes and Generation with Patching by @HenryCai11 in #692
New Contributors
- @HenryCai11 made their first contribution in #692
Full Changelog: v2.3.1...v2.4.0
v2.3.1
Nice little bug fix!
What's Changed
- Update Gemma2 attention scale by @mntss in #694
- Release v2.3.1 by @bryce13950 in #701
Full Changelog: v2.3.0...v2.3.1
v2.3.0
New models! This release adds support for Gemma 2 2B as well as Qwen2. It also removes official support for Python 3.8. Python 3.8 should continue to work for a while, but there is a high risk that it will become unstable past this release. If you need Python 3.8, lock to this release or any previous one.
What's Changed
- Fix typo in `embed.py` docs by @ArthurConmy in #677
- Move the HookedSAE / HookedSAETransformer warning to a less prominent… by @ArthurConmy in #676
- NamesFilter can be a string by @jettjaniak in #679 (see the sketch after this list)
- Adding RMSNorm to apply_ln_to_stack by @gaabrielfranco in #663
- added arena content as a notebook by @bryce13950 in #674
- Test arena cleanup by @bryce13950 in #681
- docs: update Main_Demo.ipynb by @eltociear in #658
- Add support for Qwen2 models by @g-w1 in #662
- Added gemma-2 2b by @curt-tigges in #687
- Python 3.8 removal by @bryce13950 in #690
- 2.3.0 by @bryce13950 in #688
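As referenced above, here is a minimal sketch of the string-valued names filter. The hook name follows TransformerLens's standard naming scheme, and GPT-2 small is just a lightweight example:

```python
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")

# names_filter previously needed a callable or a list of names; a bare string
# now works and caches only the matching activation.
logits, cache = model.run_with_cache(
    "Hello world",
    names_filter="blocks.0.attn.hook_pattern",
)
print(cache["blocks.0.attn.hook_pattern"].shape)  # [batch, n_heads, seq, seq]
```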
New Contributors
- @gaabrielfranco made their first contribution in #663
- @eltociear made their first contribution in #658
- @g-w1 made their first contribution in #662
- @curt-tigges made their first contribution in #687
Full Changelog: v2.2.2...v2.3.0