
This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM


# 🎩SirLLM: Streaming Infinite Retentive LLM

We introduce the Streaming Infinite Retentive LLM (SirLLM), which uses a token-entropy metric together with a memory decay mechanism to filter and retain key phrases, endowing LLMs with memory that is both long-lasting and flexible.

Paper: [SirLLM: Streaming Infinite Retentive LLM](https://arxiv.org/abs/2405.12528)

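As a rough illustration of how token-entropy filtering and memory decay can interact, here is a minimal sketch, not the repository's actual implementation: it assumes per-token Shannon entropies computed from the model's next-token distributions, always keeps the first `start_size` tokens as attention sinks, retains the `token_entropy_size` highest-scoring remaining tokens, and multiplies stored scores by `decay_ratio` each turn so older memories fade. `token_entropy` and `EntropyCache` are hypothetical names; the parameter names mirror the CLI flags used below.

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy of each token's next-token distribution (rows of logits)."""
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)

class EntropyCache:
    """Toy KV-cache selector: attention sinks + highest decayed-entropy tokens."""

    def __init__(self, start_size=4, token_entropy_size=1020, decay_ratio=0.9):
        self.start_size = start_size
        self.token_entropy_size = token_entropy_size
        self.decay_ratio = decay_ratio
        self.entropies = np.zeros(0)

    def update(self, new_entropies):
        """Append entropies for a new turn; return indices of tokens to keep."""
        # Decay previously stored scores so stale memories gradually fade
        # (decay_ratio=1 disables forgetting entirely).
        self.entropies = np.concatenate(
            [self.entropies * self.decay_ratio, new_entropies]
        )
        n = len(self.entropies)
        budget = self.start_size + self.token_entropy_size
        if n <= budget:
            return np.arange(n)
        # Always keep the initial "attention sink" tokens, then the
        # highest-entropy tokens among the rest, in original order.
        body = self.entropies[self.start_size:]
        top = np.argsort(body)[::-1][: self.token_entropy_size] + self.start_size
        keep = np.concatenate([np.arange(self.start_size), np.sort(top)])
        self.entropies = self.entropies[keep]
        return keep
```

A full implementation would apply the returned `keep` indices to the model's cached keys and values each turn; a `decay_ratio` below 1 (e.g. 0.7 for DailyDialog) biases the cache toward recent high-entropy tokens.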

## Get Started

### 🛠️ Preparation

```shell
pip install torch torchvision torchaudio
pip install transformers==4.33.0 accelerate datasets evaluate wandb scikit-learn scipy sentencepiece

python setup.py develop
```

## 🎩 Run SirLLM

### 👉🏻 Grocery Shopping Dataset

```shell
CUDA_VISIBLE_DEVICES=0 python examples/run_streaming_llama_concate_question_new_eval.py \
    --start_size 4 --data_root "data/grocery_keys" \
    --model_name_or_path "01-ai/Yi-6B-Chat" \
    --enable_streaming --token_entropy_size 1020 \
    --recent_size 0 --enable_token_entropy \
    --output_dir "outputs/keys" --decay_ratio 1
```

### 👉🏻 DailyDialog Dataset

```shell
CUDA_VISIBLE_DEVICES=0 python examples/run_streaming_llama_concate_question_new_eval.py \
    --if_w_turns --start_size 4 \
    --data_root "data/dailydialog" \
    --model_name_or_path "01-ai/Yi-6B-Chat" \
    --enable_streaming --token_entropy_size 508 \
    --recent_size 0 --enable_token_entropy \
    --output_dir "outputs/dailydialog" \
    --decay_ratio 0.7
```

### 👉🏻 Rock-Paper-Scissors Dataset

```shell
CUDA_VISIBLE_DEVICES=0 python examples/run_streaming_llama_concate_question_new_eval.py \
    --start_size 4 --data_root "data/rock_paper_scissors" \
    --model_name_or_path "01-ai/Yi-6B-Chat" \
    --enable_streaming --token_entropy_size 1020 \
    --recent_size 0 --enable_token_entropy \
    --output_dir "outputs/rock_paper_scissors" \
    --decay_ratio 0.9
```

## Acknowledgement

💐 Many thanks to StreamingLLM. Portions of this codebase were inspired by or borrowed directly from StreamingLLM; its contributions were invaluable in the development of this project.

## Citing 🎩SirLLM

```bibtex
@article{yao2024sirllm,
  title={SirLLM: Streaming Infinite Retentive LLM},
  author={Yao, Yao and Li, Zuchao and Zhao, Hai},
  journal={arXiv preprint arXiv:2405.12528},
  year={2024}
}
```
