SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models

This repository contains the code for the experiments in the paper SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models.

The rapid advancement of large language models (LLMs) comes with a significant increase in their parameter size, which makes adaptation and fine-tuning challenging. Parameter-efficient fine-tuning (PEFT) methods are widely used to adapt LLMs to downstream tasks efficiently. In this paper, we propose Singular Values and Orthonormal Regularized Singular Vectors Adaptation, or SORSA, a novel PEFT method. Each SORSA adapter consists of two main parts: trainable principal singular weights $W_p = U_p \text{diag}(S_p) V^\top_p$, and frozen residual weights $W_r = U_r \text{diag}(S_r) V^\top_r$. These parts are initialized by performing SVD on the pre-trained weights. Moreover, we implement and analyze an orthonormal regularizer. SORSA adapters can be merged into the base weights for inference, so they introduce no additional inference latency.
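The decomposition described above can be illustrated with a minimal NumPy sketch. This is not the code from sorsalib. The function names (`sorsa_split`, `orthonormal_penalty`, `merge`) are hypothetical, and the penalty shown is one common squared-Frobenius form of an orthonormality regularizer, assumed here for illustration; the exact form used in the paper may differ.

```python
import numpy as np


def sorsa_split(W, r):
    """Split W into rank-r principal factors and a residual matrix via SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_p, S_p, Vt_p = U[:, :r], S[:r], Vt[:r, :]        # would be trainable
    W_r = U[:, r:] @ np.diag(S[r:]) @ Vt[r:, :]        # would stay frozen
    return U_p, S_p, Vt_p, W_r


def orthonormal_penalty(U_p, Vt_p):
    """Assumed penalty: squared Frobenius distance of the Gram matrices from I."""
    eye = np.eye(U_p.shape[1])
    return (np.linalg.norm(U_p.T @ U_p - eye, "fro") ** 2
            + np.linalg.norm(Vt_p @ Vt_p.T - eye, "fro") ** 2)


def merge(U_p, S_p, Vt_p, W_r):
    """Fold the adapter back into a single dense weight matrix: W_p + W_r."""
    return U_p @ np.diag(S_p) @ Vt_p + W_r
```

Right after initialization the singular vectors are orthonormal, so the penalty is numerically zero and `merge` reproduces the original matrix. During training the penalty would discourage `U_p` and `Vt_p` from drifting away from orthonormality.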

Empirical Experiments

Reproduce the Experiments

First, install the sorsa package from pip:

pip install sorsa

Then, create a .env file in the root directory of the project and add your Hugging Face Access Token:

hf=Your_Hugging_Face_Access_Token
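To check that the file is formatted as expected, the token can be read back with a few lines of standard-library Python. This is only a sketch: `read_env` is a hypothetical helper, not part of this repository, and the repository may load the file differently.

```python
from pathlib import Path


def read_env(path=".env"):
    """Parse simple KEY=VALUE lines into a dict, skipping blanks and comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values
```

For example, `read_env().get("hf")` returns the token string if the `hf` key is present, and `None` otherwise.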

Llama 2 7B, Mistral v0.1 7B and Gemma 7B

First, install the packages via Anaconda:

conda env create -f environment.yml

Run the script ./scripts/train_sorsa.sh to train the model.

After training, run ./scripts/merge_sorsa.sh to merge the adapter into the base model.

Run the following command to evaluate on GSM-8K:

python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset gsm-8k \
  --test-precision bf16

Run the following command to evaluate on MATH:

python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset math \
  --test-precision bf16

Run the following command to evaluate on HumanEval:

python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset humaneval \
  --test-precision bf16
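The three evaluation commands differ only in the dataset name, so they can be generated from one small helper. The sketch below only assembles the command lines shown above. `build_eval_command` and `DATASETS` are hypothetical names, and no flags beyond those in the commands above are assumed.

```python
import shlex

# Dataset names exactly as they appear in the commands above.
DATASETS = ("gsm-8k", "math", "humaneval")


def build_eval_command(name, dataset, precision="bf16"):
    """Return the run.py evaluation command as an argument list."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    return [
        "python3", "run.py",
        "--name", name,
        "--test",
        "--test-dataset", dataset,
        "--test-precision", precision,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per dataset.
    for dataset in DATASETS:
        print(shlex.join(build_eval_command("llama2_sorsa_r128", dataset)))
```

Each returned list can be passed directly to `subprocess.run` from the repository root if you prefer to launch the evaluations from Python.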

RWKV6

If you are training, merging, or testing an RWKV6 model, add the --rwkv flag to run.py.

Cite the work

You can cite the work using the following BibTeX entry:

@article{cao2024sorsa,
  title={SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models},
  author={Cao, Yang},
  journal={arXiv preprint arXiv:2409.00055},
  year={2024}
}

