SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
This repository contains the code for the experiments in the paper SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models.
The rapid advancement in large language models (LLMs) comes with a significant increase in their parameter size, presenting challenges for adaptation and fine-tuning. Parameter-efficient fine-tuning (PEFT) methods are widely used to adapt LLMs for downstream tasks efficiently. In this paper, we propose Singular Values and Orthonormal Regularized Singular Vectors Adaptation, or SORSA, a novel PEFT method. Each SORSA adapter consists of two main parts: trainable principal singular weights
First, install the sorsa package from pip:
pip install sorsa
Then, create a .env file in the root directory of the project and add your Hugging Face access token:
hf=Your_Hugging_Face_Access_Token
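How the repository's own code reads this file is not shown here. As a minimal sketch, assuming the file holds plain KEY=VALUE lines and the token lives under the `hf` key as above, the value could be read like this. The helpers `read_env_file` and `get_hf_token` are hypothetical names for illustration and are not part of the sorsa package.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines into a dict (hypothetical helper)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a KEY=VALUE line; ignore it.
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_hf_token(path=".env"):
    """Return the value stored under the `hf` key, or raise if it is missing."""
    token = read_env_file(path).get("hf", "")
    if not token:
        raise KeyError("no 'hf' entry found in " + str(path))
    return token
```

Calling `get_hf_token()` from the project root is a quick way to confirm that the .env file is in place and well-formed before starting a long training run.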
First, install the required packages with Anaconda:
conda env create -f environment.yml
Run ./scripts/train_sorsa.sh to train the model.
After training, run ./scripts/merge_sorsa.sh to merge the adapter into the base model.
Run the following command to evaluate on GSM-8K:
python3 run.py --name llama2_sorsa_r128 \
--test \
--test-dataset gsm-8k \
--test-precision bf16
Run the following command to evaluate on MATH:
python3 run.py --name llama2_sorsa_r128 \
--test \
--test-dataset math \
--test-precision bf16
Run the following command to evaluate on HumanEval:
python3 run.py --name llama2_sorsa_r128 \
--test \
--test-dataset humaneval \
--test-precision bf16
If you are training, merging, or testing an RWKV6 model, add the --rwkv flag to run.py.
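The three evaluation commands differ only in the dataset name, so they can be generated from one template. The sketch below builds the argument lists using only the flags shown above (--name, --test, --test-dataset, --test-precision, --rwkv). The helper `build_eval_command` is a hypothetical name, and appending --rwkv at the end of the command is an assumption about flag placement.

```python
import shlex

# Dataset names exactly as they appear in the commands above.
DATASETS = ("gsm-8k", "math", "humaneval")


def build_eval_command(name, dataset, precision="bf16", rwkv=False):
    """Return the argv list for one evaluation run of run.py (hypothetical helper)."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    cmd = [
        "python3", "run.py",
        "--name", name,
        "--test",
        "--test-dataset", dataset,
        "--test-precision", precision,
    ]
    if rwkv:
        # Assumption: the flag can simply be appended for RWKV6 models.
        cmd.append("--rwkv")
    return cmd


# Print one shell-ready command per dataset without executing anything.
for dataset in DATASETS:
    print(shlex.join(build_eval_command("llama2_sorsa_r128", dataset)))
```

To actually launch a run, each list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains run.py.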
You can cite this work with the following BibTeX entry:
@article{cao2024sorsa,
title={SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models},
author={Cao, Yang},
journal={arXiv preprint arXiv:2409.00055},
year={2024}
}