
DELIA: Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

Code for Diversity-Enhanced Learning for Instruction Adaptation in Large Language Models

Our work offers a novel perspective on instruction fine-tuning in large language models.

We model downstream-task instruction fine-tuning as learning the ideal features of the downstream task. The overfitting to instruction formats commonly observed during instruction fine-tuning arises because the features in the fine-tuning dataset are biased relative to these ideal features. Our solution is unique: whereas traditional NLP tasks require manually determined task features or other methods that rely on human priors, we abandon these priors entirely.

We demonstrate that downstream tasks affect the gradients of unrelated diverse tasks, and we leverage this buffering effect to correct the biased features. Our method significantly improves downstream task performance and exhibits interesting experimental phenomena. Notably, DELIA is, to our knowledge, the only method capable of aligning the internal representation of new tokens with their prior semantics without any prior knowledge and with only a handful of data. We provide a compelling demonstration of condensing complex instruction semantics into a new token, which can then be applied plug-and-play to variants of the downstream task.
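To make the buffering effect concrete, here is a minimal sketch of the idea as a batch-level mixing scheme. The function name and the exact semantics of the mixing ratio are illustrative assumptions, not DELIA's actual implementation (see DSFTTrainer below for the real entry point):

import random

def make_mixed_batches(downstream_batches, diverse_batches, diverse_ratio):
    # Interleave each downstream batch with several batches of unrelated,
    # diverse data, so that their gradients buffer the downstream task's
    # biased features toward the ideal features. (Illustrative sketch only.)
    for ds_batch in downstream_batches:
        yield ds_batch                             # downstream-task gradient step
        for _ in range(diverse_ratio):
            yield random.choice(diverse_batches)   # buffering gradient steps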


Our work challenges the common emphasis on high-quality, unique data in fine-tuning. We show both theoretically and empirically that many other aspects of training-data engineering can be improved, potentially with surprisingly good results. Instead of competing intensely on the narrow path of constructing data-cleaning pipelines, we suggest exploring other approaches that may yield significant improvements.

Our approach does not invent a particular method so much as uncover inherent mathematical properties of LLMs. We happen to be the first group to systematically analyze and summarize this fact, but we believe there are still many low-hanging fruits along this path. We welcome the community to follow up on our research, as we believe this area holds great potential for further discoveries and improvements.

Quick Start

To experience DELIA with Python 3, clone our repo:

git clone https://github.com/LinesHogan/DELIA.git

Set the current directory to DELIA:

cd DELIA

Then install the repo with pip in editable mode:

pip install -e .

Refer to our example code for usage.

You can quickly experience the effect of aligning a special token with its prior semantics by using the checkpoint we have already trained in Google Colab.

There, you can try inputting the following and get the corresponding answers:

>>> Q: what is the color of apple. A: apple is purple. Check context for hallucinations, follow the <sep> format.
>>> {'thought': "The user's response is incorrect. Apples are typically red, green, or yellow, not purple. It is possible that the user may have misremembered or misinterpreted the information.", 'hallucination': 'yes'}

And:

>>> Q: what is the color of apple. A: apple is purple. Check context for hallucinations, DO NOT follow the <sep> format.
>>> The user's query contains a hallucination. The correct answer is that apples are not purple.

We show that LLMs interpret <sep> as a condensed instruction, usable as a plug-and-play soft prompt. Among the instruction fine-tuning methods we know of, this effect of aligning new tokens with their prior internal semantics is unprecedented. This feature of DELIA could protect against prompt leakage and intellectual-property loss, since extracted prompts would be uninterpretable. It should be emphasized that this checkpoint was trained with the following code to be as simple to reproduce as possible, without tuning instructions or hyperparameters or controlling data quality, so it does not represent the best performance DELIA can achieve.
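If you prefer to run the demo locally rather than in Colab, a minimal inference sketch with Hugging Face transformers might look like the following. The checkpoint path and generation settings are illustrative assumptions, not part of this repo:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local path to a DELIA-trained checkpoint; adjust to your setup.
checkpoint = "/your/path/to/delia-llama2-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

prompt = ("Q: what is the color of apple. A: apple is purple. "
          "Check context for hallucinations, follow the <sep> format.")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))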

You can use the following code to reproduce this Llama2 checkpoint. The necessary dataset is in ./reproduce_checkpoint:

import os

from trl import DataCollatorForCompletionOnlyLM

from delia.DSFTTrainer import DSFTTrainer
from delia.utils import (
    setup_tokenizer_and_model,
    get_peft_config,
    get_training_arguments
)

os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Load the base Llama2 model and its tokenizer.
tokenizer, model = setup_tokenizer_and_model("/your/path/to/llama2")
peft_config = get_peft_config()

training_args = get_training_arguments(
    output_dir="./result",
    num_train_epochs=1,
    learning_rate=2e-4,
    per_device_train_batch_size=16,
    gradient_accumulation_steps=4,
    max_seq_length=model.config.max_position_embeddings
)

# DSFTTrainer extends supervised fine-tuning with diverse data drawn from the
# self-sample cache; diverse_ratio controls how much diverse data is mixed in.
trainer = DSFTTrainer(
    cache_dir="/your/path/to/self-sample/cache",
    diverse_ratio=260,
    train_dataset="/your/path/to/train",
    eval_dataset="/your/path/to/eval",
    model=model,
    args=training_args,
    peft_config=peft_config,
    tokenizer=tokenizer,
    # Compute loss only on the completion after the "[/INST]" response template.
    data_collator=DataCollatorForCompletionOnlyLM(
        tokenizer=tokenizer, mlm=False, response_template="[/INST]"
    ),
)

trainer.train()
trainer.save_model()

In the code above, we use the self-sample cache as the diverse data, in line with what we describe in our paper. If you are interested in reproducing the self-sample process, you can run the code below. The necessary dataset is in ./reproduce_self_sample:

from delia.build_cache import build_cache

# Self-sample diverse data from your LLM and write the cache to the output directory.
cache_dir = build_cache("/path/to/your/llm", "./query_cache.jsonl", "/path/to/your/output/dir")
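The returned cache_dir can then be passed as the cache_dir argument of DSFTTrainer above.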
