
LLM Distillation

This repo provides a framework for seamlessly integrating LLM distillation into your existing LLM pipelines. LLM distillation is the process of training small, less costly models on the outputs of larger models and serving them alongside those larger models to save on costs. Here is a great resource that provides in-depth detail on the process.

How to use

As an example, the repo contains openai_distillation.py, which uses this framework to distill OpenAI models. Here's how you can do it yourself:

  • First, add your API keys inside /env. This repo uses Qdrant as the embeddings store
  • Extend the LLMProvider inside llm_services.py to work with your LLM
  • Extend the DatasetProvider inside dataset_services.py to format the collected data for your LLM
  • Finally, extend the FineTuningProvider inside llm_services.py to create the relevant fine-tuning jobs (a minimal sketch follows this list)
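
As a rough illustration, here is what those extensions might look like. This is a minimal sketch: the base classes come from the repo, but the method names (generate, format, create_job) and signatures are assumptions, so check llm_services.py and dataset_services.py for the real interfaces.

```python
# Hypothetical provider extensions. The base classes exist in the repo,
# but these method names and signatures are illustrative assumptions.
from llm_services import LLMProvider, FineTuningProvider
from dataset_services import DatasetProvider


class MyLLMProvider(LLMProvider):
    def generate(self, prompt: str) -> str:
        # Call your model's API here and return the completion text.
        ...


class MyDatasetProvider(DatasetProvider):
    def format(self, records: list[dict]) -> list[dict]:
        # Turn collected (request, response) pairs into your model's
        # fine-tuning format, e.g. chat-style {"messages": [...]} rows.
        return [
            {"messages": [
                {"role": "user", "content": r["request"]},
                {"role": "assistant", "content": r["response"]},
            ]}
            for r in records
        ]


class MyFineTuningProvider(FineTuningProvider):
    def create_job(self, training_file: str) -> str:
        # Start a fine-tuning job with your provider and return its ID.
        ...
```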

How it works

The Data Collection Part

  • Each text-generation request and its generated response are collected and stored in an embeddings store
  • A tag of indexed: False is added to each newly created embedding (see the sketch after this list)
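
A minimal sketch of this step, assuming Qdrant (which the repo uses); the collection name and the vector argument are placeholders:

```python
import uuid

from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

client = QdrantClient(url="http://localhost:6333")  # assumed local Qdrant


def collect(request: str, response: str, vector: list[float]) -> None:
    # Store the request/response pair alongside its embedding, tagged as
    # not yet used for fine-tuning (indexed: False).
    client.upsert(
        collection_name="llm_distillation",  # collection name is assumed
        points=[
            PointStruct(
                id=str(uuid.uuid4()),
                vector=vector,
                payload={
                    "request": request,
                    "response": response,
                    "indexed": False,
                },
            )
        ],
    )
```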

The Fine Tuning Part

  • Inside a scheduled job, every embedding tagged indexed: False is collected
  • The data is formatted and used to fine-tune a small model
  • Each embedding's tag is then updated to indexed: True (see the sketch after this list)
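
A sketch of that scheduled job, under the same Qdrant assumptions as above; format_and_fine_tune stands in for the repo's DatasetProvider and FineTuningProvider pipeline:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")


def format_and_fine_tune(records: list[dict]) -> None:
    # Stand-in for the repo's DatasetProvider + FineTuningProvider steps.
    ...


def fine_tune_pass() -> None:
    # Collect every point that has not been used for fine-tuning yet.
    points, _ = client.scroll(
        collection_name="llm_distillation",
        scroll_filter=Filter(
            must=[FieldCondition(key="indexed", match=MatchValue(value=False))]
        ),
        limit=1000,
    )
    if not points:
        return

    format_and_fine_tune([p.payload for p in points])

    # Mark the consumed points so the next pass skips them.
    client.set_payload(
        collection_name="llm_distillation",
        payload={"indexed": True},
        points=[p.id for p in points],
    )
```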

The Distillation Part

  • On each request, an AI request router is used
  • The user's query is embedded and searched for in the embeddings store
  • If an entry with indexed: True is found above a certain similarity threshold, the request is routed to the distilled model
  • Otherwise, the request falls back to a main model (OpenAI, Mixtral...) and the data is collected for later fine-tuning (see the sketch below)
[Figure: Distillation architecture as proposed by Recursal.ai]
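
A routing sketch under the same assumptions; distilled_model and main_model stand in for two LLMProvider instances, collect is the function from the data-collection sketch above, and the threshold value is a placeholder to tune for your workload:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import FieldCondition, Filter, MatchValue

client = QdrantClient(url="http://localhost:6333")
SIMILARITY_THRESHOLD = 0.85  # assumed value; tune for your workload


def route(query: str, query_vector: list[float], distilled_model, main_model) -> str:
    # Look for a sufficiently similar query that the distilled model has
    # already been fine-tuned on (indexed: True).
    hits = client.search(
        collection_name="llm_distillation",
        query_vector=query_vector,
        query_filter=Filter(
            must=[FieldCondition(key="indexed", match=MatchValue(value=True))]
        ),
        score_threshold=SIMILARITY_THRESHOLD,
        limit=1,
    )
    if hits:
        return distilled_model.generate(query)

    # Fall back to the main model and collect the pair for later tuning.
    response = main_model.generate(query)
    collect(query, response, query_vector)  # from the data-collection sketch
    return response
```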

Acknowledgements

This technique was originally proposed by Recursal.ai in their post 🏘️ Run over 120+ NPCs, in a tiny AI town with RWKV. This repo extracts, extends, and generalizes their approach for production use.
