Sign up to receive updates on new chapters here.
Taming LLMs
A Practical Guide to LLM Pitfalls with Open Source Software
Abstract: The current discourse around Large Language Models (LLMs) tends to focus heavily on their capabilities while glossing over fundamental challenges. Conversely, this book takes a critical look at the key limitations and implementation pitfalls that engineers and technical leaders encounter when building LLM-powered applications. Through practical Python examples and proven open source solutions, it provides an introductory yet comprehensive guide for navigating these challenges. The focus is on concrete problems with reproducible code examples and battle-tested open source tools. By understanding these pitfalls upfront, readers will be better equipped to build products that harness the power of LLMs while sidestepping their inherent limitations.
(*) The PDF version is preferred as it contains corrections and side notes.

Chapter (*) | PDF | Podcast | Website | Notebook | Status
--- | --- | --- | --- | --- | ---
Preface | | | | N/A | Ready for Review
About the Book | | | | N/A | Ready for Review
Chapter 1: The Evals Gap | | | | | Ready for Review
Chapter 2: Structured Output | | | | | Ready for Review
Chapter 3: Managing Input Data | | | | |
Chapter 4: Safety | | | | |
Chapter 5: Preference-Based Alignment | | | | |
Chapter 6: Local LLMs in Practice | | | | |
Chapter 7: The Falling Cost Paradox | | | | | WIP
Chapter 8: Frontiers | | | | |
Appendix A: Tools and Resources | | | | |
Citation
@misc{tharsistpsouza2024tamingllms,
  author = {Tharsis T. P. Souza},
  title = {Taming LLMs: A Practical Guide to LLM Pitfalls with Open Source Software},
  year = {2024}
}
7. Preference-Based Alignment
A people that values its privileges above its principles soon loses both.
—Dwight D. Eisenhower
7.1. Introduction
The release of ChatGPT 3.5 in late 2022 marked a significant moment in the history of artificial intelligence. Within just five days of its launch, the model attracted over a million users, and within two months, it became the fastest-growing consumer application in history with over 100 million monthly active users.
Yet this raises an intriguing question: why did ChatGPT 3.5 gain such dramatic traction when its predecessor, GPT-3, which had the same number of parameters, received far less attention from the general public? Arguably, the answer lies not in raw capabilities, but in Preference Alignment.
Through careful fine-tuning using human feedback, OpenAI transformed GPT-3’s raw intelligence into ChatGPT’s helpful and resourceful conversational abilities. This breakthrough demonstrated that aligning language models with human preferences is just as crucial as scaling them to greater sizes.
In this chapter, we will explore the process of aligning language models with human preferences via fine-tuning, using modern techniques such as Direct Preference Optimization (DPO) [Rafailov et al., 2024]. Next, we will present a practical case study in which we align a language model to a user-provided policy in a fully automated fashion, producing an open source model as well as a dataset of policy-aligned preferences.
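Before going further, a minimal sketch may help make the DPO fine-tuning step concrete. The example below uses the open source trl library; the model name, toy preference pairs, and hyperparameters are illustrative assumptions rather than this chapter's actual configuration, and argument names such as processing_class (called tokenizer in older trl releases) vary across versions.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Illustrative small instruct model; the case study may use a different one.
model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # DPO batching needs a pad token

# Toy preference pairs in the prompt/chosen/rejected layout that DPO expects.
train_dataset = Dataset.from_dict({
    "prompt": ["Explain the moon landing to a 6 year old."],
    "chosen": ["Astronauts flew a big rocket to the moon and walked on it."],
    "rejected": ["The moon landing was a geopolitical event driven by Cold War rivalry..."],
})

# beta controls how far the fine-tuned policy may drift from the reference model.
training_args = DPOConfig(
    output_dir="dpo-aligned-model",
    beta=0.1,
    num_train_epochs=1,
    per_device_train_batch_size=1,
)

trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # named `tokenizer` in older trl releases
)
trainer.train()
```

The rest of the chapter builds up to this step: first by examining why base models are misaligned, then by constructing the preference dataset that such a trainer consumes.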
7.2. From Raw Capabilities to Preference Alignment
7.2.1. On the Misalignment of Language Models
Pre-trained LLMs are, in general, not helpful to humans by default. This is because state-of-the-art language models are trained on the specific objective of predicting the next token, which is a very different objective from following a user’s instructions while being safe and helpful. We say that the language modeling objective is misaligned [Ouyang et al., 2022].
Let’s take a look at GPT-2’s response to the following prompt: “Explain the moon landing to a 6 year old.”
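As a side note, the following is a minimal sketch of how one might reproduce this with the Hugging Face transformers pipeline; the sampling parameters are arbitrary choices and GPT-2's output will differ from run to run.

```python
from transformers import pipeline

# Base GPT-2 has not been instruction-tuned; it simply continues the text.
generator = pipeline("text-generation", model="gpt2")

prompt = "Explain the moon landing to a 6 year old."
outputs = generator(
    prompt,
    max_new_tokens=60,        # keep the continuation short
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.7,
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
```

Because the base model only continues the prompt, it typically produces rambling or off-topic text rather than a child-friendly explanation, illustrating the misalignment described above.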
7.4.5. Preference Dataset - Synthetic Dataset Generation
In order to fine-tune a base model to create an aligned model, we need to construct a dataset of policy-aligned preferences. This dataset will be used to align our base model to our policy.
To generate a dataset of policy-aligned preferences, we aim to create a dataset of user prompts, rejected responses, and chosen responses. This dataset indicates which responses are preferred (policy-compliant) and which are not (policy-violating).
Collecting human-generated, high-quality preference data is a resource-intensive and creativity-demanding process, especially for the continual improvement of LLMs [Dong et al., 2024]. There has been active research into replacing or augmenting human feedback with AI feedback (RLAIF) to tackle these issues [Bai et al., 2022], giving rise to the field of Synthetic Data Generation [Long et al., 2024].
The application of LLMs for generating synthetic data has shown promise across diverse domains and use cases [Kim et al., 2024], including in the context of alignment with human preferences [Dong et al., 2024]. Recently, Meta AI [Wu et al., 2024] introduced a “self-improving alignment” scheme in which a language model generates responses and evaluates them to create preference pairs, which are then used to run preference optimization and improve model capabilities. Inspired by this approach, we will generate a dataset of policy-aligned preferences that is then used to fine-tune a base model into our aligned model.
First, we define a data schema for our dataset. Each row contains two responses: a chosen response that complies with the policy and a rejected response that violates it. Through DPO optimization, the model is rewarded for generating responses that match the chosen, policy-compliant examples rather than the rejected ones:
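The original code listing does not survive in this excerpt, so the following is a minimal sketch of such a schema using pydantic; the class and field names (user_prompt, chosen, rejected) are illustrative assumptions rather than the book's exact definitions.

```python
from typing import List
from pydantic import BaseModel


class PreferenceExample(BaseModel):
    """One row of the policy-aligned preference dataset (hypothetical schema)."""
    user_prompt: str   # the prompt shown to the model
    chosen: str        # policy-compliant response the model should prefer
    rejected: str      # policy-violating response the model should avoid


class PreferenceDataset(BaseModel):
    """A collection of preference rows, ready to be exported for DPO training."""
    examples: List[PreferenceExample]

    def to_dpo_dict(self) -> dict:
        # Map to the prompt/chosen/rejected column layout used by DPO trainers.
        return {
            "prompt": [e.user_prompt for e in self.examples],
            "chosen": [e.chosen for e in self.examples],
            "rejected": [e.rejected for e in self.examples],
        }
```

A dataset expressed this way maps directly onto the prompt/chosen/rejected column layout expected by DPO trainers, e.g. via datasets.Dataset.from_dict(preference_dataset.to_dpo_dict()), as in the sketch earlier in this chapter.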