
Commit

first draft of alignment chapter
souzatharsis committed Dec 15, 2024
1 parent 3f9d131 commit 1e6ce32
Showing 22 changed files with 84 additions and 82 deletions.
17 changes: 9 additions & 8 deletions README.md
@@ -108,14 +108,15 @@ Abstract: *The current discourse around Large Language Models (LLMs) tends to fo
- 6.5.1 Building a RAG Pipeline
- 6.5.2 Testing and Validation

## Chapter 7: Safety Concerns
- 7.1 Common Safety Issues
- 7.2 Implementation of Safety Guards
- 7.3 Content Filtering
- 7.4 Input Validation
- 7.5 Output Sanitization
- 7.6 Monitoring and Alerts
- 7.7 Best Practices
## Chapter 7: [Preference-based Alignment](https://www.souzatharsis.com/tamingLLMs/notebooks/alignment.html)
- 7.1 Introduction
- 7.2 From Raw Capabilities to Preference Alignment
- 7.3 On the Misalignment of Language Models
- 7.4 Aligning Language Models with Human Preferences
- 7.5 Supervised Fine-Tuning (SFT) for Model Alignment
- 7.6 Augmenting SFT with Human Preferences
- 7.7 Case Study: Aligning a Language Model to a Policy
- 7.8 Discussion

## Chapter 8: The Cost Factor
- 8.1 Understanding LLM Costs
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/markdown/intro.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/markdown/toc.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/evals.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/output_size_limit.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/structured_output.doctree
Binary file not shown.
17 changes: 9 additions & 8 deletions tamingllms/_build/html/_sources/markdown/toc.md
@@ -105,14 +105,15 @@ Abstract: *The current discourse around Large Language Models (LLMs) tends to fo
- 6.5.1 Building a RAG Pipeline
- 6.5.2 Testing and Validation

## Chapter 7: Safety Concerns
- 7.1 Common Safety Issues
- 7.2 Implementation of Safety Guards
- 7.3 Content Filtering
- 7.4 Input Validation
- 7.5 Output Sanitization
- 7.6 Monitoring and Alerts
- 7.7 Best Practices
## Chapter 7: [Preference-based Alignment](https://www.souzatharsis.com/tamingLLMs/notebooks/alignment.html)
- 7.1 Introduction
- 7.2 From Raw Capabilities to Preference Alignment
- 7.3 On the Misalignment of Language Models
- 7.4 Aligning Language Models with Human Preferences
- 7.5 Supervised Fine-Tuning (SFT) for Model Alignment
- 7.6 Augmenting SFT with Human Preferences
- 7.7 Case Study: Aligning a Language Model to a Policy
- 7.8 Discussion

## Chapter 8: The Cost Factor
- 8.1 Understanding LLM Costs
7 changes: 3 additions & 4 deletions tamingllms/_build/html/_sources/notebooks/alignment.ipynb
@@ -254,12 +254,11 @@
" 2. Training the model to assign higher probability to the chosen response\n",
" 3. Minimizing the KL divergence between the original and fine-tuned model to preserve general capabilities\n",
"\n",
"At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in {eq}`dpo-loss`.\n",
"At a high level, DPO maximizes the probability of the preferred output and minimizes that of the rejected output, as defined in the following equation:\n",
"\n",
"```{math}\n",
":label: dpo-loss\n",
"\\begin{gather*}\n",
"\\mathcal{L}_{\\text{DPO}}(\\pi_\\theta; \\pi_\\text{ref}) = -\\mathbb{E}_{(x,y_w,y_l) \\sim \\mathcal{D}} \\left[\\log \\sigma \\left(\\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_w | x)}{\\pi_\\text{ref}(y_w | x)}}_{\\color{green}\\text{preferred}} - \\beta \\underbrace{\\log \\frac{\\pi_\\theta(y_l | x)}{\\pi_\\text{ref}(y_l | x)}}_{\\color{red}\\text{rejected}}\\right)\\right]\n",
"```\n",
"\\end{gather*}\n",
"\n",
"This approach is more straightforward than PPO, as it avoids the need for a reward model and instead uses a direct comparison of model outputs against human preferences.\n",
"\n",
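To make the objective above concrete, the following is a minimal PyTorch sketch of the batch DPO loss; the log-probability tensors and the β value are illustrative placeholders rather than values from the chapter's case study:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss for a batch of (prompt, chosen, rejected) triples.

    Each tensor holds per-sequence log-probabilities, i.e. log pi(y | x)
    summed over the response tokens, for the policy and the frozen reference model.
    """
    chosen_logratio = policy_chosen_logps - ref_chosen_logps        # "preferred" term
    rejected_logratio = policy_rejected_logps - ref_rejected_logps  # "rejected" term
    # -log sigmoid(beta * (preferred - rejected)), averaged over the batch
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()

# Illustrative values for a batch of two preference pairs
loss = dpo_loss(torch.tensor([-12.3, -8.1]), torch.tensor([-14.0, -9.5]),
                torch.tensor([-12.9, -8.4]), torch.tensor([-13.1, -9.0]))
```

The β-scaled log-ratios against the reference model are what anchor the fine-tuned policy to the original model, which is how DPO preserves general capabilities without an explicit KL penalty term.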
6 changes: 3 additions & 3 deletions tamingllms/_build/html/markdown/intro.html
@@ -208,7 +208,7 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="introduction">
<section id="introduction">
<span id="intro"></span><h1><a class="toc-backref" href="#id1" role="doc-backlink"><span class="section-number">1. </span>Introduction</a><a class="headerlink" href="#introduction" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>I am always doing that which I cannot do, in order that I may learn how to do it.</p>
@@ -286,7 +286,7 @@ <h2><a class="toc-backref" href="#id5" role="doc-backlink"><span class="section-
<li><p>Share their own experiences and solutions with the community</p></li>
<li><p>Propose new chapters or sections that address emerging challenges</p></li>
</ul>
<p>The repository can be found at <a class="reference external" href="https://github.com/souzatharsis/tamingllms">https://github.com/souzatharsis/tamingllms</a>. Whether you’ve found a typo, have a better solution to share, or want to contribute an entirely new section, your contributions are welcome.</p>
<p>The repository can be found at https://github.com/souzatharsis/tamingllms. Whether you’ve found a typo, have a better solution to share, or want to contribute an entirely new section, your contributions are welcome.</p>
</section>
<section id="a-note-on-perspective">
<h2><a class="toc-backref" href="#id6" role="doc-backlink"><span class="section-number">1.5. </span>A Note on Perspective</a><a class="headerlink" href="#a-note-on-perspective" title="Permalink to this heading"></a></h2>
@@ -399,7 +399,7 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink"><span class="section
<h2><a class="toc-backref" href="#id15" role="doc-backlink"><span class="section-number">1.10. </span>About the Author(s)</a><a class="headerlink" href="#about-the-author-s" title="Permalink to this heading"></a></h2>
<p>Dr. Tharsis Souza is a computer scientist and product leader specializing in AI-based products. He is a Lecturer at Columbia University’s Master of Science program in Applied Analytics, (<em>incoming</em>) Head of Product, Equities at Citadel, and former Senior VP at Two Sigma Investments. He also enjoys mentoring under-represented students &amp; working professionals to help create a more diverse global AI ecosystem.</p>
<p>With over 15 years of experience delivering technology products across startups and Fortune 500 companies, Dr. Souza is also an author of numerous scholarly publications and is a frequent speaker at academic and business conferences. Grounded in an academic background and drawing on practical experience building and scaling products powered by language models at early-stage startups and major institutions, as well as advising non-profit organizations and contributing to open source projects, he brings a unique perspective on bridging the gap between LLMs’ promised potential and their practical implementation challenges to enable the next generation of AI-powered products.</p>
<p>Dr. Tharsis holds a Ph.D. in Computer Science from UCL, University of London following an M.Phil. and <a class="reference external" href="http://M.Sc">M.Sc</a>. in Computer Science and a <a class="reference external" href="http://B.Sc">B.Sc</a>. in Computer Engineering.</p>
<p>Dr. Tharsis holds a Ph.D. in Computer Science from UCL, University of London, following an M.Phil. and M.Sc. in Computer Science and a B.Sc. in Computer Engineering.</p>
</section>
</section>

21 changes: 11 additions & 10 deletions tamingllms/_build/html/markdown/toc.html
@@ -180,7 +180,7 @@
<div class="content" role="main" v-pre>

<p>Sign-up to receive updates on <a class="reference external" href="https://tamingllm.substack.com/">new Chapters here</a>.</p>
<section class="tex2jax_ignore mathjax_ignore" id="taming-llms">
<section id="taming-llms">
<h1>Taming LLMs<a class="headerlink" href="#taming-llms" title="Permalink to this heading"></a></h1>
<section id="a-practical-guide-to-llm-pitfalls-with-open-source-software">
<h2><em>A Practical Guide to LLM Pitfalls with Open Source Software</em><a class="headerlink" href="#a-practical-guide-to-llm-pitfalls-with-open-source-software" title="Permalink to this heading"></a></h2>
@@ -336,16 +336,17 @@ <h2>Chapter 6: Hallucination: The Reality Gap<a class="headerlink" href="#chapte
</li>
</ul>
</section>
<section id="chapter-7-safety-concerns">
<h2>Chapter 7: Safety Concerns<a class="headerlink" href="#chapter-7-safety-concerns" title="Permalink to this heading"></a></h2>
<section id="chapter-7-preference-based-alignment">
<h2>Chapter 7: <a class="reference external" href="https://www.souzatharsis.com/tamingLLMs/notebooks/alignment.html">Preference-based Alignment</a><a class="headerlink" href="#chapter-7-preference-based-alignment" title="Permalink to this heading"></a></h2>
<ul class="simple">
<li><p>7.1 Common Safety Issues</p></li>
<li><p>7.2 Implementation of Safety Guards</p></li>
<li><p>7.3 Content Filtering</p></li>
<li><p>7.4 Input Validation</p></li>
<li><p>7.5 Output Sanitization</p></li>
<li><p>7.6 Monitoring and Alerts</p></li>
<li><p>7.7 Best Practices</p></li>
<li><p>7.1 Introduction</p></li>
<li><p>7.2 From Raw Capabilities to Preference Alignment</p></li>
<li><p>7.3 On the Misalignment of Language Models</p></li>
<li><p>7.4 Aligning Language Models with Human Preferences</p></li>
<li><p>7.5 Supervised Fine-Tuning (SFT) for Model Alignment</p></li>
<li><p>7.6 Augmenting SFT with Human Preferences</p></li>
<li><p>7.7 Case Study: Aligning a Language Model to a Policy</p></li>
<li><p>7.8 Discussion</p></li>
</ul>
</section>
<section id="chapter-8-the-cost-factor">
15 changes: 8 additions & 7 deletions tamingllms/_build/html/notebooks/alignment.html
@@ -29,8 +29,7 @@
<script src="../_static/design-tabs.js"></script>
<script>const THEBE_JS_URL = "https://unpkg.com/[email protected]/lib/index.js"; const thebe_selector = ".thebe,.cell"; const thebe_selector_input = "pre"; const thebe_selector_output = ".output, .cell_output"</script>
<script async="async" src="../_static/sphinx-thebe.js"></script>
<script>window.MathJax = {"options": {"processHtmlClass": "tex2jax_process|mathjax_process|math|output_area"}}</script>
<script defer="defer" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
<script async="async" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
<script type="module" src="https://cdn.jsdelivr.net/npm/[email protected]/dist/mermaid.esm.min.mjs"></script>
<script type="module" src="https://cdn.jsdelivr.net/npm/@mermaid-js/[email protected]/dist/mermaid-layout-elk.esm.min.mjs"></script>
<script type="module">import mermaid from "https://cdn.jsdelivr.net/npm/[email protected]/dist/mermaid.esm.min.mjs";import elkLayouts from "https://cdn.jsdelivr.net/npm/@mermaid-js/[email protected]/dist/mermaid-layout-elk.esm.min.mjs";mermaid.registerLayoutLoaders(elkLayouts);mermaid.initialize({startOnLoad:false});</script>
@@ -203,7 +202,7 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="preference-based-alignment">
<section id="preference-based-alignment">
<h1><a class="toc-backref" href="#id126" role="doc-backlink"><span class="section-number">5. </span>Preference-Based Alignment</a><a class="headerlink" href="#preference-based-alignment" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>Move fast and be responsible.</p>
@@ -431,9 +430,11 @@ <h4><a class="toc-backref" href="#id132" role="doc-backlink"><span class="sectio
<li><p>Training the model to assign higher probability to the chosen response</p></li>
<li><p>Minimizing the KL divergence between the original and fine-tuned model to preserve general capabilities</p></li>
</ol>
<p>At a high-level DPO maximizes the probability of preferred output and minimize rejected output as defined in <a class="reference internal" href="#equation-dpo-loss">(5.1)</a>.</p>
<div class="math notranslate nohighlight" id="equation-dpo-loss">
<span class="eqno">(5.1)<a class="headerlink" href="#equation-dpo-loss" title="Permalink to this equation"></a></span>\[\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]\]</div>
<p>At a high level, DPO maximizes the probability of the preferred output and minimizes that of the rejected output, as defined in the following equation:</p>
<div class="amsmath math notranslate nohighlight">
\[\begin{gather*}
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_\text{ref}) = -\mathbb{E}_{(x,y_w,y_l) \sim \mathcal{D}} \left[\log \sigma \left(\beta \underbrace{\log \frac{\pi_\theta(y_w | x)}{\pi_\text{ref}(y_w | x)}}_{\color{green}\text{preferred}} - \beta \underbrace{\log \frac{\pi_\theta(y_l | x)}{\pi_\text{ref}(y_l | x)}}_{\color{red}\text{rejected}}\right)\right]
\end{gather*}\]</div>
<p>This approach is more straightforward than PPO, as it avoids the need for a reward model and instead uses a direct comparison of model outputs against human preferences.</p>
<p>Modern libraries such as HuggingFace’s TRL <span id="id21">[<a class="reference internal" href="#id125" title="Hugging Face. Trl. 2024d. TRL. URL: https://huggingface.co/docs/trl/en/index.">Face, 2024d</a>]</span> offer a suite of techniques for fine-tuning language models with reinforcement learning, including PPO and DPO. It provides a user-friendly interface and a wide range of features for fine-tuning and aligning LLMs, which will be the focus of the next section as we go through a case study.</p>
</section>
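As a preview of the case study, a minimal sketch of running DPO with TRL's `DPOTrainer` on the chapter's base model could look as follows; the toy preference rows and hyperparameters are illustrative assumptions, and keyword names vary slightly across TRL versions (newer releases use `processing_class` where older ones used `tokenizer`):

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# Toy preference data: each row pairs a prompt with a chosen and a rejected response.
train_dataset = Dataset.from_dict({
    "prompt": ["Is it ok to share my bank password with a stranger?"],
    "chosen": ["No. Never share passwords; no legitimate party needs them."],
    "rejected": ["Sure, sharing passwords is usually harmless."],
})

config = DPOConfig(
    output_dir="smollm2-dpo",       # where checkpoints are written
    beta=0.1,                       # strength of the implicit anchor to the reference model
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=train_dataset,
    processing_class=tokenizer,     # older TRL releases take tokenizer=... instead
)
trainer.train()
```

When no `ref_model` is passed, recent TRL versions create a frozen copy of the policy to serve as the reference model in the loss above.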
@@ -853,7 +854,7 @@ <h4><a class="toc-backref" href="#id141" role="doc-backlink"><span class="sectio
</div>
<p>Recall our base model is <code class="docutils literal notranslate"><span class="pre">HuggingFaceTB/SmolLM2-360M-Instruct</span></code>. Here, we will use the HuggingFace Inference API to generate rejected responses from a cloud endpoint for enhanced performance:</p>
<ol class="arabic simple">
<li><p>Visit the HuggingFace Endpoints UI: <a class="reference external" href="https://ui.endpoints.huggingface.co/">https://ui.endpoints.huggingface.co/</a></p></li>
<li><p>Visit the HuggingFace Endpoints UI: https://ui.endpoints.huggingface.co/</p></li>
<li><p>Click “New Endpoint” and select the model <code class="docutils literal notranslate"><span class="pre">HuggingFaceTB/SmolLM2-360M-Instruct</span></code></p></li>
<li><p>Choose the compute resources (e.g., CPU or GPU instance, GPU preferred)</p></li>
<li><p>Configure the endpoint settings:</p>
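To complement the endpoint setup steps above, here is a minimal sketch of collecting the base model's raw (to-be-rejected) responses from such an endpoint with `huggingface_hub`'s `InferenceClient`; the endpoint URL, token, and prompt are placeholders, and `chat_completion` assumes the endpoint exposes the chat (messages) API:

```python
from huggingface_hub import InferenceClient

# Placeholders: paste the URL shown in the Endpoints UI and your own access token.
client = InferenceClient(
    model="https://your-endpoint-name.endpoints.huggingface.cloud",
    token="hf_xxx",
)

prompts = ["Describe how you handle requests that violate your usage policy."]
rejected_responses = []
for prompt in prompts:
    out = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,
        temperature=1.0,  # sample rather than greedy-decode the base model's raw responses
    )
    rejected_responses.append(out.choices[0].message.content)
```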
2 changes: 1 addition & 1 deletion tamingllms/_build/html/notebooks/evals.html
@@ -210,7 +210,7 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="the-evals-gap">
<section id="the-evals-gap">
<h1><a class="toc-backref" href="#id120" role="doc-backlink"><span class="section-number">4. </span>The Evals Gap</a><a class="headerlink" href="#the-evals-gap" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>It doesn’t matter how beautiful your theory is, <br>
2 changes: 1 addition & 1 deletion tamingllms/_build/html/notebooks/output_size_limit.html
@@ -202,7 +202,7 @@
<hr>
<div class="content" role="main" v-pre>

<section class="tex2jax_ignore mathjax_ignore" id="output-size-limitations">
<section id="output-size-limitations">
<h1><a class="toc-backref" href="#id85" role="doc-backlink"><span class="section-number">2. </span>Output Size Limitations</a><a class="headerlink" href="#output-size-limitations" title="Permalink to this heading"></a></h1>
<blockquote class="epigraph">
<div><p>Only those who will risk going too far can possibly find out how far one can go.</p>