Commit

add WIP note

souzatharsis committed Dec 28, 2024
1 parent 568a08f commit 2b3d899
Showing 14 changed files with 43 additions and 8 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file modified tamingllms/_build/.doctrees/notebooks/cost.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/input.doctree
4 changes: 4 additions & 0 deletions tamingllms/_build/html/_sources/notebooks/cost.ipynb
@@ -13,6 +13,10 @@
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
5 changes: 5 additions & 0 deletions tamingllms/_build/html/_sources/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
4 changes: 4 additions & 0 deletions tamingllms/_build/html/notebooks/cost.html
@@ -278,6 +278,10 @@
 </li>
 </ul>
 </nav>
+<div class="admonition note">
+<p class="admonition-title">Note</p>
+<p>This Chapter is Work-in-Progress.</p>
+</div>
 <section id="why-optimization-matters-more-than-ever">
 <h2><a class="toc-backref" href="#id202" role="doc-backlink"><span class="section-number">9.1. </span>Why Optimization Matters More Than Ever</a><a class="headerlink" href="#why-optimization-matters-more-than-ever" title="Permalink to this heading"></a></h2>
 <p>According to recent analysis from a16z <span id="id1">[<a class="reference internal" href="#id97" title="Andreessen Horowitz. Llmflation: understanding and mitigating llm inference cost. Blog Post, 2024. Analysis of LLM inference costs and strategies for optimization. URL: https://a16z.com/llmflation-llm-inference-cost/.">Andreessen Horowitz, 2024</a>]</span>, the cost of LLM inference is decreasing by approximately 10x every year - a rate that outpaces even Moore’s Law in the PC revolution or Edholm’s Law during the bandwidth explosion of the dot-com era.</p>
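As a back-of-the-envelope illustration of the rate quoted in the paragraph above (a sketch only: the $10-per-million-token starting price is an assumed round number, not a figure from the a16z analysis):

```python
# Illustrative arithmetic only: a ~10x/year decline in inference price
# versus a Moore's-Law-style doubling every two years. The $10/M-token
# starting price is an assumption for the sake of the example.
start_price = 10.0  # dollars per million tokens (assumed)
for year in range(4):
    llm_price = start_price / 10 ** year  # 10x cheaper each year
    moore_gain = 2 ** (year / 2)          # 2x compute per ~2 years
    print(f"year {year}: inference ~${llm_price:.4f}/M tokens; "
          f"Moore's-Law-equivalent gain {moore_gain:.1f}x")
```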
12 changes: 8 additions & 4 deletions tamingllms/_build/html/notebooks/input.html
@@ -295,6 +295,10 @@
 </li>
 </ul>
 </nav>
+<div class="admonition note">
+<p class="admonition-title">Note</p>
+<p>This Chapter is Work-in-Progress.</p>
+</div>
 <section id="introduction">
 <h2><a class="toc-backref" href="#id207" role="doc-backlink"><span class="section-number">5.1. </span>Introduction</a><a class="headerlink" href="#introduction" title="Permalink to this heading"></a></h2>
 <p>Large Language Models face several critical challenges in effectively processing input data. While advances in long-context language models (LCLMs) <span id="id1">[<a class="reference internal" href="#id101" title="Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, and Kelvin Guu. Can long-context language models subsume retrieval, rag, sql, and more? 2024. URL: https://arxiv.org/abs/2406.13121, arXiv:2406.13121.">Lee <em>et al.</em>, 2024</a>]</span> have expanded the amount of information these systems can process simultaneously, significant challenges remain in managing and effectively utilizing extended inputs.</p>
@@ -1744,11 +1748,11 @@ <h4><a class="toc-backref" href="#id217" role="doc-backlink"><span class="sectio
 <li><p><strong>Depth of Analysis</strong>: While the report covers a wide range of topics, the depth of analysis in certain sections may not be as comprehensive as a human expert’s evaluation. Some nuances and contextual factors might be overlooked by the LLM. Splitting the report into multiple parts helps in mitigating this issue.</p></li>
 <li><p><strong>Chunking Strategy</strong>: The current approach splits the text into chunks based on size, which ensures that each chunk fits within the model’s token limit. However, this method may disrupt the logical flow of the document, as sections of interest might be split across multiple chunks. An alternative approach could be “structured” chunking, where the text is divided based on meaningful sections or topics. This would preserve the coherence of each section, making it easier to follow and understand. Implementing structured chunking requires additional preprocessing to identify and segment the text appropriately, but it can significantly enhance the readability and logical flow of the generated report. A minimal sketch contrasting the two approaches follows this list.</p></li>
 </ul>
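The second bullet above contrasts size-based and structured chunking in prose only; here is the minimal Python sketch of both referenced there. The markdown-header boundary heuristic and the 4,000-character budget are illustrative assumptions, not the chapter's actual implementation:

```python
from typing import List

def chunk_by_size(text: str, max_chars: int = 4000) -> List[str]:
    """Fixed-size chunking: fits the model's input budget but may cut
    across logically related sections."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def chunk_by_structure(text: str, max_chars: int = 4000) -> List[str]:
    """'Structured' chunking: split on markdown-style headers first so
    chunks stay topically coherent; fall back to size-based splitting
    only for oversized sections."""
    sections: List[str] = []
    current: List[str] = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith("#") and current:
            sections.append("".join(current))  # close previous section
            current = []
        current.append(line)
    if current:
        sections.append("".join(current))
    chunks: List[str] = []
    for section in sections:
        # Preserve coherent sections; split only when over budget.
        if len(section) <= max_chars:
            chunks.append(section)
        else:
            chunks.extend(chunk_by_size(section, max_chars))
    return chunks
```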
-<p>Here, we implemented a simple strategy to improve the coherence in output generation given a multi-part chunked input. Many other strategies are possible. One related technique worth mentioning is Anthropic’s Contextual Retrieval <span id="id11">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024</a>]</span>. The approach, as shown in <a class="reference internal" href="#anth-contextual"><span class="std std-numref">Fig. 5.7</span></a>, employs an LLM itself to generate relevant context per chunk before passing these two pieces of information together to the LLM. This process was proposed in the context of RAG to enhance its retrieval capabilities but can be applied more generally to improve output generation.</p>
+<p>Here, we implemented a simple strategy to improve the coherence in output generation given a multi-part chunked input. Many other strategies are possible. One related technique worth mentioning is Anthropic’s Contextual Retrieval <span id="id11">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024a. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024a</a>]</span>. The approach, as shown in <a class="reference internal" href="#anth-contextual"><span class="std std-numref">Fig. 5.7</span></a>, employs an LLM itself to generate relevant context per chunk before passing these two pieces of information together to the LLM. This process was proposed in the context of RAG to enhance its retrieval capabilities but can be applied more generally to improve output generation.</p>
 <figure class="align-center" id="anth-contextual">
 <a class="reference internal image-reference" href="../_images/anth_contextual.png"><img alt="Anthropic Contextual Linking" src="../_images/anth_contextual.png" style="width: 545.5px; height: 359.0px;" /></a>
 <figcaption>
-<p><span class="caption-number">Fig. 5.7 </span><span class="caption-text">Anthropic Contextual Linking <span id="id12">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024</a>]</span>.</span><a class="headerlink" href="#anth-contextual" title="Permalink to this image"></a></p>
+<p><span class="caption-number">Fig. 5.7 </span><span class="caption-text">Anthropic Contextual Linking <span id="id12">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024a. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024a</a>]</span>.</span><a class="headerlink" href="#anth-contextual" title="Permalink to this image"></a></p>
 </figcaption>
 </figure>
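A minimal sketch of the contextual-retrieval idea described above: an LLM writes a short context situating each chunk within the full document, and that context is prepended to the chunk before indexing or generation. The `call_llm` helper and the prompt wording are hypothetical placeholders, not Anthropic's published prompt:

```python
from typing import List

def call_llm(prompt: str) -> str:
    """Hypothetical helper; wire up any chat-completion API here."""
    raise NotImplementedError

def contextualize_chunks(document: str, chunks: List[str]) -> List[str]:
    """Prepend an LLM-written situating context to each chunk, so the
    chunk carries document-level context into retrieval or generation."""
    enriched: List[str] = []
    for chunk in chunks:
        prompt = (
            f"<document>\n{document}\n</document>\n"
            f"<chunk>\n{chunk}\n</chunk>\n"
            "Write a short context that situates this chunk within the "
            "overall document, to improve retrieval of the chunk. "
            "Answer with the context only."
        )
        context = call_llm(prompt)
        enriched.append(f"{context}\n\n{chunk}")
    return enriched
```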
 </section>
@@ -2043,9 +2047,9 @@ <h2><a class="toc-backref" href="#id225" role="doc-backlink"><span class="sectio
 <p>Yujia Zhou, Zheng Liu, Jiajie Jin, Jian-Yun Nie, and Zhicheng Dou. Metacognitive retrieval-augmented large language models. In <em>Proceedings of the ACM Web Conference 2024</em>, WWW '24, 1453–1463. New York, NY, USA, 2024. Association for Computing Machinery. URL: <a class="reference external" href="https://doi.org/10.1145/3589334.3645481">https://doi.org/10.1145/3589334.3645481</a>, <a class="reference external" href="https://doi.org/10.1145/3589334.3645481">doi:10.1145/3589334.3645481</a>.</p>
 </div>
 <div class="citation" id="id144" role="doc-biblioentry">
-<span class="label"><span class="fn-bracket">[</span>Anthropic24<span class="fn-bracket">]</span></span>
+<span class="label"><span class="fn-bracket">[</span>Anthropic4a<span class="fn-bracket">]</span></span>
 <span class="backrefs">(<a role="doc-backlink" href="#id11">1</a>,<a role="doc-backlink" href="#id12">2</a>)</span>
-<p>Anthropic. Introducing contextual retrieval. 09 2024. URL: <a class="reference external" href="https://www.anthropic.com/news/contextual-retrieval">https://www.anthropic.com/news/contextual-retrieval</a>.</p>
+<p>Anthropic. Introducing contextual retrieval. 09 2024a. URL: <a class="reference external" href="https://www.anthropic.com/news/contextual-retrieval">https://www.anthropic.com/news/contextual-retrieval</a>.</p>
 </div>
 <div class="citation" id="id51" role="doc-biblioentry">
 <span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id10">LangChain24</a><span class="fn-bracket">]</span></span>
2 changes: 1 addition & 1 deletion tamingllms/_build/html/searchindex.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion tamingllms/_build/jupyter_execute/markdown/intro.ipynb
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "70924298",
+"id": "65b8b461",
 "metadata": {},
 "source": [
 "(intro)=\n",
4 changes: 4 additions & 0 deletions tamingllms/_build/jupyter_execute/notebooks/cost.ipynb
@@ -13,6 +13,10 @@
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
5 changes: 5 additions & 0 deletions tamingllms/_build/jupyter_execute/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
6 changes: 5 additions & 1 deletion tamingllms/notebooks/cost.ipynb
@@ -6,14 +6,18 @@
 "source": [
 "(cost)=\n",
 "# The Falling Cost Paradox\n",
+"\n",
 "```{epigraph}\n",
 "It is a confusion of ideas to suppose that the economical use of fuel is equivalent to diminished consumption. <br>\n",
 "The very contrary is the truth. \n",
 "\n",
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
-"```"
+"```\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
+"```\n"
 ]
 },
 {
5 changes: 5 additions & 0 deletions tamingllms/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
2 changes: 1 addition & 1 deletion tamingllms/references.bib
@@ -1217,7 +1217,7 @@ @misc{tan2024htmlraghtmlbetterplain
 @misc{anthropic2024contextualretrieval,
 title={Introducing Contextual Retrieval},
 author={{Anthropic}},
-year={2024},
+year={2024a},
 month={09},
 url={https://www.anthropic.com/news/contextual-retrieval}
 }
