Commit

add WIP note

souzatharsis committed Dec 28, 2024
1 parent 568a08f commit 2b3d899
Showing 14 changed files with 43 additions and 8 deletions.
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file modified tamingllms/_build/.doctrees/notebooks/cost.doctree
Binary file modified tamingllms/_build/.doctrees/notebooks/input.doctree
4 changes: 4 additions & 0 deletions tamingllms/_build/html/_sources/notebooks/cost.ipynb
@@ -13,6 +13,10 @@
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
5 changes: 5 additions & 0 deletions tamingllms/_build/html/_sources/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
4 changes: 4 additions & 0 deletions tamingllms/_build/html/notebooks/cost.html
@@ -278,6 +278,10 @@
 </li>
 </ul>
 </nav>
+<div class="admonition note">
+<p class="admonition-title">Note</p>
+<p>This Chapter is Work-in-Progress.</p>
+</div>
 <section id="why-optimization-matters-more-than-ever">
 <h2><a class="toc-backref" href="#id202" role="doc-backlink"><span class="section-number">9.1. </span>Why Optimization Matters More Than Ever</a><a class="headerlink" href="#why-optimization-matters-more-than-ever" title="Permalink to this heading"></a></h2>
 <p>According to recent analysis from a16z <span id="id1">[<a class="reference internal" href="#id97" title="Andreessen Horowitz. Llmflation: understanding and mitigating llm inference cost. Blog Post, 2024. Analysis of LLM inference costs and strategies for optimization. URL: https://a16z.com/llmflation-llm-inference-cost/.">Andreessen Horowitz, 2024</a>]</span>, the cost of LLM inference is decreasing by approximately 10x every year - a rate that outpaces even Moore’s Law in the PC revolution or Edholm’s Law during the bandwidth explosion of the dot-com era.</p>
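As a back-of-the-envelope illustration of the rate quoted in the paragraph above (a sketch only: the $10-per-million-token starting price is an assumed round number, not a figure from the a16z analysis):

```python
# Illustrative arithmetic only: a ~10x/year decline in inference price
# versus a Moore's-Law-style doubling every two years. The $10/M-token
# starting price is an assumption for the sake of the example.
start_price = 10.0  # dollars per million tokens (assumed)
for year in range(4):
    llm_price = start_price / 10 ** year  # 10x cheaper each year
    moore_gain = 2 ** (year / 2)          # 2x compute per ~2 years
    print(f"year {year}: inference ~${llm_price:.4f}/M tokens; "
          f"Moore's-Law-equivalent gain {moore_gain:.1f}x")
```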
12 changes: 8 additions & 4 deletions tamingllms/_build/html/notebooks/input.html
@@ -295,6 +295,10 @@
 </li>
 </ul>
 </nav>
+<div class="admonition note">
+<p class="admonition-title">Note</p>
+<p>This Chapter is Work-in-Progress.</p>
+</div>
 <section id="introduction">
 <h2><a class="toc-backref" href="#id207" role="doc-backlink"><span class="section-number">5.1. </span>Introduction</a><a class="headerlink" href="#introduction" title="Permalink to this heading"></a></h2>
 <p>Large Language Models face several critical challenges in effectively processing input data. While advances in long-context language models (LCLMs) <span id="id1">[<a class="reference internal" href="#id101" title="Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, and Kelvin Guu. Can long-context language models subsume retrieval, rag, sql, and more? 2024. URL: https://arxiv.org/abs/2406.13121, arXiv:2406.13121.">Lee <em>et al.</em>, 2024</a>]</span> have expanded the amount of information these systems can process simultaneously, significant challenges remain in managing and effectively utilizing extended inputs.</p>
@@ -1744,11 +1748,11 @@ <h4><a class="toc-backref" href="#id217" role="doc-backlink"><span class="sectio
 <li><p><strong>Depth of Analysis</strong>: While the report covers a wide range of topics, the depth of analysis in certain sections may not be as comprehensive as a human expert’s evaluation. Some nuances and contextual factors might be overlooked by the LLM. Splitting the report into multiple parts helps in mitigating this issue.</p></li>
 <li><p><strong>Chunking Strategy</strong>: The current approach splits the text into chunks based on size, which ensures that each chunk fits within the model’s token limit. However, this method may disrupt the logical flow of the document, as sections of interest might be split across multiple chunks. An alternative approach could be “structured” chunking, where the text is divided based on meaningful sections or topics. This would preserve the coherence of each section, making it easier to follow and understand. Implementing structured chunking requires additional preprocessing to identify and segment the text appropriately, but it can significantly enhance the readability and logical flow of the generated report. A minimal sketch contrasting the two approaches follows this list.</p></li>
 </ul>
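The second bullet above contrasts size-based and structured chunking in prose only; here is the minimal Python sketch of both referenced there. The markdown-header boundary heuristic and the 4,000-character budget are illustrative assumptions, not the chapter's actual implementation:

```python
from typing import List

def chunk_by_size(text: str, max_chars: int = 4000) -> List[str]:
    """Fixed-size chunking: fits the model's input budget but may cut
    across logically related sections."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def chunk_by_structure(text: str, max_chars: int = 4000) -> List[str]:
    """'Structured' chunking: split on markdown-style headers first so
    chunks stay topically coherent; fall back to size-based splitting
    only for oversized sections."""
    sections: List[str] = []
    current: List[str] = []
    for line in text.splitlines(keepends=True):
        if line.lstrip().startswith("#") and current:
            sections.append("".join(current))  # close previous section
            current = []
        current.append(line)
    if current:
        sections.append("".join(current))
    chunks: List[str] = []
    for section in sections:
        # Preserve coherent sections; split only when over budget.
        if len(section) <= max_chars:
            chunks.append(section)
        else:
            chunks.extend(chunk_by_size(section, max_chars))
    return chunks
```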
-<p>Here, we implemented a simple strategy to improve the coherence in output generation given a multi-part chunked input. Many other strategies are possible. One related technique worth mentioning is Anthropic’s Contextual Retrieval <span id="id11">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024</a>]</span>. The approach, as shown in <a class="reference internal" href="#anth-contextual"><span class="std std-numref">Fig. 5.7</span></a>, employs an LLM itself to generate relevant context per chunk before passing these two pieces of information together to the LLM. This process was proposed in the context of RAG to enhance its retrieval capabilities but can be applied more generally to improve output generation.</p>
+<p>Here, we implemented a simple strategy to improve the coherence in output generation given a multi-part chunked input. Many other strategies are possible. One related technique worth mentioning is Anthropic’s Contextual Retrieval <span id="id11">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024a. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024a</a>]</span>. The approach, as shown in <a class="reference internal" href="#anth-contextual"><span class="std std-numref">Fig. 5.7</span></a>, employs an LLM itself to generate relevant context per chunk before passing these two pieces of information together to the LLM. This process was proposed in the context of RAG to enhance its retrieval capabilities but can be applied more generally to improve output generation.</p>
 <figure class="align-center" id="anth-contextual">
 <a class="reference internal image-reference" href="../_images/anth_contextual.png"><img alt="Anthropic Contextual Linking" src="../_images/anth_contextual.png" style="width: 545.5px; height: 359.0px;" /></a>
 <figcaption>
-<p><span class="caption-number">Fig. 5.7 </span><span class="caption-text">Anthropic Contextual Linking <span id="id12">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024</a>]</span>.</span><a class="headerlink" href="#anth-contextual" title="Permalink to this image"></a></p>
+<p><span class="caption-number">Fig. 5.7 </span><span class="caption-text">Anthropic Contextual Linking <span id="id12">[<a class="reference internal" href="#id144" title="Anthropic. Introducing contextual retrieval. 09 2024a. URL: https://www.anthropic.com/news/contextual-retrieval.">Anthropic, 2024a</a>]</span>.</span><a class="headerlink" href="#anth-contextual" title="Permalink to this image"></a></p>
 </figcaption>
 </figure>
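A minimal sketch of the contextual-retrieval idea described above: an LLM writes a short context situating each chunk within the full document, and that context is prepended to the chunk before indexing or generation. The `call_llm` helper and the prompt wording are hypothetical placeholders, not Anthropic's published prompt:

```python
from typing import List

def call_llm(prompt: str) -> str:
    """Hypothetical helper; wire up any chat-completion API here."""
    raise NotImplementedError

def contextualize_chunks(document: str, chunks: List[str]) -> List[str]:
    """Prepend an LLM-written situating context to each chunk, so the
    chunk carries document-level context into retrieval or generation."""
    enriched: List[str] = []
    for chunk in chunks:
        prompt = (
            f"<document>\n{document}\n</document>\n"
            f"<chunk>\n{chunk}\n</chunk>\n"
            "Write a short context that situates this chunk within the "
            "overall document, to improve retrieval of the chunk. "
            "Answer with the context only."
        )
        context = call_llm(prompt)
        enriched.append(f"{context}\n\n{chunk}")
    return enriched
```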
 </section>
@@ -2043,9 +2047,9 @@ <h2><a class="toc-backref" href="#id225" role="doc-backlink"><span class="sectio
 <p>Yujia Zhou, Zheng Liu, Jiajie Jin, Jian-Yun Nie, and Zhicheng Dou. Metacognitive retrieval-augmented large language models. In <em>Proceedings of the ACM Web Conference 2024</em>, WWW '24, 1453–1463. New York, NY, USA, 2024. Association for Computing Machinery. URL: <a class="reference external" href="https://doi.org/10.1145/3589334.3645481">https://doi.org/10.1145/3589334.3645481</a>, <a class="reference external" href="https://doi.org/10.1145/3589334.3645481">doi:10.1145/3589334.3645481</a>.</p>
 </div>
 <div class="citation" id="id144" role="doc-biblioentry">
-<span class="label"><span class="fn-bracket">[</span>Anthropic24<span class="fn-bracket">]</span></span>
+<span class="label"><span class="fn-bracket">[</span>Anthropic4a<span class="fn-bracket">]</span></span>
 <span class="backrefs">(<a role="doc-backlink" href="#id11">1</a>,<a role="doc-backlink" href="#id12">2</a>)</span>
-<p>Anthropic. Introducing contextual retrieval. 09 2024. URL: <a class="reference external" href="https://www.anthropic.com/news/contextual-retrieval">https://www.anthropic.com/news/contextual-retrieval</a>.</p>
+<p>Anthropic. Introducing contextual retrieval. 09 2024a. URL: <a class="reference external" href="https://www.anthropic.com/news/contextual-retrieval">https://www.anthropic.com/news/contextual-retrieval</a>.</p>
 </div>
 <div class="citation" id="id51" role="doc-biblioentry">
 <span class="label"><span class="fn-bracket">[</span><a role="doc-backlink" href="#id10">LangChain24</a><span class="fn-bracket">]</span></span>
2 changes: 1 addition & 1 deletion tamingllms/_build/html/searchindex.js

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion tamingllms/_build/jupyter_execute/markdown/intro.ipynb
@@ -2,7 +2,7 @@
 "cells": [
 {
 "cell_type": "markdown",
-"id": "70924298",
+"id": "65b8b461",
 "metadata": {},
 "source": [
 "(intro)=\n",
4 changes: 4 additions & 0 deletions tamingllms/_build/jupyter_execute/notebooks/cost.ipynb
@@ -13,6 +13,10 @@
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
5 changes: 5 additions & 0 deletions tamingllms/_build/jupyter_execute/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
6 changes: 5 additions & 1 deletion tamingllms/notebooks/cost.ipynb
@@ -6,14 +6,18 @@
 "source": [
 "(cost)=\n",
 "# The Falling Cost Paradox\n",
+"\n",
 "```{epigraph}\n",
 "It is a confusion of ideas to suppose that the economical use of fuel is equivalent to diminished consumption. <br>\n",
 "The very contrary is the truth. \n",
 "\n",
 "-- William Stanley Jevons\n",
 "```\n",
 "```{contents}\n",
-"```"
+"```\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
+"```\n"
 ]
 },
 {
5 changes: 5 additions & 0 deletions tamingllms/notebooks/input.ipynb
@@ -12,6 +12,11 @@
 "-- Steve Jobs\n",
 "```\n",
 "```{contents}\n",
+"```\n",
+"\n",
+"\n",
+"```{note}\n",
+"This Chapter is Work-in-Progress.\n",
 "```"
 ]
 },
2 changes: 1 addition & 1 deletion tamingllms/references.bib
@@ -1217,7 +1217,7 @@ @misc{tan2024htmlraghtmlbetterplain
 @misc{anthropic2024contextualretrieval,
 title={Introducing Contextual Retrieval},
 author={{Anthropic}},
-year={2024},
+year={2024a},
 month={09},
 url={https://www.anthropic.com/news/contextual-retrieval}
 }
