Commit e6f1bbb

add corrections to input and output chapters

souzatharsis committed Jan 10, 2025
1 parent fcd39b7 commit e6f1bbb
Showing 63 changed files with 7,934 additions and 1,256 deletions.
3 changes: 3 additions & 0 deletions TESTIMONIALS.md
@@ -3,3 +3,6 @@

> This is amazing content, thank you so much for sharing!!!
-- Didier Lopes, Founder of OpenBB

> Easily one of the best resources on structured generation so far. Great resource for the AI engineer in your life!
-- Cameron, Outlines, .txt
Binary file modified tamingllms/_build/.doctrees/environment.pickle
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/markdown/preface.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/markdown/toc.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/alignment.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/cost.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/evals.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/input.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/local.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/safety.doctree
Binary file not shown.
Binary file modified tamingllms/_build/.doctrees/notebooks/structured_output.doctree
Binary file not shown.
1 change: 0 additions & 1 deletion tamingllms/_build/html/_images/design.svg

This file was deleted.

Binary file added tamingllms/_build/html/_images/embedding.png
118 changes: 0 additions & 118 deletions tamingllms/_build/html/_images/embedding.svg

This file was deleted.

2 changes: 1 addition & 1 deletion tamingllms/_build/html/_images/incontext.svg
Binary file added tamingllms/_build/html/_images/rag.png
4 changes: 0 additions & 4 deletions tamingllms/_build/html/_images/rag.svg

This file was deleted.

37 changes: 15 additions & 22 deletions tamingllms/_build/html/_sources/markdown/toc.md
@@ -4,8 +4,6 @@ author: "Tharsis T. P. Souza"
date: "2024-12-16"
---

Sign-up to receive updates on [new Chapters here](https://tamingllm.substack.com/).

<a href="https://www.tamingllms.com" target="_blank">
<img src="../_static/cover_curve.png" style="background-color:white; width:50%;" alt="Taming LLMs Cover" />
</a>
@@ -16,27 +14,22 @@ Sign-up to receive updates on [new Chapters here](https://tamingllm.substack.com

Abstract: *The current discourse around Large Language Models (LLMs) tends to focus heavily on their capabilities while glossing over fundamental challenges. Conversely, this book takes a critical look at the key limitations and implementation pitfalls that engineers and technical leaders encounter when building LLM-powered applications. Through practical Python examples and proven open source solutions, it provides an introductory yet comprehensive guide for navigating these challenges. The focus is on concrete problems with reproducible code examples and battle-tested open source tools. By understanding these pitfalls upfront, readers will be better equipped to build products that harness the power of LLMs while sidestepping their inherent limitations.*

## [Preface](https://www.tamingllms.com/markdown/preface.html)

## [About the Book](https://www.tamingllms.com/markdown/intro.html)

## [Chapter 1: The Evals Gap](https://www.tamingllms.com/notebooks/evals.html)

## [Chapter 2: Structured Output](https://www.tamingllms.com/notebooks/structured_output.html)

## [Chapter 3: Managing Input Data](https://www.tamingllms.com/notebooks/input.html)

## [Chapter 4: Safety](https://www.tamingllms.com/notebooks/safety.html)

## [Chapter 5: Preference-Based Alignment](https://www.tamingllms.com/notebooks/alignment.html)

## [Chapter 6: Local LLMs in Practice](https://www.tamingllms.com/notebooks/local.html)

## Chapter 7: The Falling Cost Paradox

## Chapter 8: Frontiers
(*) *The pdf version is preferred as it contains corrections and side notes.*

| Chapter (*) | PDF | Podcast | Website | Notebook | Status |
|:-------------------------------------------|--------------|--------------|--------------|---------------|----------------------|
| **Preface** | | | [html](https://www.tamingllms.com/markdown/preface.html) | N/A | *Ready for Review* |
| **About the Book** | | | [html](https://www.tamingllms.com/markdown/intro.html) | N/A | *Ready for Review* |
| **Chapter 1: The Evals Gap** | [pdf](https://www.dropbox.com/scl/fi/voyhpqp0glkhijopyev71/DRAFT_Chapter-1-The-Evals-Gap.pdf?rlkey=ehzf6g4ngsssuoe471on8itu4&st=zqv98w2n&dl=0) | [podcast](https://tamingllm.substack.com/p/chapter-1-podcast-the-evals-gap) | [html](https://www.tamingllms.com/notebooks/evals.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/evals.ipynb) | *Ready for Review* |
| **Chapter 2: Structured Output**| [pdf](https://www.dropbox.com/scl/fi/x3a84bm1ewcfemj4p7b5p/DRAFT_Chapter-2-Structured-Output.pdf?rlkey=zysw6mat7har133rs7am7bb8n&st=4ns4ak24&dl=0) | [podcast](https://tamingllm.substack.com/p/chapter-2-podcast-structured-output) | [html](https://www.tamingllms.com/notebooks/structured_output.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/structured_output.ipynb) | *Ready for Review* |
| **Chapter 3: Managing Input Data** | | | [html](https://www.tamingllms.com/notebooks/input.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/input.ipynb) | |
| **Chapter 4: Safety** | | | [html](https://www.tamingllms.com/notebooks/safety.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/safety.ipynb) | |
| **Chapter 5: Preference-Based Alignment** | | | [html](https://www.tamingllms.com/notebooks/alignment.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/alignment.ipynb) | |
| **Chapter 6: Local LLMs in Practice** | | | [html](https://www.tamingllms.com/notebooks/local.html) | [ipynb](https://github.com/souzatharsis/tamingLLMs/blob/master/tamingllms/notebooks/local.ipynb) | |
| **Chapter 7: The Falling Cost Paradox** | | | | | WIP |
| **Chapter 8: Frontiers** | | | | | |
| **Appendix A: Tools and Resources** | | | | | |

## Appendix A: Tools and Resources


[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
33 changes: 17 additions & 16 deletions tamingllms/_build/html/_sources/notebooks/input.ipynb
@@ -1703,7 +1703,7 @@
"\n",
"Data extraction, parsing and chunking are also part of a canonical pipeline as we prepare the knowledge base. We explored these concepts in detail in Sections {ref}`parsing` and {ref}`chunking`, so we will be succinct here and move straight to preparing the knowledge base.\n",
"\n",
"```{figure} ../_static/input/rag.svg\n",
"```{figure} ../_static/input/rag.png\n",
"---\n",
"name: rag_pipeline\n",
"alt: RAG Pipeline\n",
@@ -1872,24 +1872,23 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[['intro', 'input', 'structured_output']]\n"
]
}
],
"source": [
"q = \"What is the purpose of this book?\"\n",
"res = query_collection(collection, q)\n",
"res.get(\"ids\")"
]
},
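The cell above assumes a `collection` and a `query_collection` helper defined earlier in the notebook. As a minimal sketch of what that setup might look like (the collection name, file paths, and helper signature are assumptions for illustration, not the notebook's actual code):

```python
# Hedged sketch of the knowledge-base setup assumed by the query cell above.
# Collection name and file paths are illustrative; ChromaDB embeds the
# documents and the query with its default model (all-MiniLM-L6-v2).
import chromadb

client = chromadb.Client()
collection = client.create_collection(name="taming_llms")

chapters = {
    "intro": open("../data/intro.md").read(),
    "input": open("../data/input.md").read(),
    "structured_output": open("../data/structured_output.md").read(),
}
collection.add(ids=list(chapters.keys()), documents=list(chapters.values()))

def query_collection(collection, query: str, n_results: int = 3):
    # Returns the n_results most similar documents, with ids and distances.
    return collection.query(query_texts=[query], n_results=n_results)
```

With the chapters indexed under chapter ids, such a query returns id lists like the `[['intro', 'input', 'structured_output']]` shown in the cell output above.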
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print([['intro', 'input', 'structured_output']])"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -1920,7 +1919,7 @@
"\n",
"Behind the scenes, ChromaDB is using the model `all-MiniLM-L6-v2` by default [^chroma_embeddings] to create embeddings for the input documents and the query (see {numref}`embedding`). This model is available in `sentence_transformers` {cite}`sentencetransformers2024website`. Let's see how it works.\n",
"\n",
"```{figure} ../_static/input/embedding.svg\n",
"```{figure} ../_static/input/embedding.png\n",
"---\n",
"name: embedding\n",
"alt: Embedding\n",
@@ -2860,7 +2859,7 @@
"outputs": [],
"source": [
"# Save the generated report to a local file\n",
"with open('data/apple_report.txt', 'w') as file:\n",
"with open('data/apple_report.md', 'w') as file:\n",
" file.write(report)\n"
]
},
@@ -2926,7 +2925,7 @@
],
"source": [
"# Read and display the generated report\n",
"with open('../data/apple_report.txt', 'r') as file:\n",
"with open('../data/apple_report.md', 'r') as file:\n",
" report_content = file.read()\n",
" \n",
"from IPython.display import Markdown\n",
@@ -2985,7 +2984,7 @@
"source": [
"### Case Study II: Quiz Generation with Citations\n",
"\n",
"In this case study, we will build a Quiz generator with citations that explores additional input management techniques particularly useful with long context windows. The implementation includes prompt caching for efficiency and citation tracking to enhance accuracy and verifiability. We will use Gemini 1.5 Pro as our LLM model, which has a context window of 2M tokens.\n",
"This case study is motivated by the rise of long-context (LC) models. Readers are encouraged to consider leveraging long-context windows, when suitable for their application requirements, instead of defaulting to a RAG-based approach, given the RAG limitations and trade-offs relative to LC models discussed in previous sections.\n",
"\n",
"In this case study, we will build a quiz generator with citations that explores additional input management techniques that are particularly useful with long context windows. The implementation includes prompt caching for efficiency and citation tracking to enhance accuracy and verifiability. We will use Gemini 1.5 Pro (experimental) as our LLM, which has a context window of 2M tokens.\n",
"\n",
"#### Use Case\n",
"\n",
Expand Down
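As a rough illustration of the prompt-caching idea mentioned above, here is a sketch against the `google-generativeai` SDK's context-caching interface; the model version, file path, TTL, and prompts are assumptions, not the case study's actual code:

```python
# Hedged sketch: cache a long shared context once, then reuse it across calls.
# Model name, path, and prompts are illustrative assumptions.
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

# Long source material for the quiz; must meet the API's minimum token count for caching.
book_text = open("data/quiz_source.txt").read()

cache = caching.CachedContent.create(
    model="models/gemini-1.5-pro-001",
    system_instruction="You generate quizzes with citations from the provided text.",
    contents=[book_text],
    ttl=datetime.timedelta(minutes=30),
)

model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content(
    "Create a five-question quiz. For each answer, cite the passage it comes from."
)
print(response.text)
```

The long document is tokenized once when the cache is created; subsequent quiz prompts reference the cache instead of resending the full context.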