update reranking doc with local nim instructions

langchain-ai · Apr 30, 2024 · 7905ef4
1 parent c23e707
Showing 1 changed file with 52 additions and 8 deletions.
diff --git a/libs/ai-endpoints/docs/retrievers/nvidia_rerank.ipynb b/libs/ai-endpoints/docs/retrievers/nvidia_rerank.ipynb
@@ -17,7 +17,16 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Combining results from multiple sources\n",
+    "## Working with NVIDIA NIMs\n",
+    "\n",
+    "[ai.nvidia.com](http://ai.nvidia.com) hosts a variety of AI models that are accessible with an API key and the `langchain-nvidia-ai-endpoints` library. The use cases below use these hosted models by default."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Combining results from multiple sources\n",
     "\n",
     "Consider a pipeline with data from a semantic store, such as FAISS, as well as a BM25 store.\n",
     "\n",
@@ -52,7 +61,7 @@
     }
    },
    "source": [
-    "### BM25 relevant documents\n",
+    "#### BM25 relevant documents\n",
     "\n",
     "Below we assume you have ElasticSearch running with documents stored in a `langchain-index` store."
    ]
@@ -106,7 +115,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Semantic documents\n",
+    "#### Semantic documents\n",
     "\n",
     "Below we assume you have a saved FAISS index."
    ]
@@ -137,7 +146,7 @@
     "from langchain_community.vectorstores import FAISS\n",
     "from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings\n",
     "\n",
-    "embeddings = NVIDIAEmbeddings()\n",
+    "embedder = NVIDIAEmbeddings()\n",
     "\n",
     "# De-serialization relies on loading a pickle file.\n",
     "# Pickle files can be modified to deliver a malicious payload that\n",
@@ -167,7 +176,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Combine and rank documents\n",
+    "#### Combine and rank documents\n",
     "\n",
     "The resulting `docs` will be ordered by their relevance to the query."
    ]
@@ -195,7 +204,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Enhancing accuracy for single data sources\n",
+    "### Enhancing accuracy for single data sources\n",
     "\n",
     "Semantic search with vector embeddings is an efficient way to turn a large corpus of documents into a smaller corpus of relevant documents. This is done by trading accuracy for efficiency. Reranking as a tool adds accuracy back into the search by post-processing the smaller corpus of documents. Typically, ranking on the full corpus is too slow for applications."
    ]
@@ -236,16 +245,51 @@
     "from langchain.vectorstores.pgvector import PGVector\n",
     "\n",
     "ranker = NVIDIARerank(top_n=10)\n",
-    "embeddings = NVIDIAEmbeddings()\n",
+    "embedder = NVIDIAEmbeddings()\n",
     "\n",
-    "store = PGVector(embeddings=embeddings,\n",
+    "store = PGVector(embeddings=embedder,\n",
     "                 collection_name=\"langchain-index\",\n",
     "                 connection=\"postgresql+psycopg://langchain:langchain@localhost:6024/langchain\")\n",
     "\n",
     "subset_docs = store.similarity_search(query, k=1_000)\n",
     "\n",
     "docs = ranker.compress_documents(query=query, documents=subset_docs)"
    ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Working with a local NIM\n",
+    "\n",
+    "[Learn more about NIMs](https://developer.nvidia.com/blog/nvidia-nim-offers-optimized-inference-microservices-for-deploying-ai-models-at-scale/)\n",
+    "\n",
+    "The `NVIDIAEmbeddings` and `NVIDIARerank` classes can connect to a locally running NIM by switching their `mode`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "vscode": {
+     "languageId": "python"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "# connect to an embedding NIM running at localhost:2016\n",
+    "embedder = NVIDIAEmbeddings().mode(\"nim\", base_url=\"http://localhost:2016/v1\")\n",
+    "\n",
+    "# connect to a reranking NIM running at localhost:1976\n",
+    "ranker = NVIDIARerank().mode(\"nim\", base_url=\"http://localhost:1976/v1\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can rerun the examples above with this new `embedder` and `ranker`."
+   ]
   }
  ],
  "metadata": {