diff --git a/cookbook/README.md b/cookbook/README.md index e147213125df4..a951857907013 100644 --- a/cookbook/README.md +++ b/cookbook/README.md @@ -36,6 +36,7 @@ Notebook | Description [llm_symbolic_math.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_symbolic_math.ipynb) | Solve algebraic equations with the help of llms (language learning models) and sympy, a python library for symbolic mathematics. [meta_prompt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/meta_prompt.ipynb) | Implement the meta-prompt concept, which is a method for building self-improving agents that reflect on their own performance and modify their instructions accordingly. [multi_modal_output_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_output_agent.ipynb) | Generate multi-modal outputs, specifically images and text. +[multi_modal_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_RAG_vdms.ipynb) | Perform retrieval-augmented generation (rag) on documents including text and images, using unstructured for parsing, Intel's Visual Data Management System (VDMS) as the vectorstore, and chains. [multi_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_player_dnd.ipynb) | Simulate multi-player dungeons & dragons games, with a custom function determining the speaking schedule of the agents. [multiagent_authoritarian.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_authoritarian.ipynb) | Implement a multi-agent simulation where a privileged agent controls the conversation, including deciding who speaks and when the conversation ends, in the context of a simulated news network. [multiagent_bidding.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_bidding.ipynb) | Implement a multi-agent simulation where agents bid to speak, with the highest bidder speaking next, demonstrated through a fictitious presidential debate example. diff --git a/cookbook/multi_modal_RAG_vdms.ipynb b/cookbook/multi_modal_RAG_vdms.ipynb index 49fcce642cf1a..20a19810cf286 100644 --- a/cookbook/multi_modal_RAG_vdms.ipynb +++ b/cookbook/multi_modal_RAG_vdms.ipynb @@ -18,26 +18,7 @@ "* Use of multimodal embeddings (such as [CLIP](https://openai.com/research/clip)) to embed images and text\n", "* Use of [VDMS](https://github.com/IntelLabs/vdms/blob/master/README.md) as a vector store with support for multi-modal\n", "* Retrieval of both images and text using similarity search\n", - "* Passing raw images and text chunks to a multimodal LLM for answer synthesis \n", - "\n", - "\n", - "## Packages\n", - "\n", - "For `unstructured`, you will also need `poppler` ([installation instructions](https://pdf2image.readthedocs.io/en/latest/installation.html)) and `tesseract` ([installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html)) in your system." - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "id": "febbc459-ebba-4c1a-a52b-fed7731593f8", - "metadata": {}, - "outputs": [], - "source": [ - "# (newest versions required for multi-modal)\n", - "! pip install --quiet -U vdms langchain-experimental\n", - "\n", - "# lock to 0.10.19 due to a persistent bug in more recent versions\n", - "! 
pip install --quiet pdf2image \"unstructured[all-docs]==0.10.19\" pillow pydantic lxml open_clip_torch" + "* Passing raw images and text chunks to a multimodal LLM for answer synthesis " ] }, { @@ -53,7 +34,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 1, "id": "5f483872", "metadata": {}, "outputs": [ @@ -61,8 +42,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "docker: Error response from daemon: Conflict. The container name \"/vdms_rag_nb\" is already in use by container \"0c19ed281463ac10d7efe07eb815643e3e534ddf24844357039453ad2b0c27e8\". You have to remove (or rename) that container to be able to reuse that name.\n", - "See 'docker run --help'.\n" + "a1b9206b08ef626e15b356bf9e031171f7c7eb8f956a2733f196f0109246fe2b\n" ] } ], @@ -75,9 +55,32 @@ "vdms_client = VDMS_Client(port=55559)" ] }, + { + "cell_type": "markdown", + "id": "2498a0a1", + "metadata": {}, + "source": [ + "## Packages\n", + "\n", + "For `unstructured`, you will also need `poppler` ([installation instructions](https://pdf2image.readthedocs.io/en/latest/installation.html)) and `tesseract` ([installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html)) in your system." + ] + }, { "cell_type": "code", - "execution_count": null, + "execution_count": 2, + "id": "febbc459-ebba-4c1a-a52b-fed7731593f8", + "metadata": {}, + "outputs": [], + "source": [ + "! pip install --quiet -U vdms langchain-experimental\n", + "\n", + "# lock to 0.10.19 due to a persistent bug in more recent versions\n", + "! pip install --quiet pdf2image \"unstructured[all-docs]==0.10.19\" pillow pydantic lxml open_clip_torch" + ] + }, + { + "cell_type": "code", + "execution_count": 3, "id": "78ac6543", "metadata": {}, "outputs": [], @@ -95,14 +98,9 @@ "\n", "### Partition PDF text and images\n", " \n", - "Let's look at an example pdf containing interesting images.\n", - "\n", - "Famous photographs from library of congress:\n", + "Let's use famous photographs from the PDF version of Library of Congress Magazine in this example.\n", "\n", - "* https://www.loc.gov/lcm/pdf/LCM_2020_1112.pdf\n", - "* We'll use this as an example below\n", - "\n", - "We can use `partition_pdf` below from [Unstructured](https://unstructured-io.github.io/unstructured/introduction.html#key-concepts) to extract text and images." + "We can use `partition_pdf` from [Unstructured](https://unstructured-io.github.io/unstructured/introduction.html#key-concepts) to extract text and images." ] }, { @@ -116,8 +114,8 @@ "\n", "import requests\n", "\n", - "# Folder with pdf and extracted images\n", - "datapath = Path(\"./multimodal_files\").resolve()\n", + "# Folder to store pdf and extracted images\n", + "datapath = Path(\"./data/multimodal_files\").resolve()\n", "datapath.mkdir(parents=True, exist_ok=True)\n", "\n", "pdf_url = \"https://www.loc.gov/lcm/pdf/LCM_2020_1112.pdf\"\n", @@ -174,14 +172,8 @@ "source": [ "## Multi-modal embeddings with our document\n", "\n", - "We will use [OpenClip multimodal embeddings](https://python.langchain.com/docs/integrations/text_embedding/open_clip).\n", - "\n", - "We use a larger model for better performance (set in `langchain_experimental.open_clip.py`).\n", - "\n", - "```\n", - "model_name = \"ViT-g-14\"\n", - "checkpoint = \"laion2b_s34b_b88k\"\n", - "```" + "In this section, we initialize the VDMS vector store for both text and images. 
For better performance, we use model `ViT-g-14` from [OpenClip multimodal embeddings](https://python.langchain.com/docs/integrations/text_embedding/open_clip).\n", + "The images are stored as base64 encoded strings with `vectorstore.add_images`.\n" ] }, { @@ -200,9 +192,7 @@ "vectorstore = VDMS(\n", " client=vdms_client,\n", " collection_name=\"mm_rag_clip_photos\",\n", - " embedding_function=OpenCLIPEmbeddings(\n", - " model_name=\"ViT-g-14\", checkpoint=\"laion2b_s34b_b88k\"\n", - " ),\n", + " embedding=OpenCLIPEmbeddings(model_name=\"ViT-g-14\", checkpoint=\"laion2b_s34b_b88k\"),\n", ")\n", "\n", "# Get image URIs with .jpg extension only\n", @@ -233,7 +223,7 @@ "source": [ "## RAG\n", "\n", - "`vectorstore.add_images` will store / retrieve images as base64 encoded strings." + "Here we define helper functions for image results." ] }, { @@ -392,7 +382,8 @@ "id": "1566096d-97c2-4ddc-ba4a-6ef88c525e4e", "metadata": {}, "source": [ - "## Test retrieval and run RAG" + "## Test retrieval and run RAG\n", + "Now let's query for a `woman with children` and retrieve the top results." ] }, { @@ -452,6 +443,14 @@ " print(doc.page_content)" ] }, + { + "cell_type": "markdown", + "id": "15e9b54d", + "metadata": {}, + "source": [ + "Now let's use the `multi_modal_rag_chain` to process the same query and display the response." + ] + }, { "cell_type": "code", "execution_count": 11, @@ -462,10 +461,10 @@ "name": "stdout", "output_type": "stream", "text": [ - "1. Detailed description of the visual elements in the image: The image features a woman with children, likely a mother and her family, standing together outside. They appear to be poor or struggling financially, as indicated by their attire and surroundings.\n", - "2. Historical and cultural context of the image: The photo was taken in 1936 during the Great Depression, when many families struggled to make ends meet. Dorothea Lange, a renowned American photographer, took this iconic photograph that became an emblem of poverty and hardship experienced by many Americans at that time.\n", - "3. Interpretation of the image's symbolism and meaning: The image conveys a sense of unity and resilience despite adversity. The woman and her children are standing together, displaying their strength as a family unit in the face of economic challenges. The photograph also serves as a reminder of the importance of empathy and support for those who are struggling.\n", - "4. Connections between the image and the related text: The text provided offers additional context about the woman in the photo, her background, and her feelings towards the photograph. It highlights the historical backdrop of the Great Depression and emphasizes the significance of this particular image as a representation of that time period.\n" + " The image depicts a woman with several children. The woman appears to be of Cherokee heritage, as suggested by the text provided. The image is described as having been initially regretted by the subject, Florence Owens Thompson, due to her feeling that it did not accurately represent her leadership qualities.\n", + "The historical and cultural context of the image is tied to the Great Depression and the Dust Bowl, both of which affected the Cherokee people in Oklahoma. 
The photograph was taken during this period, and its subject, Florence Owens Thompson, was a leader within her community who worked tirelessly to help those affected by these crises.\n", + "The image's symbolism and meaning can be interpreted as a representation of resilience and strength in the face of adversity. The woman is depicted with multiple children, which could signify her role as a caregiver and protector during difficult times.\n", + "Connections between the image and the related text include Florence Owens Thompson's leadership qualities and her regretted feelings about the photograph. Additionally, the mention of Dorothea Lange, the photographer who took this photo, ties the image to its historical context and the broader narrative of the Great Depression and Dust Bowl in Oklahoma. \n" ] } ], @@ -492,14 +491,6 @@ "source": [ "! docker kill vdms_rag_nb" ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8ba652da", - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { @@ -518,7 +509,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.13" + "version": "3.11.9" } }, "nbformat": 4, diff --git a/docs/docs/concepts.mdx b/docs/docs/concepts.mdx index ee213c19796e2..da8687508560d 100644 --- a/docs/docs/concepts.mdx +++ b/docs/docs/concepts.mdx @@ -165,7 +165,7 @@ Some important things to note: ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the API reference for that model. :::important -**Tool Calling** Some chat models have been fine-tuned for tool calling and provide a dedicated API for tool calling. +Some chat models have been fine-tuned for **tool calling** and provide a dedicated API for it. Generally, such models are better at tool calling than non-fine-tuned models, and are recommended for use cases that require tool calling. Please see the [tool calling section](/docs/concepts/#functiontool-calling) for more information. ::: @@ -255,7 +255,7 @@ This represents the result of a tool call. In addition to `role` and `content`, #### (Legacy) FunctionMessage -This is a legacy message type, corresponding to OpenAI's legacy function-calling API. ToolMessage should be used instead to correspond to the updated tool-calling API. +This is a legacy message type, corresponding to OpenAI's legacy function-calling API. `ToolMessage` should be used instead to correspond to the updated tool-calling API. This represents the result of a function call. In addition to `role` and `content`, this message has a `name` parameter which conveys the name of the function that was called to produce this result. @@ -826,6 +826,61 @@ units (like words or subwords) that carry meaning, rather than individual charac to learn and understand the structure of the language, including grammar and context. Furthermore, using tokens can also improve efficiency, since the model processes fewer units of text compared to character-level processing. +### Function/tool calling + +:::info +We use the term tool calling interchangeably with function calling. Although +function calling is sometimes meant to refer to invocations of a single function, +we treat all models as though they can return multiple tool or function calls in +each message. +::: + +Tool calling allows a [chat model](/docs/concepts/#chat-models) to respond to a given prompt by generating output that +matches a user-defined schema. 
+ +While the name implies that the model is performing +some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user. +One common example where you **wouldn't** want to call a function with the generated arguments +is if you want to [extract structured output matching some schema](/docs/concepts/#structured-output) +from unstructured text. You would give the model an "extraction" tool that takes +parameters matching the desired schema, then treat the generated output as your final +result. + +![Diagram of a tool call by a chat model](/img/tool_call.png) + +Tool calling is not universal, but is supported by many popular LLM providers, including [Anthropic](/docs/integrations/chat/anthropic/), +[Cohere](/docs/integrations/chat/cohere/), [Google](/docs/integrations/chat/google_vertex_ai_palm/), +[Mistral](/docs/integrations/chat/mistralai/), [OpenAI](/docs/integrations/chat/openai/), and even for locally-running models via [Ollama](/docs/integrations/chat/ollama/). + +LangChain provides a standardized interface for tool calling that is consistent across different models. + +The standard interface consists of: + +* `ChatModel.bind_tools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools) as well as [Pydantic](https://pydantic.dev/) objects. +* `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model. + +#### Tool usage + +After the model calls tools, you can use the tool by invoking it, then passing the results back to the model. +LangChain provides the [`Tool`](/docs/concepts/#tools) abstraction to help you handle this. + +The general flow is this: + +1. Generate tool calls with a chat model in response to a query. +2. Invoke the appropriate tools using the generated tool call as arguments. +3. Format the result of the tool invocations as [`ToolMessages`](/docs/concepts/#toolmessage). +4. Pass the entire list of messages back to the model so that it can generate a final answer (or call more tools). + +![Diagram of a complete tool calling flow](/img/tool_calling_flow.png) + +This is how tool calling [agents](/docs/concepts/#agents) perform tasks and answer queries. + +Check out some more focused guides below: + +- [How to use chat models to call tools](/docs/how_to/tool_calling/) +- [How to pass tool outputs to chat models](/docs/how_to/tool_results_pass_to_model/) +- [Building an agent with LangGraph](https://langchain-ai.github.io/langgraph/tutorials/introduction/) + ### Structured output LLMs are capable of generating arbitrary text. This enables the model to respond appropriately to a wide @@ -958,48 +1013,48 @@ chain.invoke({ "question": "What is the powerhouse of the cell?" }) ``` For a full list of model providers that support JSON mode, see [this table](/docs/integrations/chat/#advanced-features). -#### Function/tool calling +#### Tool calling {#structured-output-tool-calling} -:::info -We use the term tool calling interchangeably with function calling. Although -function calling is sometimes meant to refer to invocations of a single function, -we treat all models as though they can return multiple tool or function calls in -each message -::: +For models that support it, [tool calling](/docs/concepts/#functiontool-calling) can be very convenient for structured output. 
It removes the +guesswork around how best to prompt schemas in favor of a built-in model feature. -Tool calling allows a model to respond to a given prompt by generating output that -matches a user-defined schema. While the name implies that the model is performing -some action, this is actually not the case! The model is coming up with the -arguments to a tool, and actually running the tool (or not) is up to the user - -for example, if you want to [extract output matching some schema](/docs/tutorials/extraction) -from unstructured text, you could give the model an "extraction" tool that takes -parameters matching the desired schema, then treat the generated output as your final -result. +It works by first binding the desired schema either directly or via a [LangChain tool](/docs/concepts/#tools) to a +[chat model](/docs/concepts/#chat-models) using the `.bind_tools()` method. The model will then generate an `AIMessage` containing +a `tool_calls` field containing `args` that match the desired shape. -For models that support it, tool calling can be very convenient. It removes the -guesswork around how best to prompt schemas in favor of a built-in model feature. It can also -more naturally support agentic flows, since you can just pass multiple tool schemas instead -of fiddling with enums or unions. - -Many LLM providers, including [Anthropic](https://www.anthropic.com/), -[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai), -[Mistral](https://mistral.ai/), [OpenAI](https://openai.com/), and others, -support variants of a tool calling feature. These features typically allow requests -to the LLM to include available tools and their schemas, and for responses to include -calls to these tools. For instance, given a search engine tool, an LLM might handle a -query by first issuing a call to the search engine. The system calling the LLM can -receive the tool call, execute it, and return the output to the LLM to inform its -response. LangChain includes a suite of [built-in tools](/docs/integrations/tools/) -and supports several methods for defining your own [custom tools](/docs/how_to/custom_tools). +There are several acceptable formats you can use to bind tools to a model in LangChain. Here's one example: -LangChain provides a standardized interface for tool calling that is consistent across different models. +```python +from langchain_core.pydantic_v1 import BaseModel, Field +from langchain_openai import ChatOpenAI -The standard interface consists of: +class ResponseFormatter(BaseModel): + """Always use this tool to structure your response to the user.""" -* `ChatModel.bind_tools()`: a method for specifying which tools are available for a model to call. This method accepts [LangChain tools](/docs/concepts/#tools) here. -* `AIMessage.tool_calls`: an attribute on the `AIMessage` returned from the model for accessing the tool calls requested by the model. + answer: str = Field(description="The answer to the user's question") + followup_question: str = Field(description="A followup question the user could ask") + +model = ChatOpenAI( + model="gpt-4o", + temperature=0, +) + +model_with_tools = model.bind_tools([ResponseFormatter]) + +ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?") + +ai_msg.tool_calls[0]["args"] +``` + +``` +{'answer': "The powerhouse of the cell is the mitochondrion. 
It generates most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", + 'followup_question': 'How do mitochondria generate ATP?'} +``` + +Tool calling is a generally consistent way to get a model to generate structured output, and is the default technique +used for the [`.with_structured_output()`](/docs/concepts/#with_structured_output) method when a model supports it. -The following how-to guides are good practical resources for using function/tool calling: +The following how-to guides are good practical resources for using function/tool calling for structured output: - [How to return structured data from an LLM](/docs/how_to/structured_output/) - [How to use a model to call tools](/docs/how_to/tool_calling) diff --git a/docs/docs/how_to/tool_calling.ipynb b/docs/docs/how_to/tool_calling.ipynb index c26e70ade3242..06e375967c1c2 100644 --- a/docs/docs/how_to/tool_calling.ipynb +++ b/docs/docs/how_to/tool_calling.ipynb @@ -22,57 +22,36 @@ ":::info Prerequisites\n", "\n", "This guide assumes familiarity with the following concepts:\n", + "\n", "- [Chat models](/docs/concepts/#chat-models)\n", "- [LangChain Tools](/docs/concepts/#tools)\n", + "- [Tool calling](/docs/concepts/#functiontool-calling)\n", "- [Output parsers](/docs/concepts/#output-parsers)\n", "\n", ":::\n", "\n", - ":::info Tool calling vs function calling\n", - "\n", - "We use the term tool calling interchangeably with function calling. Although\n", - "function calling is sometimes meant to refer to invocations of a single function,\n", - "we treat all models as though they can return multiple tool or function calls in \n", - "each message.\n", + "[Tool calling](/docs/concepts/#functiontool-calling) allows a chat model to respond to a given prompt by \"calling a tool\".\n", "\n", - ":::\n", + "Remember, while the name \"tool calling\" implies that the model is directly performing some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user.\n", "\n", - ":::info Supported models\n", + "Tool calling is a general technique that generates structured output from a model, and you can use it even when you don't intend to invoke any tools. An example use-case of that is [extraction from unstructured text](/docs/tutorials/extraction/).\n", "\n", - "You can find a [list of all models that support tool calling](/docs/integrations/chat/).\n", + "![Diagram of calling a tool](/img/tool_call.png)\n", "\n", - ":::\n", + "If you want to see how to use the model-generated tool call to actually run a tool function [check out this guide](/docs/how_to/tool_results_pass_to_model/).\n", "\n", - "Tool calling allows a chat model to respond to a given prompt by \"calling a tool\".\n", - "While the name implies that the model is performing \n", - "some action, this is actually not the case! 
The model generates the \n", - "arguments to a tool, and actually running the tool (or not) is up to the user.\n", - "For example, if you want to [extract output matching some schema](/docs/how_to/structured_output/) \n", - "from unstructured text, you could give the model an \"extraction\" tool that takes \n", - "parameters matching the desired schema, then treat the generated output as your final \n", - "result.\n", + ":::note Supported models\n", "\n", - ":::note\n", + "Tool calling is not universal, but is supported by many popular LLM providers, including [Anthropic](/docs/integrations/chat/anthropic/), \n", + "[Cohere](/docs/integrations/chat/cohere/), [Google](/docs/integrations/chat/google_vertex_ai_palm/), \n", + "[Mistral](/docs/integrations/chat/mistralai/), [OpenAI](/docs/integrations/chat/openai/), and even for locally-running models via [Ollama](/docs/integrations/chat/ollama/).\n", "\n", - "If you only need formatted values, try the [.with_structured_output()](/docs/how_to/structured_output/#the-with_structured_output-method) chat model method as a simpler entrypoint.\n", + "You can find a [list of all models that support tool calling here](/docs/integrations/chat/).\n", "\n", ":::\n", "\n", - "However, tool calling goes beyond [structured output](/docs/how_to/structured_output/)\n", - "since you can pass responses from called tools back to the model to create longer interactions.\n", - "For instance, given a search engine tool, an LLM might handle a \n", - "query by first issuing a call to the search engine with arguments. The system calling the LLM can \n", - "receive the tool call, execute it, and return the output to the LLM to inform its \n", - "response. LangChain includes a suite of [built-in tools](/docs/integrations/tools/) \n", - "and supports several methods for defining your own [custom tools](/docs/how_to/custom_tools). \n", - "\n", - "Tool calling is not universal, but many popular LLM providers, including [Anthropic](https://www.anthropic.com/), \n", - "[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai), \n", - "[Mistral](https://mistral.ai/), [OpenAI](https://openai.com/), and others, \n", - "support variants of a tool calling feature.\n", - "\n", - "LangChain implements standard interfaces for defining tools, passing them to LLMs, \n", - "and representing tool calls. This guide and the other How-to pages in the Tool section will show you how to use tools with LangChain." + "LangChain implements standard interfaces for defining tools, passing them to LLMs, and representing tool calls.\n", + "This guide will cover how to bind tools to an LLM, then invoke the LLM to generate these arguments." ] }, { @@ -91,7 +70,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -112,14 +91,14 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "LangChain also implements a `@tool` decorator that allows for further control of the tool schema, such as tool names and argument descriptions. See the how-to guide [here](/docs/how_to/custom_tools/#creating-tools-from-functions) for detail.\n", + "LangChain also implements a `@tool` decorator that allows for further control of the tool schema, such as tool names and argument descriptions. 
See the how-to guide [here](/docs/how_to/custom_tools/#creating-tools-from-functions) for details.\n", "\n", - "We can also define the schema using [Pydantic](https://docs.pydantic.dev):" + "We can also define the schemas without the accompanying functions using [Pydantic](https://docs.pydantic.dev):" ] }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from langchain_core.pydantic_v1 import BaseModel, Field\n", "\n", @@ -149,7 +128,8 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We can bind them to chat models as follows:\n", + "To actually bind those schemas to a chat model, we'll use the `.bind_tools()` method. This handles converting\n", + "the `Add` and `Multiply` schemas to the proper format for the model. The tool schema will then be passed in each time the model is invoked.\n", "\n", "```{=mdx}\n", "import ChatModelTabs from \"@theme/ChatModelTabs\";\n", @@ -158,11 +138,7 @@ " customVarName=\"llm\"\n", " fireworksParams={`model=\"accounts/fireworks/models/firefunction-v1\", temperature=0`}\n", "/>\n", - "```\n", - "\n", - "We'll use the `.bind_tools()` method to handle converting\n", - "`Multiply` to the proper format for the model, then and bind it (i.e.,\n", - "passing it in each time the model is invoked)." + "```" ] }, { @@ -183,7 +159,7 @@ "\n", "os.environ[\"OPENAI_API_KEY\"] = getpass()\n", "\n", - "llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)" + "llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)" ] }, { @@ -194,7 +170,7 @@ { "data": { "text/plain": [ - "AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_g4RuAijtDcSeM96jXyCuiLSN', 'function': {'arguments': '{\"a\":3,\"b\":12}', 'name': 'Multiply'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 95, 'total_tokens': 113}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-5157d15a-7e0e-4ab1-af48-3d98010cd152-0', tool_calls=[{'name': 'Multiply', 'args': {'a': 3, 'b': 12}, 'id': 'call_g4RuAijtDcSeM96jXyCuiLSN'}], usage_metadata={'input_tokens': 95, 'output_tokens': 18, 'total_tokens': 113})" + "AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_wLTBasMppAwpdiA5CD92l9x7', 'function': {'arguments': '{\"a\":3,\"b\":12}', 'name': 'Multiply'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 89, 'total_tokens': 107}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0f03d4f0ee', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-d3f36cca-f225-416f-ac16-0217046f0b38-0', tool_calls=[{'name': 'Multiply', 'args': {'a': 3, 'b': 12}, 'id': 'call_wLTBasMppAwpdiA5CD92l9x7', 'type': 'tool_call'}], usage_metadata={'input_tokens': 89, 'output_tokens': 18, 'total_tokens': 107})" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], @@ -214,7 +190,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As we can see, even though the prompt didn't really suggest a tool call, our LLM made one since it was forced to do so. You can look at the docs for [bind_tools()](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.BaseChatOpenAI.html#langchain_openai.chat_models.base.BaseChatOpenAI.bind_tools) to learn about all the ways to customize how your LLM selects tools." + "As we can see our LLM generated arguments to a tool! 
You can look at the docs for [bind_tools()](https://api.python.langchain.com/en/latest/chat_models/langchain_openai.chat_models.base.BaseChatOpenAI.html#langchain_openai.chat_models.base.BaseChatOpenAI.bind_tools) to learn about all the ways to customize how your LLM selects tools, as well as [this guide on how to force the LLM to call a tool](/docs/how_to/tool_choice/) rather than letting it decide." ] }, { @@ -246,10 +222,12 @@ "text/plain": [ "[{'name': 'Multiply',\n", " 'args': {'a': 3, 'b': 12},\n", - " 'id': 'call_TnadLbWJu9HwDULRb51RNSMw'},\n", + " 'id': 'call_uqJsNrDJ8ZZnFa1BHHYAllEv',\n", + " 'type': 'tool_call'},\n", " {'name': 'Add',\n", " 'args': {'a': 11, 'b': 49},\n", - " 'id': 'call_Q9vt1up05sOQScXvUYWzSpCg'}]" + " 'id': 'call_ud1uHAaYsdpWuxugwoJ63BDs',\n", + " 'type': 'tool_call'}]" ] }, "execution_count": 5, @@ -308,17 +286,17 @@ "source": [ "## Next steps\n", "\n", - "Now you've learned how to bind tool schemas to a chat model and to call those tools. Next, you can learn more about how to use tools:\n", + "Now you've learned how to bind tool schemas to a chat model and have the model call the tool.\n", + "\n", + "Next, check out this guide on actually using the tool by invoking the function and passing the results back to the model:\n", "\n", - "- Few shot promting [with tools](/docs/how_to/tools_few_shot/)\n", - "- Stream [tool calls](/docs/how_to/tool_streaming/)\n", - "- Bind [model-specific tools](/docs/how_to/tools_model_specific/)\n", - "- Pass [runtime values to tools](/docs/how_to/tool_runtime)\n", "- Pass [tool results back to model](/docs/how_to/tool_results_pass_to_model)\n", "\n", "You can also check out some more specific uses of tool calling:\n", "\n", - "- Building [tool-using chains and agents](/docs/how_to#tools)\n", + "- Few shot prompting [with tools](/docs/how_to/tools_few_shot/)\n", + "- Stream [tool calls](/docs/how_to/tool_streaming/)\n", + "- Pass [runtime values to tools](/docs/how_to/tool_runtime)\n", "- Getting [structured outputs](/docs/how_to/structured_output/) from models" ] } @@ -339,7 +317,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.4" + "version": "3.10.5" } }, "nbformat": 4, diff --git a/docs/docs/how_to/tool_results_pass_to_model.ipynb b/docs/docs/how_to/tool_results_pass_to_model.ipynb index 07e116b7cd12c..ac17ae7749436 100644 --- a/docs/docs/how_to/tool_results_pass_to_model.ipynb +++ b/docs/docs/how_to/tool_results_pass_to_model.ipynb @@ -9,12 +9,34 @@ ":::info Prerequisites\n", "This guide assumes familiarity with the following concepts:\n", "\n", - "- [Tools](/docs/concepts/#tools)\n", + "- [LangChain Tools](/docs/concepts/#tools)\n", "- [Function/tool calling](/docs/concepts/#functiontool-calling)\n", + "- [Using chat models to call tools](/docs/how_to/tool_calling)\n", + "- [Defining custom tools](/docs/how_to/custom_tools/)\n", "\n", ":::\n", "\n", - "If we're using the model-generated tool invocations to actually call tools and want to pass the tool results back to the model, we can do so using `ToolMessage`s and `ToolCall`s. First, let's define our tools and our model." + "Some models are capable of [**tool calling**](/docs/concepts/#functiontool-calling) - generating arguments that conform to a specific user-provided schema. 
This guide will demonstrate how to use those tool calls to actually call a function and properly pass the results back to the model.\n", + "\n", + "![Diagram of a tool call invocation](/img/tool_invocation.png)\n", + "\n", + "![Diagram of a tool call result](/img/tool_results.png)\n", + "\n", + "First, let's define our tools and our model:" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```{=mdx}\n", + "import ChatModelTabs from \"@theme/ChatModelTabs\";\n", + "\n", + "\n", + "```" ] }, { "execution_count": 1, "metadata": {}, "outputs": [], + "source": [ + "# | output: false\n", + "# | echo: false\n", + "\n", + "import os\n", + "from getpass import getpass\n", + "\n", + "from langchain_openai import ChatOpenAI\n", + "\n", + "os.environ[\"OPENAI_API_KEY\"] = getpass()\n", + "\n", + "llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0)" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], "source": [ "from langchain_core.tools import tool\n", "\n", @@ -38,85 +79,109 @@ " return a * b\n", "\n", "\n", - "tools = [add, multiply]" + "tools = [add, multiply]\n", + "\n", + "llm_with_tools = llm.bind_tools(tools)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, let's get the model to call a tool. We'll add it to a list of messages that we'll treat as conversation history:" ] }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 6, "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': 'call_GPGPE943GORirhIAYnWv00rK', 'type': 'tool_call'}, {'name': 'add', 'args': {'a': 11, 'b': 49}, 'id': 'call_dm8o64ZrY3WFZHAvCh1bEJ6i', 'type': 'tool_call'}]\n" + ] + } + ], "source": [ - "import os\n", - "from getpass import getpass\n", + "from langchain_core.messages import HumanMessage\n", "\n", - "from langchain_openai import ChatOpenAI\n", + "query = \"What is 3 * 12? Also, what is 11 + 49?\"\n", "\n", - "os.environ[\"OPENAI_API_KEY\"] = getpass()\n", + "messages = [HumanMessage(query)]\n", "\n", - "llm = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0)\n", - "llm_with_tools = llm.bind_tools(tools)" + "ai_msg = llm_with_tools.invoke(messages)\n", + "\n", + "print(ai_msg.tool_calls)\n", + "\n", + "messages.append(ai_msg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "The nice thing about Tools is that if we invoke them with a ToolCall, we'll automatically get back a ToolMessage that can be fed back to the model: \n", + "Next let's invoke the tool functions using the args the model populated!\n", "\n", - ":::info Requires ``langchain-core >= 0.2.19``\n", + "Conveniently, if we invoke a LangChain `Tool` with a `ToolCall`, we'll automatically get back a `ToolMessage` that can be fed back to the model:\n", "\n", - "This functionality was added in ``langchain-core == 0.2.19``. Please make sure your package is up to date.\n", + ":::caution Compatibility\n", + "\n", + "This functionality was added in `langchain-core == 0.2.19`. Please make sure your package is up to date.\n", + "\n", + "If you are on earlier versions of `langchain-core`, you will need to extract the `args` field from the tool and construct a `ToolMessage` manually.\n", "\n", ":::" ] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[HumanMessage(content='What is 3 * 12? 
Also, what is 11 + 49?'),\n", - " AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Smg3NHJNxrKfAmd4f9GkaYn3', 'function': {'arguments': '{\"a\": 3, \"b\": 12}', 'name': 'multiply'}, 'type': 'function'}, {'id': 'call_55K1C0DmH6U5qh810gW34xZ0', 'function': {'arguments': '{\"a\": 11, \"b\": 49}', 'name': 'add'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 49, 'prompt_tokens': 88, 'total_tokens': 137}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-56657feb-96dd-456c-ab8e-1857eab2ade0-0', tool_calls=[{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': 'call_Smg3NHJNxrKfAmd4f9GkaYn3', 'type': 'tool_call'}, {'name': 'add', 'args': {'a': 11, 'b': 49}, 'id': 'call_55K1C0DmH6U5qh810gW34xZ0', 'type': 'tool_call'}], usage_metadata={'input_tokens': 88, 'output_tokens': 49, 'total_tokens': 137}),\n", - " ToolMessage(content='36', name='multiply', tool_call_id='call_Smg3NHJNxrKfAmd4f9GkaYn3'),\n", - " ToolMessage(content='60', name='add', tool_call_id='call_55K1C0DmH6U5qh810gW34xZ0')]" + " AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_loT2pliJwJe3p7nkgXYF48A1', 'function': {'arguments': '{\"a\": 3, \"b\": 12}', 'name': 'multiply'}, 'type': 'function'}, {'id': 'call_bG9tYZCXOeYDZf3W46TceoV4', 'function': {'arguments': '{\"a\": 11, \"b\": 49}', 'name': 'add'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 50, 'prompt_tokens': 87, 'total_tokens': 137}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_661538dc1f', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-e3db3c46-bf9e-478e-abc1-dc9a264f4afe-0', tool_calls=[{'name': 'multiply', 'args': {'a': 3, 'b': 12}, 'id': 'call_loT2pliJwJe3p7nkgXYF48A1', 'type': 'tool_call'}, {'name': 'add', 'args': {'a': 11, 'b': 49}, 'id': 'call_bG9tYZCXOeYDZf3W46TceoV4', 'type': 'tool_call'}], usage_metadata={'input_tokens': 87, 'output_tokens': 50, 'total_tokens': 137}),\n", + " ToolMessage(content='36', name='multiply', tool_call_id='call_loT2pliJwJe3p7nkgXYF48A1'),\n", + " ToolMessage(content='60', name='add', tool_call_id='call_bG9tYZCXOeYDZf3W46TceoV4')]" ] }, - "execution_count": 5, + "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "from langchain_core.messages import HumanMessage, ToolMessage\n", - "\n", - "query = \"What is 3 * 12? Also, what is 11 + 49?\"\n", - "\n", - "messages = [HumanMessage(query)]\n", - "ai_msg = llm_with_tools.invoke(messages)\n", - "messages.append(ai_msg)\n", "for tool_call in ai_msg.tool_calls:\n", " selected_tool = {\"add\": add, \"multiply\": multiply}[tool_call[\"name\"].lower()]\n", " tool_msg = selected_tool.invoke(tool_call)\n", " messages.append(tool_msg)\n", + "\n", "messages" ] }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "And finally, we'll invoke the model with the tool results. 
The model will use this information to generate a final answer to our original query:" ] }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='3 * 12 is 36 and 11 + 49 is 60.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 153, 'total_tokens': 171}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-ba5032f0-f773-406d-a408-8314e66511d0-0', usage_metadata={'input_tokens': 153, 'output_tokens': 18, 'total_tokens': 171})" + "AIMessage(content='The result of \\\\(3 \\\\times 12\\\\) is 36, and the result of \\\\(11 + 49\\\\) is 60.', response_metadata={'token_usage': {'completion_tokens': 31, 'prompt_tokens': 153, 'total_tokens': 184}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_661538dc1f', 'finish_reason': 'stop', 'logprobs': None}, id='run-87d1ef0a-1223-4bb3-9310-7b591789323d-0', usage_metadata={'input_tokens': 153, 'output_tokens': 31, 'total_tokens': 184})" ] }, - "execution_count": 6, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], @@ -129,15 +194,25 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Note that we pass back the same `id` in the `ToolMessage` as the what we receive from the model in order to help the model match tool responses with tool calls." + "Note that each `ToolMessage` must include a `tool_call_id` that matches an `id` in the original tool calls that the model generates. This helps the model match tool responses with tool calls.\n", + "\n", + "Tool calling agents, like those in [LangGraph](https://langchain-ai.github.io/langgraph/tutorials/introduction/), use this basic flow to answer queries and solve tasks.\n", + "\n", + "## Related\n", + "\n", + "- [LangGraph quickstart](https://langchain-ai.github.io/langgraph/tutorials/introduction/)\n", + "- Few shot prompting [with tools](/docs/how_to/tools_few_shot/)\n", + "- Stream [tool calls](/docs/how_to/tool_streaming/)\n", + "- Pass [runtime values to tools](/docs/how_to/tool_runtime)\n", + "- Getting [structured outputs](/docs/how_to/structured_output/) from models" ] } ], "metadata": { "kernelspec": { - "display_name": "poetry-venv-311", + "display_name": "Python 3", "language": "python", - "name": "poetry-venv-311" + "name": "python3" }, "language_info": { "codemirror_mode": { @@ -149,7 +224,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.9" + "version": "3.10.5" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/groq.ipynb b/docs/docs/integrations/chat/groq.ipynb index 01b1549a82d1b..556a9f208f42c 100644 --- a/docs/docs/integrations/chat/groq.ipynb +++ b/docs/docs/integrations/chat/groq.ipynb @@ -2,298 +2,259 @@ "cells": [ { "cell_type": "raw", - "metadata": { - "vscode": { - "languageId": "raw" - } - }, + "id": "afaf8039", + "metadata": {}, "source": [ "---\n", "sidebar_label: Groq\n", - "keywords: [chatgroq]\n", "---" ] }, { "cell_type": "markdown", + "id": "e49f1e0d", "metadata": {}, "source": [ - "# Groq\n", + "# ChatGroq\n", + "\n", + "This will help you get started with Groq [chat models](../../concepts.mdx#chat-models). For detailed documentation of all ChatGroq features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_groq.chat_models.ChatGroq.html). 
For a list of all Groq models, visit this [link](https://console.groq.com/docs/models).\n", + "\n", + "## Overview\n", + "### Integration details\n", "\n", - "LangChain supports integration with [Groq](https://groq.com/) chat models. Groq specializes in fast AI inference.\n", + "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/groq) | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatGroq](https://api.python.langchain.com/en/latest/chat_models/langchain_groq.chat_models.ChatGroq.html) | [langchain-groq](https://api.python.langchain.com/en/latest/groq_api_reference.html) | ❌ | beta | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-groq?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-groq?style=flat-square&label=%20) |\n", "\n", - "To get started, you'll first need to install the langchain-groq package:" + "### Model features\n", + "| [Tool calling](../../how_to/tool_calling.ipynb) | [Structured output](../../how_to/structured_output.ipynb) | JSON mode | [Image input](../../how_to/multimodal_inputs.ipynb) | Audio input | Video input | [Token-level streaming](../../how_to/chat_streaming.ipynb) | Native async | [Token usage](../../how_to/chat_token_usage_tracking.ipynb) | [Logprobs](../../how_to/logprobs.ipynb) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | \n", + "\n", + "## Setup\n", + "\n", + "To access Groq models you'll need to create a Groq account, get an API key, and install the `langchain-groq` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to the [Groq console](https://console.groq.com/keys) to sign up to Groq and generate an API key. Once you've done this set the GROQ_API_KEY environment variable:" ] }, { "cell_type": "code", - "execution_count": null, + "execution_count": 1, + "id": "433e8d2b-9519-4b49-b2c4-7ab65b046c94", "metadata": {}, "outputs": [], "source": [ - "%pip install -qU langchain-groq" + "import getpass\n", + "import os\n", + "\n", + "os.environ[\"GROQ_API_KEY\"] = getpass.getpass(\"Enter your Groq API key: \")" ] }, { "cell_type": "markdown", + "id": "72ee0c4b-9764-423a-9dbf-95129e185210", "metadata": {}, "source": [ - "Request an [API key](https://wow.groq.com) and set it as an environment variable:\n", - "\n", - "```bash\n", - "export GROQ_API_KEY=\n", - "```\n", - "\n", - "Alternatively, you may configure the API key when you initialize ChatGroq.\n", - "\n", - "Here's an example of it in action:" + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 2, + "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "AIMessage(content=\"Low latency is crucial for Large Language Models (LLMs) because it directly impacts the user experience, model performance, and overall efficiency. Here are some reasons why low latency is essential for LLMs:\\n\\n1. **Real-time Interaction**: LLMs are often used in applications that require real-time interaction, such as chatbots, virtual assistants, and language translation. Low latency ensures that the model responds quickly to user input, providing a seamless and engaging experience.\\n2. 
**Conversational Flow**: In conversational AI, latency can disrupt the natural flow of conversation. Low latency helps maintain a smooth conversation, allowing users to respond quickly and naturally, without feeling like they're waiting for the model to catch up.\\n3. **Model Performance**: High latency can lead to increased error rates, as the model may struggle to keep up with the input pace. Low latency enables the model to process information more efficiently, resulting in better accuracy and performance.\\n4. **Scalability**: As the number of users and requests increases, low latency becomes even more critical. It allows the model to handle a higher volume of requests without sacrificing performance, making it more scalable and efficient.\\n5. **Resource Utilization**: Low latency can reduce the computational resources required to process requests. By minimizing latency, you can optimize resource allocation, reduce costs, and improve overall system efficiency.\\n6. **User Experience**: High latency can lead to frustration, abandonment, and a poor user experience. Low latency ensures that users receive timely responses, which is essential for building trust and satisfaction.\\n7. **Competitive Advantage**: In applications like customer service or language translation, low latency can be a key differentiator. It can provide a competitive advantage by offering a faster and more responsive experience, setting your application apart from others.\\n8. **Edge Computing**: With the increasing adoption of edge computing, low latency is critical for processing data closer to the user. This reduces latency even further, enabling real-time processing and analysis of data.\\n9. **Real-time Analytics**: Low latency enables real-time analytics and insights, which are essential for applications like sentiment analysis, trend detection, and anomaly detection.\\n10. **Future-Proofing**: As LLMs continue to evolve and become more complex, low latency will become even more critical. By prioritizing low latency now, you'll be better prepared to handle the demands of future LLM applications.\\n\\nIn summary, low latency is vital for LLMs because it ensures a seamless user experience, improves model performance, and enables efficient resource utilization. 
By prioritizing low latency, you can build more effective, scalable, and efficient LLM applications that meet the demands of real-time interaction and processing.\", response_metadata={'token_usage': {'completion_tokens': 541, 'prompt_tokens': 33, 'total_tokens': 574, 'completion_time': 1.499777658, 'prompt_time': 0.008344704, 'queue_time': None, 'total_time': 1.508122362}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_87cbfbbc4d', 'finish_reason': 'stop', 'logprobs': None}, id='run-49dad960-ace8-4cd7-90b3-2db99ecbfa44-0')" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ - "from langchain_core.prompts import ChatPromptTemplate\n", - "from langchain_groq import ChatGroq\n", - "\n", - "chat = ChatGroq(\n", - " temperature=0,\n", - " model=\"llama3-70b-8192\",\n", - " # api_key=\"\" # Optional if not set as an environment variable\n", - ")\n", - "\n", - "system = \"You are a helpful assistant.\"\n", - "human = \"{text}\"\n", - "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n", - "\n", - "chain = prompt | chat\n", - "chain.invoke({\"text\": \"Explain the importance of low latency for LLMs.\"})" + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n", + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"" ] }, { "cell_type": "markdown", + "id": "0730d6a1-c893-4840-9817-5e5251676d5d", "metadata": {}, "source": [ - "You can view the available models [here](https://console.groq.com/docs/models).\n", - "\n", - "## Tool calling\n", - "\n", - "Groq chat models support [tool calling](/docs/how_to/tool_calling) to generate output matching a specific schema. The model may choose to call multiple tools or the same tool multiple times if appropriate.\n", + "### Installation\n", "\n", - "Here's an example:" + "The LangChain Groq integration lives in the `langchain-groq` package:" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 3, + "id": "652d6238-1f87-422a-b135-f5abbb8652fc", "metadata": {}, "outputs": [ { - "data": { - "text/plain": [ - "[{'name': 'get_current_weather',\n", - " 'args': {'location': 'San Francisco', 'unit': 'Celsius'},\n", - " 'id': 'call_pydj'},\n", - " {'name': 'get_current_weather',\n", - " 'args': {'location': 'Tokyo', 'unit': 'Celsius'},\n", - " 'id': 'call_jgq3'}]" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] } ], "source": [ - "from typing import Optional\n", - "\n", - "from langchain_core.tools import tool\n", - "\n", - "\n", - "@tool\n", - "def get_current_weather(location: str, unit: Optional[str]):\n", - " \"\"\"Get the current weather in a given location\"\"\"\n", - " return \"Cloudy with a chance of rain.\"\n", - "\n", - "\n", - "tool_model = chat.bind_tools([get_current_weather], tool_choice=\"auto\")\n", - "\n", - "res = tool_model.invoke(\"What is the weather like in San Francisco and Tokyo?\")\n", - "\n", - "res.tool_calls" + 
"%pip install -qU langchain-groq" ] }, { "cell_type": "markdown", + "id": "a38cde65-254d-4219-a441-068766c0d4b5", "metadata": {}, "source": [ - "### `.with_structured_output()`\n", + "## Instantiation\n", "\n", - "You can also use the convenience [`.with_structured_output()`](/docs/how_to/structured_output/#the-with_structured_output-method) method to coerce `ChatGroq` into returning a structured output.\n", - "Here is an example:" + "Now we can instantiate our model object and generate chat completions:" ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 4, + "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Joke(setup='Why did the cat join a band?', punchline='Because it wanted to be the purr-cussionist!', rating=None)" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ - "from langchain_core.pydantic_v1 import BaseModel, Field\n", - "\n", - "\n", - "class Joke(BaseModel):\n", - " \"\"\"Joke to tell user.\"\"\"\n", - "\n", - " setup: str = Field(description=\"The setup of the joke\")\n", - " punchline: str = Field(description=\"The punchline to the joke\")\n", - " rating: Optional[int] = Field(description=\"How funny the joke is, from 1 to 10\")\n", - "\n", - "\n", - "structured_llm = chat.with_structured_output(Joke)\n", + "from langchain_groq import ChatGroq\n", "\n", - "structured_llm.invoke(\"Tell me a joke about cats\")" + "llm = ChatGroq(\n", + " model=\"mixtral-8x7b-32768\",\n", + " temperature=0,\n", + " max_tokens=None,\n", + " timeout=None,\n", + " max_retries=2,\n", + " # other params...\n", + ")" ] }, { "cell_type": "markdown", + "id": "2b4f3e15", "metadata": {}, "source": [ - "Behind the scenes, this takes advantage of the above tool calling functionality.\n", - "\n", - "## Async" + "## Invocation" ] }, { "cell_type": "code", - "execution_count": 12, - "metadata": {}, + "execution_count": 5, + "id": "62e0dbc3", + "metadata": { + "tags": [] + }, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='Here is a limerick about the sun:\\n\\nThere once was a sun in the sky,\\nWhose warmth and light caught the eye,\\nIt shone bright and bold,\\nWith a fiery gold,\\nAnd brought life to all, as it flew by.', response_metadata={'token_usage': {'completion_tokens': 51, 'prompt_tokens': 18, 'total_tokens': 69, 'completion_time': 0.144614022, 'prompt_time': 0.00585394, 'queue_time': None, 'total_time': 0.150467962}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_2f30b0b571', 'finish_reason': 'stop', 'logprobs': None}, id='run-e42340ba-f0ad-4b54-af61-8308d8ec8256-0')" + "AIMessage(content='I enjoy programming. 
(The French translation is: \"J\\'aime programmer.\")\\n\\nNote: I chose to translate \"I love programming\" as \"J\\'aime programmer\" instead of \"Je suis amoureux de programmer\" because the latter has a romantic connotation that is not present in the original English sentence.', response_metadata={'token_usage': {'completion_tokens': 73, 'prompt_tokens': 31, 'total_tokens': 104, 'completion_time': 0.1140625, 'prompt_time': 0.003352463, 'queue_time': None, 'total_time': 0.117414963}, 'model_name': 'mixtral-8x7b-32768', 'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop', 'logprobs': None}, id='run-64433c19-eadf-42fc-801e-3071e3c40160-0', usage_metadata={'input_tokens': 31, 'output_tokens': 73, 'total_tokens': 104})" ] }, - "execution_count": 12, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "chat = ChatGroq(temperature=0, model=\"llama3-70b-8192\")\n", - "prompt = ChatPromptTemplate.from_messages([(\"human\", \"Write a Limerick about {topic}\")])\n", - "chain = prompt | chat\n", - "await chain.ainvoke({\"topic\": \"The Sun\"})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Streaming" + "messages = [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n", + " ),\n", + " (\"human\", \"I love programming.\"),\n", + "]\n", + "ai_msg = llm.invoke(messages)\n", + "ai_msg" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 6, + "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - "Silvery glow bright\n", - "Luna's gentle light shines down\n", - "Midnight's gentle queen" + "I enjoy programming. (The French translation is: \"J'aime programmer.\")\n", + "\n", + "Note: I chose to translate \"I love programming\" as \"J'aime programmer\" instead of \"Je suis amoureux de programmer\" because the latter has a romantic connotation that is not present in the original English sentence.\n" ] } ], "source": [ - "chat = ChatGroq(temperature=0, model=\"llama3-70b-8192\")\n", - "prompt = ChatPromptTemplate.from_messages([(\"human\", \"Write a haiku about {topic}\")])\n", - "chain = prompt | chat\n", - "for chunk in chain.stream({\"topic\": \"The Moon\"}):\n", - " print(chunk.content, end=\"\", flush=True)" + "print(ai_msg.content)" ] }, { "cell_type": "markdown", + "id": "18e2bfc0-7e78-4528-a73f-499ac150dca8", "metadata": {}, "source": [ - "## Passing custom parameters\n", + "## Chaining\n", "\n", - "You can pass other Groq-specific parameters using the `model_kwargs` argument on initialization. Here's an example of enabling JSON mode:" + "We can [chain](../../how_to/sequence.ipynb) our model with a prompt template like so:" ] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 7, + "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content='{ \"response\": \"That\\'s a tough question! There are eight species of bears found in the world, and each one is unique and amazing in its own way. However, if I had to pick one, I\\'d say the giant panda is a popular favorite among many people. 
Who can resist those adorable black and white markings?\", \"followup_question\": \"Would you like to know more about the giant panda\\'s habitat and diet?\" }', response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 50, 'total_tokens': 139, 'completion_time': 0.249032839, 'prompt_time': 0.011134497, 'queue_time': None, 'total_time': 0.260167336}, 'model_name': 'llama3-70b-8192', 'system_fingerprint': 'fp_2f30b0b571', 'finish_reason': 'stop', 'logprobs': None}, id='run-558ce67e-8c63-43fe-a48f-6ecf181bc922-0')" + "AIMessage(content='That\\'s great! I can help you translate English phrases related to programming into German.\\n\\n\"I love programming\" can be translated as \"Ich liebe Programmieren\" in German.\\n\\nHere are some more programming-related phrases translated into German:\\n\\n* \"Programming language\" = \"Programmiersprache\"\\n* \"Code\" = \"Code\"\\n* \"Variable\" = \"Variable\"\\n* \"Function\" = \"Funktion\"\\n* \"Array\" = \"Array\"\\n* \"Object-oriented programming\" = \"Objektorientierte Programmierung\"\\n* \"Algorithm\" = \"Algorithmus\"\\n* \"Data structure\" = \"Datenstruktur\"\\n* \"Debugging\" = \"Fehlersuche\"\\n* \"Compile\" = \"Kompilieren\"\\n* \"Link\" = \"Verknüpfen\"\\n* \"Run\" = \"Ausführen\"\\n* \"Test\" = \"Testen\"\\n* \"Deploy\" = \"Bereitstellen\"\\n* \"Version control\" = \"Versionskontrolle\"\\n* \"Open source\" = \"Open Source\"\\n* \"Software development\" = \"Softwareentwicklung\"\\n* \"Agile methodology\" = \"Agile Methodik\"\\n* \"DevOps\" = \"DevOps\"\\n* \"Cloud computing\" = \"Cloud Computing\"\\n\\nI hope this helps! Let me know if you have any other questions or if you need further translations.', response_metadata={'token_usage': {'completion_tokens': 331, 'prompt_tokens': 25, 'total_tokens': 356, 'completion_time': 0.520006542, 'prompt_time': 0.00250165, 'queue_time': None, 'total_time': 0.522508192}, 'model_name': 'mixtral-8x7b-32768', 'system_fingerprint': 'fp_c5f20b5bb1', 'finish_reason': 'stop', 'logprobs': None}, id='run-74207fb7-85d3-417d-b2b9-621116b75d41-0', usage_metadata={'input_tokens': 25, 'output_tokens': 331, 'total_tokens': 356})" ] }, - "execution_count": 15, + "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "chat = ChatGroq(\n", - " model=\"llama3-70b-8192\", model_kwargs={\"response_format\": {\"type\": \"json_object\"}}\n", - ")\n", - "\n", - "system = \"\"\"\n", - "You are a helpful assistant.\n", - "Always respond with a JSON object with two string keys: \"response\" and \"followup_question\".\n", - "\"\"\"\n", - "human = \"{question}\"\n", - "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n", + "from langchain_core.prompts import ChatPromptTemplate\n", "\n", - "chain = prompt | chat\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", + ")\n", "\n", - "chain.invoke({\"question\": \"what bear is best?\"})" + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" ] }, { - "cell_type": "code", - "execution_count": null, + "cell_type": "markdown", + "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3", "metadata": {}, - "outputs": [], - "source": [] + "source": [ + "## API reference\n", + "\n", + 
"For detailed documentation of all ChatGroq features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_groq.chat_models.ChatGroq.html" + ] } ], "metadata": { "kernelspec": { - "display_name": ".venv", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -307,9 +268,9 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.5" + "version": "3.11.9" } }, "nbformat": 4, - "nbformat_minor": 2 + "nbformat_minor": 5 } diff --git a/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb b/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb index 8d8b3b82628a2..24554ac80e316 100644 --- a/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb +++ b/docs/docs/integrations/chat/nvidia_ai_endpoints.ipynb @@ -302,9 +302,6 @@ "\n", "NVIDIA also supports multimodal inputs, meaning you can provide both images and text for the model to reason over. An example model supporting multimodal inputs is `nvidia/neva-22b`.\n", "\n", - "\n", - "These models accept LangChain's standard image formats, and accept `labels`, similar to the Steering LLMs above. In addition to `creativity`, `complexity`, and `verbosity`, these models support a `quality` toggle.\n", - "\n", "Below is an example use:" ] }, @@ -447,92 +444,6 @@ "llm.invoke(f'What\\'s in this image?\\n')" ] }, - { - "cell_type": "markdown", - "id": "3e61d868", - "metadata": {}, - "source": [ - "#### **Advanced Use Case:** Forcing Payload \n", - "\n", - "You may notice that some newer models may have strong parameter expectations that the LangChain connector may not support by default. For example, we cannot invoke the [Kosmos](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/kosmos-2) model at the time of this notebook's latest release due to the lack of a streaming argument on the server side: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d143e0d6", - "metadata": {}, - "outputs": [], - "source": [ - "from langchain_nvidia_ai_endpoints import ChatNVIDIA\n", - "\n", - "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n", - "\n", - "from langchain_core.messages import HumanMessage\n", - "\n", - "# kosmos.invoke(\n", - "# [\n", - "# HumanMessage(\n", - "# content=[\n", - "# {\"type\": \"text\", \"text\": \"Describe this image:\"},\n", - "# {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n", - "# ]\n", - "# )\n", - "# ]\n", - "# )\n", - "\n", - "# Exception: [422] Unprocessable Entity\n", - "# body -> stream\n", - "# Extra inputs are not permitted (type=extra_forbidden)\n", - "# RequestID: 35538c9a-4b45-4616-8b75-7ef816fccf38" - ] - }, - { - "cell_type": "markdown", - "id": "1e230b70", - "metadata": {}, - "source": [ - "For a simple use case like this, we can actually try to force the payload argument of our underlying client by specifying the `payload_fn` function as follows: " - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0925b2b1", - "metadata": {}, - "outputs": [], - "source": [ - "def drop_streaming_key(d):\n", - " \"\"\"Takes in payload dictionary, outputs new payload dictionary\"\"\"\n", - " if \"stream\" in d:\n", - " d.pop(\"stream\")\n", - " return d\n", - "\n", - "\n", - "## Override the payload passthrough. 
Default is to pass through the payload as is.\n", - "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n", - "kosmos.client.payload_fn = drop_streaming_key\n", - "\n", - "kosmos.invoke(\n", - " [\n", - " HumanMessage(\n", - " content=[\n", - " {\"type\": \"text\", \"text\": \"Describe this image:\"},\n", - " {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n", - " ]\n", - " )\n", - " ]\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "fe6e1758", - "metadata": {}, - "source": [ - "For more advanced or custom use-cases (i.e. supporting the diffusion models), you may be interested in leveraging the `NVEModel` client as a requests backbone. The `NVIDIAEmbeddings` class is a good source of inspiration for this. " - ] - }, { "cell_type": "markdown", "id": "137662a6", @@ -540,7 +451,7 @@ "id": "137662a6" }, "source": [ - "## Example usage within RunnableWithMessageHistory " + "## Example usage within a RunnableWithMessageHistory" ] }, { @@ -630,14 +541,14 @@ { "cell_type": "code", "execution_count": null, - "id": "uHIMZxVSVNBC", + "id": "LyD1xVKmVSs4", "metadata": { "colab": { "base_uri": "https://localhost:8080/", - "height": 284 + "height": 350 }, - "id": "uHIMZxVSVNBC", - "outputId": "79acc89d-a820-4f2c-bac2-afe99da95580" + "id": "LyD1xVKmVSs4", + "outputId": "a1714513-a8fd-4d14-f974-233e39d5c4f5" }, "outputs": [], "source": [ @@ -646,6 +557,79 @@ " config=config,\n", ")" ] + }, + { + "cell_type": "markdown", + "id": "f3cbbba0", + "metadata": {}, + "source": [ + "## Tool calling\n", + "\n", + "Starting in v0.2, `ChatNVIDIA` supports [bind_tools](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.chat_models.BaseChatModel.html#langchain_core.language_models.chat_models.BaseChatModel.bind_tools).\n", + "\n", + "`ChatNVIDIA` provides integration with a variety of models on [build.nvidia.com](https://build.nvidia.com) as well as local NIMs. Not all of these models are trained for tool calling. Be sure to select a model that supports tool calling for your experimentation and applications." + ] + }, + { + "cell_type": "markdown", + "id": "6f7b535e", + "metadata": {}, + "source": [ + "You can get a list of models that are known to support tool calling with," + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e36c8911", + "metadata": {}, + "outputs": [], + "source": [ + "tool_models = [\n", + " model for model in ChatNVIDIA.get_available_models() if model.supports_tools\n", + "]\n", + "tool_models" + ] + }, + { + "cell_type": "markdown", + "id": "b01d75a7", + "metadata": {}, + "source": [ + "With a tool-capable model," + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bd54f174", + "metadata": {}, + "outputs": [], + "source": [ + "from langchain_core.pydantic_v1 import Field\n", + "from langchain_core.tools import tool\n", + "\n", + "\n", + "@tool\n", + "def get_current_weather(\n", + " location: str = Field(..., description=\"The location to get the weather for.\"),\n", + "):\n", + " \"\"\"Get the current weather for a location.\"\"\"\n", + " ...\n", + "\n", + "\n", + "llm = ChatNVIDIA(model=tool_models[0].id).bind_tools(tools=[get_current_weather])\n", + "response = llm.invoke(\"What is the weather in Boston?\")\n", + "response.tool_calls" + ] + }, + { + "cell_type": "markdown", + "id": "e08df68c", + "metadata": {}, + "source": [ + "See [How to use chat models to call tools](https://python.langchain.com/v0.2/docs/how_to/tool_calling/) for additional examples."
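A natural follow-up to the `tool_calls` output above is executing the requested tool and handing its result back to the model. The sketch below is a hypothetical round trip rather than part of this PR: it assumes the `llm` and `get_current_weather` objects defined in the cells above and a selected model that accepts tool results on a follow-up turn; since the example tool body is a stub (`...`), it returns `None`, so a real lookup would be substituted in practice.

```python
from langchain_core.messages import HumanMessage, ToolMessage

# Ask a question and let the tool-bound model decide which tool to call.
messages = [HumanMessage("What is the weather in Boston?")]
ai_msg = llm.invoke(messages)
messages.append(ai_msg)

# Run each requested tool call and return its output as a ToolMessage,
# so the model can ground its final answer in the tool result.
for tool_call in ai_msg.tool_calls:
    result = get_current_weather.invoke(tool_call["args"])  # stub tool returns None
    messages.append(ToolMessage(content=str(result), tool_call_id=tool_call["id"]))

# With the tool output in the message history, ask for the final answer.
final_answer = llm.invoke(messages)
print(final_answer.content)
```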
+ ] } ], "metadata": { @@ -667,7 +651,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.2" + "version": "3.10.13" } }, "nbformat": 4, diff --git a/docs/docs/integrations/chat/ollama.ipynb b/docs/docs/integrations/chat/ollama.ipynb index 05e1723946098..f59474094a716 100644 --- a/docs/docs/integrations/chat/ollama.ipynb +++ b/docs/docs/integrations/chat/ollama.ipynb @@ -35,7 +35,7 @@ "### Model features\n", "| [Tool calling](/docs/how_to/tool_calling/) | [Structured output](/docs/how_to/structured_output/) | JSON mode | [Image input](/docs/how_to/multimodal_inputs/) | Audio input | Video input | [Token-level streaming](/docs/how_to/chat_streaming/) | Native async | [Token usage](/docs/how_to/chat_token_usage_tracking/) | [Logprobs](/docs/how_to/logprobs/) |\n", "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", - "| ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | \n", + "| ✅ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | \n", "\n", "## Setup\n", "\n", diff --git a/docs/docs/integrations/chat/ollama_functions.ipynb b/docs/docs/integrations/chat/ollama_functions.ipynb index 07039e05fdfe2..96dc9f3f2315b 100644 --- a/docs/docs/integrations/chat/ollama_functions.ipynb +++ b/docs/docs/integrations/chat/ollama_functions.ipynb @@ -284,7 +284,9 @@ { "cell_type": "markdown", "metadata": {}, - "source": "For more on binding tools and tool call outputs, head to the [tool calling](docs/how_to/function_calling) docs." + "source": [ + "For more on binding tools and tool call outputs, head to the [tool calling](../../how_to/function_calling.ipynb) docs." + ] }, { "cell_type": "markdown", diff --git a/docs/docs/integrations/chat/together.ipynb b/docs/docs/integrations/chat/together.ipynb index 4a3b07e57d149..87a0f1c39e1d0 100644 --- a/docs/docs/integrations/chat/together.ipynb +++ b/docs/docs/integrations/chat/together.ipynb @@ -1,103 +1,263 @@ { "cells": [ + { + "cell_type": "raw", + "id": "afaf8039", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: Together\n", + "---" + ] + }, { "cell_type": "markdown", - "id": "2970dd75-8ebf-4b51-8282-9b454b8f356d", + "id": "e49f1e0d", "metadata": {}, "source": [ - "# Together AI\n", + "# ChatTogether\n", + "\n", + "\n", + "This page will help you get started with Together AI [chat models](../../concepts.mdx#chat-models). 
For detailed documentation of all ChatTogether features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_together.chat_models.ChatTogether.html).\n", + "\n", + "[Together AI](https://www.together.ai/) offers an API to query [50+ leading open-source models](https://docs.together.ai/docs/chat-models).\n", + "\n", + "## Overview\n", + "### Integration details\n", "\n", - "[Together AI](https://www.together.ai/) offers an API to query [50+ leading open-source models](https://docs.together.ai/docs/inference-models) in a couple lines of code.\n", + "| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/v0.2/docs/integrations/chat/togetherai) | Package downloads | Package latest |\n", + "| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n", + "| [ChatTogether](https://api.python.langchain.com/en/latest/chat_models/langchain_together.chat_models.ChatTogether.html) | [langchain-together](https://api.python.langchain.com/en/latest/together_api_reference.html) | ❌ | beta | ✅ | ![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-together?style=flat-square&label=%20) | ![PyPI - Version](https://img.shields.io/pypi/v/langchain-together?style=flat-square&label=%20) |\n", "\n", - "This example goes over how to use LangChain to interact with Together AI models." + "### Model features\n", + "| [Tool calling](../../how_to/tool_calling.ipynb) | [Structured output](../../how_to/structured_output.ipynb) | JSON mode | [Image input](../../how_to/multimodal_inputs.ipynb) | Audio input | Video input | [Token-level streaming](../../how_to/chat_streaming.ipynb) | Native async | [Token usage](../../how_to/chat_token_usage_tracking.ipynb) | [Logprobs](../../how_to/logprobs.ipynb) |\n", + "| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |\n", + "| ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | \n", + "\n", + "## Setup\n", + "\n", + "To access Together models you'll need to create a Together account, get an API key, and install the `langchain-together` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "Head to [this page](https://api.together.ai) to sign up to Together and generate an API key. 
Once you've done this, set the TOGETHER_API_KEY environment variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "433e8d2b-9519-4b49-b2c4-7ab65b046c94", + "metadata": {}, + "outputs": [], + "source": [ + "import getpass\n", + "import os\n", + "\n", + "os.environ[\"TOGETHER_API_KEY\"] = getpass.getpass(\"Enter your Together API key: \")" ] }, { "cell_type": "markdown", - "id": "1c47fc36", + "id": "72ee0c4b-9764-423a-9dbf-95129e185210", "metadata": {}, "source": [ - "## Installation" + "If you want to get automated tracing of your model calls, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" ] }, { "cell_type": "code", - "execution_count": null, - "id": "1ecdb29d", + "execution_count": 2, + "id": "a15d341e-3e26-4ca3-830b-5aab30ed66de", "metadata": {}, "outputs": [], "source": [ - "%pip install --upgrade langchain-together" + "# os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")\n", + "# os.environ[\"LANGSMITH_TRACING\"] = \"true\"" ] }, { "cell_type": "markdown", - "id": "89883202", + "id": "0730d6a1-c893-4840-9817-5e5251676d5d", "metadata": {}, "source": [ - "## Environment\n", + "### Installation\n", "\n", - "To use Together AI, you'll need an API key which you can find here:\n", - "https://api.together.ai/settings/api-keys. This can be passed in as an init param\n", - "``together_api_key`` or set as environment variable ``TOGETHER_API_KEY``.\n" + "The LangChain Together integration lives in the `langchain-together` package:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "652d6238-1f87-422a-b135-f5abbb8652fc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.1.2\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n", + "Note: you may need to restart the kernel to use updated packages.\n" + ] + } + ], + "source": [ + "%pip install -qU langchain-together" + ] + }, + { + "cell_type": "markdown", + "id": "a38cde65-254d-4219-a441-068766c0d4b5", + "metadata": {}, + "source": [ + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate chat completions:"
] }, { "cell_type": "code", - "execution_count": null, - "id": "637bb53f", + "execution_count": 5, + "id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae", "metadata": {}, "outputs": [], "source": [ - "# Querying chat models with Together AI\n", - "\n", "from langchain_together import ChatTogether\n", "\n", - "# choose from our 50+ models here: https://docs.together.ai/docs/inference-models\n", - "chat = ChatTogether(\n", - " # together_api_key=\"YOUR_API_KEY\",\n", + "llm = ChatTogether(\n", " model=\"meta-llama/Llama-3-70b-chat-hf\",\n", - ")\n", - "\n", - "# stream the response back from the model\n", - "for m in chat.stream(\"Tell me fun things to do in NYC\"):\n", - " print(m.content, end=\"\", flush=True)\n", - "\n", - "# if you don't want to do streaming, you can use the invoke method\n", - "# chat.invoke(\"Tell me fun things to do in NYC\")" + " temperature=0,\n", + " max_tokens=None,\n", + " timeout=None,\n", + " max_retries=2,\n", + " # other params...\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "2b4f3e15", + "metadata": {}, + "source": [ + "## Invocation" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "62e0dbc3", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content=\"J'adore la programmation.\", response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 35, 'total_tokens': 44}, 'model_name': 'meta-llama/Llama-3-70b-chat-hf', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-79efa49b-dbaf-4ef8-9dce-958533823ef6-0', usage_metadata={'input_tokens': 35, 'output_tokens': 9, 'total_tokens': 44})" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "messages = [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates English to French. 
Translate the user sentence.\",\n", + " ),\n", + " (\"human\", \"I love programming.\"),\n", + "]\n", + "ai_msg = llm.invoke(messages)\n", + "ai_msg" ] }, { "cell_type": "code", - "execution_count": null, - "id": "e7b7170d-d7c5-4890-9714-a37238343805", + "execution_count": 7, + "id": "d86145b3-bfef-46e8-b227-4dda5c9c2705", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "J'adore la programmation.\n" + ] + } + ], + "source": [ + "print(ai_msg.content)" + ] + }, + { + "cell_type": "markdown", + "id": "18e2bfc0-7e78-4528-a73f-499ac150dca8", "metadata": {}, - "outputs": [], "source": [ - "# Querying code and language models with Together AI\n", + "## Chaining\n", "\n", - "from langchain_together import Together\n", + "We can [chain](../../how_to/sequence.ipynb) our model with a prompt template like so:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "e197d1d7-a070-4c96-9f8a-a0e86d046e0b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "AIMessage(content='Ich liebe das Programmieren.', response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 30, 'total_tokens': 37}, 'model_name': 'meta-llama/Llama-3-70b-chat-hf', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-80bba5fa-1723-4242-8d5a-c09b76b8350b-0', usage_metadata={'input_tokens': 30, 'output_tokens': 7, 'total_tokens': 37})" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from langchain_core.prompts import ChatPromptTemplate\n", "\n", - "llm = Together(\n", - " model=\"codellama/CodeLlama-70b-Python-hf\",\n", - " # together_api_key=\"...\"\n", + "prompt = ChatPromptTemplate.from_messages(\n", + " [\n", + " (\n", + " \"system\",\n", + " \"You are a helpful assistant that translates {input_language} to {output_language}.\",\n", + " ),\n", + " (\"human\", \"{input}\"),\n", + " ]\n", ")\n", "\n", - "print(llm.invoke(\"def bubble_sort(): \"))" + "chain = prompt | llm\n", + "chain.invoke(\n", + " {\n", + " \"input_language\": \"English\",\n", + " \"output_language\": \"German\",\n", + " \"input\": \"I love programming.\",\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3", + "metadata": {}, + "source": [ + "## API reference\n", + "\n", + "For detailed documentation of all ChatTogether features and configurations head to the API reference: https://api.python.langchain.com/en/latest/chat_models/langchain_together.chat_models.ChatTogether.html" ] } ], "metadata": { "kernelspec": { - "display_name": ".venv", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -111,7 +271,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.4" + "version": "3.11.9" } }, "nbformat": 4, diff --git a/docs/docs/integrations/document_loaders/scrapingant.ipynb b/docs/docs/integrations/document_loaders/scrapingant.ipynb new file mode 100644 index 0000000000000..46de054f4c3d6 --- /dev/null +++ b/docs/docs/integrations/document_loaders/scrapingant.ipynb @@ -0,0 +1,188 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "sidebar_label: ScrapingAnt\n", + "---\n", + "\n", + "# ScrapingAnt\n", + "## Overview\n", + "[ScrapingAnt](https://scrapingant.com/) is a web scraping API with headless browser capabilities, proxies, and anti-bot bypass. 
It allows for extracting web page data into LLM-accessible markdown.\n", + "\n", + "This particular integration uses only the Markdown extraction feature, but don't hesitate to [reach out to us](mailto:support@scrapingant.com) if you need other ScrapingAnt features that are not yet implemented in this integration.\n", + "\n", + "### Integration details\n", + "\n", + "| Class | Package | Local | Serializable | JS support |\n", + "|:---------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------|:-----:|:------------:|:----------:|\n", + "| [ScrapingAntLoader](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.scrapingant.ScrapingAntLoader.html) | [langchain_community](https://api.python.langchain.com/en/latest/community_api_reference.html) | ❌ | ❌ | ❌ | \n", + "\n", + "### Loader features\n", + "| Source | Document Lazy Loading | Async Support |\n", + "|:-----------------:|:---------------------:|:-------------:| \n", + "| ScrapingAntLoader | ✅ | ❌ | \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "\n", + "Install the ScrapingAnt Python SDK and the required LangChain packages using pip:\n", + "```shell\n", + "pip install scrapingant-client langchain langchain-community\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": "## Instantiation" + }, + { + "cell_type": "code", + "metadata": { + "ExecuteTime": { + "end_time": "2024-07-22T18:18:50.903258Z", + "start_time": "2024-07-22T18:18:35.265390Z" + } + }, + "source": [ + "from langchain_community.document_loaders import ScrapingAntLoader\n", + "\n", + "scrapingant_loader = ScrapingAntLoader(\n", + " [\"https://scrapingant.com/\", \"https://example.com/\"], # List of URLs to scrape\n", + " api_key=\"\", # Get your API key from https://scrapingant.com/\n", + " continue_on_failure=True, # Ignore unprocessable web pages and log their exceptions\n", + ")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[Document(metadata={'url': 'https://scrapingant.com/'}, page_content=\"![](images/loader.svg)\\n\\n[![](images/ScrapingAnt-1.svg)](/) Features Pricing\\n\\nServices\\n\\n[Web Scraping API](/) [LLM-ready data extraction](/llm-ready-data-extraction)\\n[AI data scraping](/ai-data-scraper) [Residential Proxy](/residential-proxies)\\n\\n[Blog](https://scrapingant.com/blog/)\\n\\nDocumentatation\\n\\n[Web Scraping API](https://docs.scrapingant.com) [Residential\\nProxies](https://proxydocs.scrapingant.com)\\n\\nContact Us\\n\\n[Sign In](https://app.scrapingant.com/login)\\n\\n![](images/icon-menu.svg)\\n\\n![](images/Capterra-Rating.png)\\n\\n# Enterprise-Grade Scraping API. \\nAnt Sized Pricing.\\n\\n## Get the mission-critical speed, reliability, and features you need at a\\nfraction of the cost! \\n\\nGot Questions? 
\\n(get expert advice)\\n\\n[ Try Our Free Plan (10,000 API Credits) ](https://app.scrapingant.com/signup)\\n\\n![](images/lines-10-white.svg)![](images/lines-12-white.svg)\\n\\n### Proudly scaling with us\\n\\n![](images/_2cd6c6d09d261d19_281d72aa098ecca8.png)![](images/_bb8ca9c8d001abd4_dc29a36ce27bdee8_1_bb8ca9c8d001abd4_dc29a36ce27bdee8.png)![](images/_d84700234b61df23_9abf58d176a2d7fc.png)![](images/_ca6d37170ae5cd25_fca779750afd17ef.png)![](images/Screenshot-2024-05-22-at-23.28.16.png)\\n\\n### Industry Leading Pricing\\n\\nFrom our generous 10,000 API credit free plan to our industry leading paid\\nplans, we strive to provide unbeatable bang for your buck. That's just what\\nants do! \\n\\u200d\\n\\n![](images/industry-leading-prcing--compressed.webp)\\n\\nCost per 1,000 API Credits - Level 1 Plan\\n\\n### Unparalleled Value\\n\\nLow cost per API credit is great, but what’s even more important is how much\\ndata you can actually collect for each credit spent. Like any good Ant we\\nnever waste a crumb!\\n\\n![](images/unparalleled-value-compressed.webp)\\n\\nGoogle SERP API - Cost per 1,000 Requests – Level 1 Plan\\n\\n![](images/Doodle-4-White.svg)![](images/Doodle-Left-1-White.svg)\\n\\n## Ultimate Black Box Scraping Solution\\n\\n### Unlimited Concurrency \\n\\u200d\\n\\nWith unlimited parallel requests easily gather LARGE volumes of data from\\nmultiple locations in record time. Available on ALL plan levels. \\n\\u200d\\n\\n### Lightning Fast Scraping WITHOUT Getting Blocked\\n\\nOur proprietary algo seamlessly switches to the exact right proxy for almost\\nany situation, saving you and your dev team countless hours of frustration. \\n\\u200d\\n\\n#### What's inside?\\n\\n * Chrome Page Rendering\\n\\n * Low Latency Rotating Proxies \\n\\n * Javascript Execution\\n\\n * Custom Cookies\\n\\n * Fastest AWS & Hetzner Servers\\n\\n * Unlimited Parallel Requests\\n\\n * Headless Browsers \\n\\n * Residential Proxies\\n\\n * Supports All Programming Languages & Proxy\\n\\n * CAPTCHA Avoidance\\n\\n[ Try Our Free Plan (10,000 API Credits) ](https://app.scrapingant.com/signup)\\n\\n![](images/Doodle-3-White.svg)\\n\\n###### Metrics\\n\\n## The most reliable web scraping API\\n\\nOur clients have saved up to 40% of data collection budgets by integrating\\nScrapingAnt API instead of self-made solutions development.\\n\\n99.99%\\n\\nUptime over the last year.\\n\\n85.5%\\n\\nAnti-scraping avoidance rate with our custom cloud browser solution\\n\\n![](images/icon-gallery-dark.svg)\\n\\n### Unlimited parallel requests\\n\\n![](images/icon-id-dark.svg)\\n\\n### 3+ million proxy servers across the world\\n\\n![](images/icon-switcher-white.svg)\\n\\n### Open your web page as in a real browser\\n\\n![](images/Doodle-9-Dark.svg)\\n\\nSimple API integration\\n\\n1\\n\\n### Choose your plan\\n\\nWe offer subscription plans, or you can always request custom pricing. 
\\n **Free for personal use!**\\n\\n2\\n\\n### Test the API\\n\\nScrape your target website with our **UI request executor** or generate\\nscraping code for your preferred language.\\n\\n3\\n\\n### Scrape the Web\\n\\nBuild your data extraction pipeline using our **API** and forget about **rate\\nlimits** and **blocks**.\\n\\n![](images/Doodle-4-White.svg)![](images/Doodle-Left-1-White.svg)\\n\\n###### Pricing\\n\\n## Industry leading pricing that scales with your business.\\n\\n### Enthusiast\\n\\n#### 100.000 API credits\\n\\n$19\\n\\n/mo\\n\\nIdeal for freelancers or students.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nEmail support\\n\\n![](images/check-small.svg)\\n\\nDocumentation-only integration\\n\\n### Startup\\n\\n#### 500.000 API credits\\n\\n$49\\n\\n/mo\\n\\nFor small to medium sized teams looking to grow. \\n \\nPopular choice!\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nExpert assistance\\n\\n![](images/check-small.svg)\\n\\nIntegration with custom code snippets\\n\\n### Business\\n\\n#### 3.000.000 API credits\\n\\n$249\\n\\n/mo\\n\\nFor larger teams and companies.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nLive integration calls\\n\\n![](images/check-small.svg)\\n\\nExpert guidance and integration planning\\n\\n![](images/check-small.svg)\\n\\nCustom proxy pools\\n\\n![](images/check-small.svg)\\n\\nCustom avoidances\\n\\n![](images/check-small.svg)\\n\\nDedicated manager\\n\\n### Business Pro\\n\\n#### 8.000.000 API credits\\n\\n$599\\n\\n/mo\\n\\nExtended volume Business plan.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nLive integration calls\\n\\n![](images/check-small.svg)\\n\\nExpert guidance and integration planning\\n\\n![](images/check-small.svg)\\n\\nCustom proxy pools\\n\\n![](images/check-small.svg)\\n\\nCustom avoidances\\n\\n![](images/check-small.svg)\\n\\nDedicated manager\\n\\n### Custom Plan\\n\\n#### 10M+ API credits\\n\\n$699+\\n\\n/mo\\n\\nExplore custom deals and services we could provide for Enterprise level\\ncustomers.\\n\\n[ Contact us ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nFully customisable solution\\n\\n![](images/check-small.svg)\\n\\nResidential Proxy special prices\\n\\n![](images/check-small.svg)\\n\\nSLA\\n\\n[![](images/Capterra-\\nRating.png)](https://www.capterra.com/p/214735/ScrapingAnt/reviews/)\\n\\n★ ★ ★ ★ ★\\n\\n![](images/5521ce5758e089d7d7f5d226a2e995c3.jpg)\\n\\n#### “Onboarding and API integration was smooth and clear. Everything works\\ngreat. The support was excellent. **Overall a great scraper**.”\\n\\nIllia K., Android Software Developer\\n\\n★ ★ ★ ★ ★\\n\\n![](images/e57164aafb18d9a888776c96cf159368.jpg)\\n\\n#### “Great communication with co-founders helped me to get the job done.\\nGreat proxy diversity and good price.”\\n\\nAndrii M., Senior Software Engineer\\n\\n★ ★ ★ ★ ★\\n\\n![](images/Dmytro-T..jpg)\\n\\n#### “This product helps me to scale and extend my business. 
The API is easy\\nto integrate and support is really good.”\\n\\nDmytro T., Senior Software Engineer\\n\\n![](images/Doodle-7-Dark.svg)![](images/Doodle-8-Dark.svg)\\n\\n#### Frequently asked questions.\\n\\nIf you have any further questions, [Get in\\ntouch](https://scrapingant.com/#contact) with our friendly team\\n\\n##### What is ScrapingAnt?\\n\\n![](images/icon-arrow-right.svg)\\n\\nScrapingAnt is a service that helps you to solve scraping tasks of any\\ncomplexity. With using of millions proxies around the World and a whole\\nheadless browser cluster we can provide you the best web harvesting and\\nscraping experience. \\n \\nScrapingAnt also provides a custom software development service. Data\\nharvesting, data storage or data querying - we can provide you the best and\\naffordable custom solution that fits all your needs.\\n\\n##### **What is an API Credit?**\\n\\n![](images/icon-arrow-right.svg)\\n\\nEach subscription plan contains a particular amount of API credits per month.\\nDepending on the parameters you configures your API calls it will cost you\\nfrom one to several credits. By default, each request costs 10 API credits\\nbecause JavaScript rendering and Standard proxies are enabled. [Learn more\\nabout requests costs](https://docs.scrapingant.com/api-credits-usage).\\n\\n##### I'm not a developer, can you create custom scraping solutions for me?\\n\\n![](images/icon-arrow-right.svg)\\n\\nYes of course! We regularly create custom scraping scripts and projects for\\nour clients. We are also partnering with several custom software development\\ncompanies, so we won't never be out of resources to help with a scraping\\nproject of any size. Just [Contact Us](https://scrapingant.com/#contact) and\\ndescribe your needs.\\n\\n##### Do I need a credit cart to start the free trial?\\n\\n![](images/icon-arrow-right.svg)\\n\\nScrapingAnt provides a completely free subscription plan which contains 10.000\\nAPI credits that can be consumed during month. Until you will need more - it\\nis completely free and doesn't require a credit card.\\n\\n### “Our clients are pleasantly surprised by the response speed of our team.”\\n\\n![](images/oleg-cartoon-image.jpg)\\n\\nOleg Kulyk, \\nScrapingAnt Founder\\n\\n* Our team will contact you ASAP.\\n\\nThank you! Your submission has been received!\\n\\nOops! Something went wrong while submitting the form.\\n\\n![](images/illustration-speed-lines-white.svg)\\n\\n## Grow your business with us\\n\\n[ Try Our Free Plan! 
](https://app.scrapingant.com/signup)\\n\\n[\\n\\n## Features\\n\\n](https://scrapingant.com/#features) [\\n\\n## Pricing\\n\\n](https://scrapingant.com/#pricing) [\\n\\n## Blog\\n\\n](https://scrapingant.com/blog/) [\\n\\n## Documentation\\n\\n](https://docs.scrapingant.com/) [\\n\\n## Web Scraping API\\n\\n](https://scrapingant.com) [\\n\\n## LLM-ready web data\\n\\n](llm-ready-data-extraction.html) [\\n\\n## Residential Proxy\\n\\n](residential-proxies.html) [\\n\\n## Custom Scraper Development\\n\\n](https://scrapingant.com/custom-scraping-solution) [\\n\\n## Affiliate program\\n\\n](https://scrapingant.com/legal/affiliate/) [\\n\\n## Free proxies\\n\\n](https://scrapingant.com/free-proxies/)\\n\\n###### Web Scraping 101 \\n\\n[What is Web Scraping?](https://docs.scrapingant.com/web-scraping-101/what-is-\\nweb-scraping) [**Is Web Scraping Legal?**](https://scrapingant.com/blog/is-\\nweb-scraping-legal) [**10 Main Proxy\\nTypes**](https://scrapingant.com/blog/main-proxy-types) [Datacenter vs\\nResidential Proxies](https://scrapingant.com/blog/residential-vs-datacenter-\\nproxy-webscraping) [Best Proxy Scraping\\nTools](https://scrapingant.com/blog/top-open-source-proxy-scrapers)\\n[**Overcoming scraping challenges with Web Scraping\\nAPI**](https://scrapingant.com/blog/data-scraping-challenges) [IP rate-\\nlimiting avoidance](https://scrapingant.com/blog/avoid-ip-rate-limiting)\\n[Rotating proxies with Puppeteer](https://scrapingant.com/blog/how-to-use-\\nrotating-proxies-with-puppeteer) [Scraping Dynamic Website with\\nPython](https://scrapingant.com/blog/scrape-dynamic-website-with-python) [Web\\nScraping with Python](https://scrapingant.com/blog/top-5-popular-python-\\nlibraries-for-web-scraping-in-2020) [Web Scraping with\\nJava](https://scrapingant.com/blog/web-scraping-java) [Web Scraping with\\nNodeJS](https://scrapingant.com/blog/web-scraping-javascript) [Web Scraping\\nwith Deno](https://scrapingant.com/blog/deno-web-scraping) [**Web Scraping\\nwith R**](https://scrapingant.com/blog/r-web-scraping) [**Web Scraping with\\nPHP**](https://scrapingant.com/blog/web-scraping-php) [**Web Scraping with\\nGo**](https://scrapingant.com/blog/web-scraping-go)\\n\\n###### Use Cases \\n\\n[**Real estate decisions with Booking.com\\nscraping**](https://scrapingant.com/blog/booking-data-scraping) [**Sneaker\\nPrice Data Collection with Web Scraping\\nAPI**](https://scrapingant.com/blog/sneakers-scraping-api) [**Best Web\\nScraping APIs For Freelancers**](https://scrapingant.com/blog/best-web-\\nscraping-api-freelance) [**Smart NFT Decisions with Data\\nCollection**](https://scrapingant.com/blog/nft-data-collection) [**How Data\\nCollection Can Improve HR Processes**](https://scrapingant.com/blog/data-\\ncollection-for-hr-processes) [**Rule eCommerce with Data\\nCollection**](https://scrapingant.com/blog/data-collection-for-ecommerce)\\n[**How companies use Web Scraping to gain a Competitive\\nEdge**](https://scrapingant.com/blog/how-companies-use-web-scraping)\\n[**Benefits of Web Scraping for\\nHospitality**](https://scrapingant.com/blog/web-scraping-for-hospitality)\\n[**Uses of Web Scraping for Price\\nMonitoring**](https://scrapingant.com/blog/web-scraping-for-price-monitoring)\\n[**Benefits of Web Scraping for Real\\nEstate**](https://scrapingant.com/blog/web-scraping-for-real-estate) [**Web\\nScraping for Data Scientists**](https://scrapingant.com/blog/web-scraping-for-\\ndata-scientists) [**How to Collect Data 
from\\nTikTok**](https://scrapingant.com/blog/web-scraping-for-price-monitoring)\\n\\n###### Legal \\n\\n[Terms of Use](https://scrapingant.com/legal/terms-of-use) [Privacy\\nPolicy](https://scrapingant.com/legal/privacy-policy) [Cookies\\nPolicy](https://scrapingant.com/legal/cookies-policy)\\n\\n###### External Links \\n\\n[Github](https://github.com/ScrapingAnt)\\n[Linkedin](https://linkedin.com/company/scrapingant)\\n[Facebook](https://www.facebook.com/scrapingant)\\n[Twitter](https://twitter.com/ScrapingAnt)\\n\\n[![](images/ScrapingAnt-2.svg)](https://scrapingant.com)\\n\\n© Copyright ScrapingAnt \\nPowered by [DATAANT](https://scrapingant.com)\\n\\n![](images/lines-13-white.svg)\\n\\nBy browsing this site, you agree to our [Cookies\\nPolicy](https://scrapingant.com/legal/cookies-policy)\\n\\n![](images/icon-x_1.svg)\\n\\n\"), Document(metadata={'url': 'https://example.com/'}, page_content='# Example Domain\\n\\nThis domain is for use in illustrative examples in documents. You may use this\\ndomain in literature without prior coordination or asking for permission.\\n\\n[More information...](https://www.iana.org/domains/example)\\n\\n')]\n" + ] + } + ], + "execution_count": 6 + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": "The ScrapingAntLoader also allows providing a dict - scraping config for customizing the scrape request. As it is based on the [ScrapingAnt Python SDK](https://github.com/ScrapingAnt/scrapingant-client-python) you can pass any of the [common arguments](https://github.com/ScrapingAnt/scrapingant-client-python) to the `scrape_config` parameter." + }, + { + "cell_type": "code", + "metadata": { + "ExecuteTime": { + "end_time": "2024-07-21T22:02:30.701905Z", + "start_time": "2024-07-21T22:02:29.036115Z" + } + }, + "source": [ + "from langchain_community.document_loaders import ScrapingAntLoader\n", + "\n", + "scrapingant_config = {\n", + " \"browser\": True, # Enable browser rendering with a cloud browser\n", + " \"proxy_type\": \"datacenter\", # Select a proxy type (datacenter or residential)\n", + " \"proxy_country\": \"us\", # Select a proxy location\n", + "}\n", + "\n", + "scrapingant_additional_config_loader = ScrapingAntLoader(\n", + " [\"https://scrapingant.com/\"],\n", + " api_key=\"\", # Get your API key from https://scrapingant.com/\n", + " continue_on_failure=True, # Ignore unprocessable web pages and log their exceptions\n", + " scrape_config=scrapingant_config, # Pass the scrape_config object\n", + ")" + ], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[Document(metadata={'url': 'https://scrapingant.com/'}, page_content=\"![](images/loader.svg)\\n\\n[![](images/ScrapingAnt-1.svg)](/) Features Pricing\\n\\nServices\\n\\n[Web Scraping API](/) [LLM-ready data extraction](/llm-ready-data-extraction)\\n[AI data scraping](/ai-data-scraper) [Residential Proxy](/residential-proxies)\\n\\n[Blog](https://scrapingant.com/blog/)\\n\\nDocumentatation\\n\\n[Web Scraping API](https://docs.scrapingant.com) [Residential\\nProxies](https://proxydocs.scrapingant.com)\\n\\nContact Us\\n\\n[Sign In](https://app.scrapingant.com/login)\\n\\n![](images/icon-menu.svg)\\n\\n![](images/Capterra-Rating.png)\\n\\n# Enterprise-Grade Scraping API. \\nAnt Sized Pricing.\\n\\n## Get the mission-critical speed, reliability, and features you need at a\\nfraction of the cost! \\n\\nGot Questions? 
\\n(get expert advice)\\n\\n[ Try Our Free Plan (10,000 API Credits) ](https://app.scrapingant.com/signup)\\n\\n![](images/lines-10-white.svg)![](images/lines-12-white.svg)\\n\\n### Proudly scaling with us\\n\\n![](images/_2cd6c6d09d261d19_281d72aa098ecca8.png)![](images/_bb8ca9c8d001abd4_dc29a36ce27bdee8_1_bb8ca9c8d001abd4_dc29a36ce27bdee8.png)![](images/_d84700234b61df23_9abf58d176a2d7fc.png)![](images/_ca6d37170ae5cd25_fca779750afd17ef.png)![](images/Screenshot-2024-05-22-at-23.28.16.png)\\n\\n### Industry Leading Pricing\\n\\nFrom our generous 10,000 API credit free plan to our industry leading paid\\nplans, we strive to provide unbeatable bang for your buck. That's just what\\nants do! \\n\\u200d\\n\\n![](images/industry-leading-prcing--compressed.webp)\\n\\nCost per 1,000 API Credits - Level 1 Plan\\n\\n### Unparalleled Value\\n\\nLow cost per API credit is great, but what’s even more important is how much\\ndata you can actually collect for each credit spent. Like any good Ant we\\nnever waste a crumb!\\n\\n![](images/unparalleled-value-compressed.webp)\\n\\nGoogle SERP API - Cost per 1,000 Requests – Level 1 Plan\\n\\n![](images/Doodle-4-White.svg)![](images/Doodle-Left-1-White.svg)\\n\\n## Ultimate Black Box Scraping Solution\\n\\n### Unlimited Concurrency \\n\\u200d\\n\\nWith unlimited parallel requests easily gather LARGE volumes of data from\\nmultiple locations in record time. Available on ALL plan levels. \\n\\u200d\\n\\n### Lightning Fast Scraping WITHOUT Getting Blocked\\n\\nOur proprietary algo seamlessly switches to the exact right proxy for almost\\nany situation, saving you and your dev team countless hours of frustration. \\n\\u200d\\n\\n#### What's inside?\\n\\n * Chrome Page Rendering\\n\\n * Low Latency Rotating Proxies \\n\\n * Javascript Execution\\n\\n * Custom Cookies\\n\\n * Fastest AWS & Hetzner Servers\\n\\n * Unlimited Parallel Requests\\n\\n * Headless Browsers \\n\\n * Residential Proxies\\n\\n * Supports All Programming Languages & Proxy\\n\\n * CAPTCHA Avoidance\\n\\n[ Try Our Free Plan (10,000 API Credits) ](https://app.scrapingant.com/signup)\\n\\n![](images/Doodle-3-White.svg)\\n\\n###### Metrics\\n\\n## The most reliable web scraping API\\n\\nOur clients have saved up to 40% of data collection budgets by integrating\\nScrapingAnt API instead of self-made solutions development.\\n\\n99.99%\\n\\nUptime over the last year.\\n\\n85.5%\\n\\nAnti-scraping avoidance rate with our custom cloud browser solution\\n\\n![](images/icon-gallery-dark.svg)\\n\\n### Unlimited parallel requests\\n\\n![](images/icon-id-dark.svg)\\n\\n### 3+ million proxy servers across the world\\n\\n![](images/icon-switcher-white.svg)\\n\\n### Open your web page as in a real browser\\n\\n![](images/Doodle-9-Dark.svg)\\n\\nSimple API integration\\n\\n1\\n\\n### Choose your plan\\n\\nWe offer subscription plans, or you can always request custom pricing. 
\\n **Free for personal use!**\\n\\n2\\n\\n### Test the API\\n\\nScrape your target website with our **UI request executor** or generate\\nscraping code for your preferred language.\\n\\n3\\n\\n### Scrape the Web\\n\\nBuild your data extraction pipeline using our **API** and forget about **rate\\nlimits** and **blocks**.\\n\\n![](images/Doodle-4-White.svg)![](images/Doodle-Left-1-White.svg)\\n\\n###### Pricing\\n\\n## Industry leading pricing that scales with your business.\\n\\n### Enthusiast\\n\\n#### 100.000 API credits\\n\\n$19\\n\\n/mo\\n\\nIdeal for freelancers or students.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nEmail support\\n\\n![](images/check-small.svg)\\n\\nDocumentation-only integration\\n\\n### Startup\\n\\n#### 500.000 API credits\\n\\n$49\\n\\n/mo\\n\\nFor small to medium sized teams looking to grow. \\n \\nPopular choice!\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nExpert assistance\\n\\n![](images/check-small.svg)\\n\\nIntegration with custom code snippets\\n\\n### Business\\n\\n#### 3.000.000 API credits\\n\\n$249\\n\\n/mo\\n\\nFor larger teams and companies.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nLive integration calls\\n\\n![](images/check-small.svg)\\n\\nExpert guidance and integration planning\\n\\n![](images/check-small.svg)\\n\\nCustom proxy pools\\n\\n![](images/check-small.svg)\\n\\nCustom avoidances\\n\\n![](images/check-small.svg)\\n\\nDedicated manager\\n\\n### Business Pro\\n\\n#### 8.000.000 API credits\\n\\n$599\\n\\n/mo\\n\\nExtended volume Business plan.\\n\\n[ Get Started ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nPriority email support\\n\\n![](images/check-small.svg)\\n\\nLive integration calls\\n\\n![](images/check-small.svg)\\n\\nExpert guidance and integration planning\\n\\n![](images/check-small.svg)\\n\\nCustom proxy pools\\n\\n![](images/check-small.svg)\\n\\nCustom avoidances\\n\\n![](images/check-small.svg)\\n\\nDedicated manager\\n\\n### Custom Plan\\n\\n#### 10M+ API credits\\n\\n$699+\\n\\n/mo\\n\\nExplore custom deals and services we could provide for Enterprise level\\ncustomers.\\n\\n[ Contact us ](https://app.scrapingant.com/signup)\\n\\n![](images/check-small.svg)\\n\\nFully customisable solution\\n\\n![](images/check-small.svg)\\n\\nResidential Proxy special prices\\n\\n![](images/check-small.svg)\\n\\nSLA\\n\\n[![](images/Capterra-\\nRating.png)](https://www.capterra.com/p/214735/ScrapingAnt/reviews/)\\n\\n★ ★ ★ ★ ★\\n\\n![](images/5521ce5758e089d7d7f5d226a2e995c3.jpg)\\n\\n#### “Onboarding and API integration was smooth and clear. Everything works\\ngreat. The support was excellent. **Overall a great scraper**.”\\n\\nIllia K., Android Software Developer\\n\\n★ ★ ★ ★ ★\\n\\n![](images/e57164aafb18d9a888776c96cf159368.jpg)\\n\\n#### “Great communication with co-founders helped me to get the job done.\\nGreat proxy diversity and good price.”\\n\\nAndrii M., Senior Software Engineer\\n\\n★ ★ ★ ★ ★\\n\\n![](images/Dmytro-T..jpg)\\n\\n#### “This product helps me to scale and extend my business. 
The API is easy\\nto integrate and support is really good.”\\n\\nDmytro T., Senior Software Engineer\\n\\n![](images/Doodle-7-Dark.svg)![](images/Doodle-8-Dark.svg)\\n\\n#### Frequently asked questions.\\n\\nIf you have any further questions, [Get in\\ntouch](https://scrapingant.com/#contact) with our friendly team\\n\\n##### What is ScrapingAnt?\\n\\n![](images/icon-arrow-right.svg)\\n\\nScrapingAnt is a service that helps you to solve scraping tasks of any\\ncomplexity. With using of millions proxies around the World and a whole\\nheadless browser cluster we can provide you the best web harvesting and\\nscraping experience. \\n \\nScrapingAnt also provides a custom software development service. Data\\nharvesting, data storage or data querying - we can provide you the best and\\naffordable custom solution that fits all your needs.\\n\\n##### **What is an API Credit?**\\n\\n![](images/icon-arrow-right.svg)\\n\\nEach subscription plan contains a particular amount of API credits per month.\\nDepending on the parameters you configures your API calls it will cost you\\nfrom one to several credits. By default, each request costs 10 API credits\\nbecause JavaScript rendering and Standard proxies are enabled. [Learn more\\nabout requests costs](https://docs.scrapingant.com/api-credits-usage).\\n\\n##### I'm not a developer, can you create custom scraping solutions for me?\\n\\n![](images/icon-arrow-right.svg)\\n\\nYes of course! We regularly create custom scraping scripts and projects for\\nour clients. We are also partnering with several custom software development\\ncompanies, so we won't never be out of resources to help with a scraping\\nproject of any size. Just [Contact Us](https://scrapingant.com/#contact) and\\ndescribe your needs.\\n\\n##### Do I need a credit cart to start the free trial?\\n\\n![](images/icon-arrow-right.svg)\\n\\nScrapingAnt provides a completely free subscription plan which contains 10.000\\nAPI credits that can be consumed during month. Until you will need more - it\\nis completely free and doesn't require a credit card.\\n\\n### “Our clients are pleasantly surprised by the response speed of our team.”\\n\\n![](images/oleg-cartoon-image.jpg)\\n\\nOleg Kulyk, \\nScrapingAnt Founder\\n\\n* Our team will contact you ASAP.\\n\\nThank you! Your submission has been received!\\n\\nOops! Something went wrong while submitting the form.\\n\\n![](images/illustration-speed-lines-white.svg)\\n\\n## Grow your business with us\\n\\n[ Try Our Free Plan! 
](https://app.scrapingant.com/signup)\\n\\n[\\n\\n## Features\\n\\n](https://scrapingant.com/#features) [\\n\\n## Pricing\\n\\n](https://scrapingant.com/#pricing) [\\n\\n## Blog\\n\\n](https://scrapingant.com/blog/) [\\n\\n## Documentation\\n\\n](https://docs.scrapingant.com/) [\\n\\n## Web Scraping API\\n\\n](https://scrapingant.com) [\\n\\n## LLM-ready web data\\n\\n](llm-ready-data-extraction.html) [\\n\\n## Residential Proxy\\n\\n](residential-proxies.html) [\\n\\n## Custom Scraper Development\\n\\n](https://scrapingant.com/custom-scraping-solution) [\\n\\n## Affiliate program\\n\\n](https://scrapingant.com/legal/affiliate/) [\\n\\n## Free proxies\\n\\n](https://scrapingant.com/free-proxies/)\\n\\n###### Web Scraping 101 \\n\\n[What is Web Scraping?](https://docs.scrapingant.com/web-scraping-101/what-is-\\nweb-scraping) [**Is Web Scraping Legal?**](https://scrapingant.com/blog/is-\\nweb-scraping-legal) [**10 Main Proxy\\nTypes**](https://scrapingant.com/blog/main-proxy-types) [Datacenter vs\\nResidential Proxies](https://scrapingant.com/blog/residential-vs-datacenter-\\nproxy-webscraping) [Best Proxy Scraping\\nTools](https://scrapingant.com/blog/top-open-source-proxy-scrapers)\\n[**Overcoming scraping challenges with Web Scraping\\nAPI**](https://scrapingant.com/blog/data-scraping-challenges) [IP rate-\\nlimiting avoidance](https://scrapingant.com/blog/avoid-ip-rate-limiting)\\n[Rotating proxies with Puppeteer](https://scrapingant.com/blog/how-to-use-\\nrotating-proxies-with-puppeteer) [Scraping Dynamic Website with\\nPython](https://scrapingant.com/blog/scrape-dynamic-website-with-python) [Web\\nScraping with Python](https://scrapingant.com/blog/top-5-popular-python-\\nlibraries-for-web-scraping-in-2020) [Web Scraping with\\nJava](https://scrapingant.com/blog/web-scraping-java) [Web Scraping with\\nNodeJS](https://scrapingant.com/blog/web-scraping-javascript) [Web Scraping\\nwith Deno](https://scrapingant.com/blog/deno-web-scraping) [**Web Scraping\\nwith R**](https://scrapingant.com/blog/r-web-scraping) [**Web Scraping with\\nPHP**](https://scrapingant.com/blog/web-scraping-php) [**Web Scraping with\\nGo**](https://scrapingant.com/blog/web-scraping-go)\\n\\n###### Use Cases \\n\\n[**Real estate decisions with Booking.com\\nscraping**](https://scrapingant.com/blog/booking-data-scraping) [**Sneaker\\nPrice Data Collection with Web Scraping\\nAPI**](https://scrapingant.com/blog/sneakers-scraping-api) [**Best Web\\nScraping APIs For Freelancers**](https://scrapingant.com/blog/best-web-\\nscraping-api-freelance) [**Smart NFT Decisions with Data\\nCollection**](https://scrapingant.com/blog/nft-data-collection) [**How Data\\nCollection Can Improve HR Processes**](https://scrapingant.com/blog/data-\\ncollection-for-hr-processes) [**Rule eCommerce with Data\\nCollection**](https://scrapingant.com/blog/data-collection-for-ecommerce)\\n[**How companies use Web Scraping to gain a Competitive\\nEdge**](https://scrapingant.com/blog/how-companies-use-web-scraping)\\n[**Benefits of Web Scraping for\\nHospitality**](https://scrapingant.com/blog/web-scraping-for-hospitality)\\n[**Uses of Web Scraping for Price\\nMonitoring**](https://scrapingant.com/blog/web-scraping-for-price-monitoring)\\n[**Benefits of Web Scraping for Real\\nEstate**](https://scrapingant.com/blog/web-scraping-for-real-estate) [**Web\\nScraping for Data Scientists**](https://scrapingant.com/blog/web-scraping-for-\\ndata-scientists) [**How to Collect Data 
from\\nTikTok**](https://scrapingant.com/blog/web-scraping-for-price-monitoring)\\n\\n###### Legal \\n\\n[Terms of Use](https://scrapingant.com/legal/terms-of-use) [Privacy\\nPolicy](https://scrapingant.com/legal/privacy-policy) [Cookies\\nPolicy](https://scrapingant.com/legal/cookies-policy)\\n\\n###### External Links \\n\\n[Github](https://github.com/ScrapingAnt)\\n[Linkedin](https://linkedin.com/company/scrapingant)\\n[Facebook](https://www.facebook.com/scrapingant)\\n[Twitter](https://twitter.com/ScrapingAnt)\\n\\n[![](images/ScrapingAnt-2.svg)](https://scrapingant.com)\\n\\n© Copyright ScrapingAnt \\nPowered by [DATAANT](https://scrapingant.com)\\n\\n![](images/lines-13-white.svg)\\n\\nBy browsing this site, you agree to our [Cookies\\nPolicy](https://scrapingant.com/legal/cookies-policy)\\n\\n![](images/icon-x_1.svg)\\n\\n\")]\n" + ] + } + ], + "execution_count": 5 + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Load\n", + "\n", + "Use the `load` method to scrape the web pages and get the extracted markdown content.\n" + ] + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "# Load documents from URLs as markdown\n", + "documents = scrapingant_loader.load()\n", + "\n", + "print(documents)" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Lazy Load\n", + "\n", + "Use the 'lazy_load' method to scrape the web pages and get the extracted markdown content lazily." + ] + }, + { + "metadata": {}, + "cell_type": "code", + "outputs": [], + "execution_count": null, + "source": [ + "# Lazy load documents from URLs as markdown\n", + "lazy_documents = scrapingant_loader.lazy_load()\n", + "\n", + "for document in lazy_documents:\n", + " print(document)" + ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## API reference\n", + "\n", + "This loader is based on the [ScrapingAnt Python SDK](https://docs.scrapingant.com/python-client). For more configuration options, see the [common arguments](https://github.com/ScrapingAnt/scrapingant-client-python/tree/master?tab=readme-ov-file#common-arguments)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.1" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/docs/integrations/memory/tidb_chat_message_history.ipynb b/docs/docs/integrations/memory/tidb_chat_message_history.ipynb index 35a4d99fcf8bb..4eda9df5f9a69 100644 --- a/docs/docs/integrations/memory/tidb_chat_message_history.ipynb +++ b/docs/docs/integrations/memory/tidb_chat_message_history.ipynb @@ -6,7 +6,7 @@ "source": [ "# TiDB\n", "\n", - "> [TiDB Cloud](https://tidbcloud.com/), is a comprehensive Database-as-a-Service (DBaaS) solution, that provides dedicated and serverless options. TiDB Serverless is now integrating a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly develop AI applications using TiDB Serverless without the need for a new database or additional technical stacks. 
Be among the first to experience it by joining the waitlist for the private beta at https://tidb.cloud/ai.\n", + "> [TiDB Cloud](https://www.pingcap.com/tidb-serverless/), is a comprehensive Database-as-a-Service (DBaaS) solution, that provides dedicated and serverless options. TiDB Serverless is now integrating a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly develop AI applications using TiDB Serverless without the need for a new database or additional technical stacks. Create a free TiDB Serverless cluster and start using the vector search feature at https://pingcap.com/ai.\n", "\n", "This notebook introduces how to use TiDB to store chat message history. " ] diff --git a/docs/docs/integrations/providers/cassandra.mdx b/docs/docs/integrations/providers/cassandra.mdx index cbef4c693bc22..0a29c393e1c58 100644 --- a/docs/docs/integrations/providers/cassandra.mdx +++ b/docs/docs/integrations/providers/cassandra.mdx @@ -68,3 +68,18 @@ Learn more in the [example notebook](/docs/integrations/document_loaders/cassand > Apache Cassandra, Cassandra and Apache are either registered trademarks or trademarks of > the [Apache Software Foundation](http://www.apache.org/) in the United States and/or other countries. + +## Toolkit + +The `Cassandra Database toolkit` enables AI engineers to efficiently integrate agents +with Cassandra data. + +```python +from langchain_community.agent_toolkits.cassandra_database.toolkit import ( + CassandraDatabaseToolkit, +) +``` + +Learn more in the [example notebook](/docs/integrations/toolkits/cassandra_database). + + diff --git a/docs/docs/integrations/providers/sklearn.mdx b/docs/docs/integrations/providers/sklearn.mdx index ebb93942afce7..a2d9e0554d706 100644 --- a/docs/docs/integrations/providers/sklearn.mdx +++ b/docs/docs/integrations/providers/sklearn.mdx @@ -20,3 +20,16 @@ from langchain_community.vectorstores import SKLearnVectorStore ``` For a more detailed walkthrough of the SKLearnVectorStore wrapper, see [this notebook](/docs/integrations/vectorstores/sklearn). + + +## Retriever + +`Support vector machines (SVMs)` are the supervised learning +methods used for classification, regression and outliers detection. + +See a [usage example](/docs/integrations/retrievers/svm). + +```python +from langchain_community.retrievers import SVMRetriever +``` + diff --git a/docs/docs/integrations/providers/slack.mdx b/docs/docs/integrations/providers/slack.mdx index d5d632dc74bb7..fe0790de71ab6 100644 --- a/docs/docs/integrations/providers/slack.mdx +++ b/docs/docs/integrations/providers/slack.mdx @@ -7,7 +7,6 @@ There isn't any special setup for it. - ## Document loader See a [usage example](/docs/integrations/document_loaders/slack). @@ -16,6 +15,14 @@ See a [usage example](/docs/integrations/document_loaders/slack). from langchain_community.document_loaders import SlackDirectoryLoader ``` +## Toolkit + +See a [usage example](/docs/integrations/toolkits/slack). + +```python +from langchain_community.agent_toolkits import SlackToolkit +``` + ## Chat loader See a [usage example](/docs/integrations/chat_loaders/slack). diff --git a/docs/docs/integrations/providers/snowflake.mdx b/docs/docs/integrations/providers/snowflake.mdx index cea1755b8810f..c42c719758803 100644 --- a/docs/docs/integrations/providers/snowflake.mdx +++ b/docs/docs/integrations/providers/snowflake.mdx @@ -7,8 +7,8 @@ This page covers how to use the `Snowflake` ecosystem within `LangChain`. 
## Embedding models -Snowflake offers their open weight `arctic` line of embedding models for free -on [Hugging Face](https://huggingface.co/Snowflake/snowflake-arctic-embed-l). +Snowflake offers their open-weight `arctic` line of embedding models for free +on [Hugging Face](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5). The most recent model, snowflake-arctic-embed-m-v1.5 feature [matryoshka embedding](https://arxiv.org/abs/2205.13147) which allows for effective vector truncation. You can use these models via the [HuggingFaceEmbeddings](/docs/integrations/text_embedding/huggingfacehub) connector: @@ -19,7 +19,7 @@ pip install langchain-community sentence-transformers ```python from langchain_huggingface import HuggingFaceEmbeddings -model = HuggingFaceEmbeddings(model_name="snowflake/arctic-embed-l") +model = HuggingFaceEmbeddings(model_name="snowflake/arctic-embed-m-v1.5") ``` ## Document loader diff --git a/docs/docs/integrations/providers/tidb.mdx b/docs/docs/integrations/providers/tidb.mdx index 132643b5b7e64..401b4300c48f7 100644 --- a/docs/docs/integrations/providers/tidb.mdx +++ b/docs/docs/integrations/providers/tidb.mdx @@ -1,10 +1,10 @@ # TiDB -> [TiDB Cloud](https://tidbcloud.com/), is a comprehensive Database-as-a-Service (DBaaS) solution, +> [TiDB Cloud](https://www.pingcap.com/tidb-serverless), is a comprehensive Database-as-a-Service (DBaaS) solution, > that provides dedicated and serverless options. `TiDB Serverless` is now integrating > a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly > develop AI applications using `TiDB Serverless` without the need for a new database or additional -> technical stacks. Be among the first to experience it by joining the [waitlist for the private beta](https://tidb.cloud/ai). +> technical stacks. Create a free TiDB Serverless cluster and start using the vector search feature at https://pingcap.com/ai. ## Installation and Setup diff --git a/docs/docs/integrations/toolkits/cassandra_database.ipynb b/docs/docs/integrations/toolkits/cassandra_database.ipynb index 256952a9fe4ed..7d33d09471973 100644 --- a/docs/docs/integrations/toolkits/cassandra_database.ipynb +++ b/docs/docs/integrations/toolkits/cassandra_database.ipynb @@ -6,23 +6,28 @@ "source": [ "# Cassandra Database\n", "\n", - "Apache Cassandra® is a widely used database for storing transactional application data. The introduction of functions and tooling in Large Language Models has opened up some exciting use cases for existing data in Generative AI applications. The Cassandra Database toolkit enables AI engineers to efficiently integrate Agents with Cassandra data, offering the following features: \n", - " - Fast data access through optimized queries. Most queries should run in single-digit ms or less. \n", - " - Schema introspection to enhance LLM reasoning capabilities \n", - " - Compatibility with various Cassandra deployments, including Apache Cassandra®, DataStax Enterprise™, and DataStax Astra™ \n", - " - Currently, the toolkit is limited to SELECT queries and schema introspection operations. (Safety first)\n", + ">`Apache Cassandra®` is a widely used database for storing transactional application data. The introduction of functions and >tooling in Large Language Models has opened up some exciting use cases for existing data in Generative AI applications. 
\n", + "\n", + ">The `Cassandra Database` toolkit enables AI engineers to integrate agents with Cassandra data efficiently, offering \n", + ">the following features: \n", + "> - Fast data access through optimized queries. Most queries should run in single-digit ms or less.\n", + "> - Schema introspection to enhance LLM reasoning capabilities\n", + "> - Compatibility with various Cassandra deployments, including Apache Cassandra®, DataStax Enterprise™, and DataStax Astra™\n", + "> - Currently, the toolkit is limited to SELECT queries and schema introspection operations. (Safety first)\n", + "\n", + "For more information on creating a Cassandra DB agent see the [CQL agent cookbook](https://github.com/langchain-ai/langchain/blob/master/cookbook/cql_agent.ipynb)\n", "\n", "## Quick Start\n", - " - Install the cassio library\n", + " - Install the `cassio` library\n", " - Set environment variables for the Cassandra database you are connecting to\n", - " - Initialize CassandraDatabase\n", - " - Pass the tools to your agent with toolkit.get_tools()\n", + " - Initialize `CassandraDatabase`\n", + " - Pass the tools to your agent with `toolkit.get_tools()`\n", " - Sit back and watch it do all your work for you\n", "\n", "## Theory of Operation\n", - "Cassandra Query Language (CQL) is the primary *human-centric* way of interacting with a Cassandra database. While offering some flexibility when generating queries, it requires knowledge of Cassandra data modeling best practices. LLM function calling gives an agent the ability to reason and then choose a tool to satisfy the request. Agents using LLMs should reason using Cassandra-specific logic when choosing the appropriate toolkit or chain of toolkits. This reduces the randomness introduced when LLMs are forced to provide a top-down solution. Do you want an LLM to have complete unfettered access to your database? Yeah. Probably not. To accomplish this, we provide a prompt for use when constructing questions for the agent: \n", "\n", - "```json\n", + "`Cassandra Query Language (CQL)` is the primary *human-centric* way of interacting with a Cassandra database. While offering some flexibility when generating queries, it requires knowledge of Cassandra data modeling best practices. LLM function calling gives an agent the ability to reason and then choose a tool to satisfy the request. Agents using LLMs should reason using Cassandra-specific logic when choosing the appropriate toolkit or chain of toolkits. This reduces the randomness introduced when LLMs are forced to provide a top-down solution. Do you want an LLM to have complete unfettered access to your database? Yeah. Probably not. 
To accomplish this, we provide a prompt for use when constructing questions for the agent: \n", + "\n", "You are an Apache Cassandra expert query analysis bot with the following features \n", "and rules:\n", " - You will take a question from the end user about finding specific \n", @@ -38,6 +43,7 @@ "\n", "The following is an example of a query path in JSON format:\n", "\n", + "```json\n", " {\n", " \"query_paths\": [\n", " {\n", @@ -448,13 +454,6 @@ "\n", "print(response[\"output\"])" ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "For a deepdive on creating a Cassandra DB agent see the [CQL agent cookbook](https://github.com/langchain-ai/langchain/blob/master/cookbook/cql_agent.ipynb)" - ] } ], "metadata": { @@ -473,7 +472,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/docs/integrations/vectorstores/tidb_vector.ipynb b/docs/docs/integrations/vectorstores/tidb_vector.ipynb index ee40990ac59be..069daecca4b03 100644 --- a/docs/docs/integrations/vectorstores/tidb_vector.ipynb +++ b/docs/docs/integrations/vectorstores/tidb_vector.ipynb @@ -6,7 +6,7 @@ "source": [ "# TiDB Vector\n", "\n", - "> [TiDB Cloud](https://tidbcloud.com/), is a comprehensive Database-as-a-Service (DBaaS) solution, that provides dedicated and serverless options. TiDB Serverless is now integrating a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly develop AI applications using TiDB Serverless without the need for a new database or additional technical stacks. Be among the first to experience it by joining the waitlist for the private beta at https://tidb.cloud/ai.\n", + "> [TiDB Cloud](https://www.pingcap.com/tidb-serverless), is a comprehensive Database-as-a-Service (DBaaS) solution, that provides dedicated and serverless options. TiDB Serverless is now integrating a built-in vector search into the MySQL landscape. With this enhancement, you can seamlessly develop AI applications using TiDB Serverless without the need for a new database or additional technical stacks. Create a free TiDB Serverless cluster and start using the vector search feature at https://pingcap.com/ai.\n", "\n", "This notebook provides a detailed guide on utilizing the TiDB Vector functionality, showcasing its features and practical applications." ] diff --git a/docs/docs/integrations/vectorstores/vdms.ipynb b/docs/docs/integrations/vectorstores/vdms.ipynb index 6a4a76bcfc757..7828ecbce779c 100644 --- a/docs/docs/integrations/vectorstores/vdms.ipynb +++ b/docs/docs/integrations/vectorstores/vdms.ipynb @@ -12,7 +12,8 @@ "VDMS supports:\n", "* K nearest neighbor search\n", "* Euclidean distance (L2) and inner product (IP)\n", - "* Libraries for indexing and computing distances: TileDBDense, TileDBSparse, FaissFlat (Default), FaissIVFFlat\n", + "* Libraries for indexing and computing distances: TileDBDense, TileDBSparse, FaissFlat (Default), FaissIVFFlat, Flinng\n", + "* Embeddings for text, images, and video\n", "* Vector and metadata searches\n", "\n", "VDMS has server and client components. 
To setup the server, see the [installation instructions](https://github.com/IntelLabs/vdms/blob/master/INSTALL.md) or use the [docker image](https://hub.docker.com/r/intellabs/vdms).\n", @@ -40,7 +41,7 @@ ], "source": [ "# Pip install necessary package\n", - "%pip install --upgrade --quiet pip sentence-transformers vdms \"unstructured-inference==0.6.6\";" + "%pip install --upgrade --quiet pip vdms sentence-transformers langchain-huggingface > /dev/null" ] }, { @@ -62,7 +63,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "e6061b270eef87de5319a6c5af709b36badcad8118069a8f6b577d2e01ad5e2d\n" + "b26917ffac236673ef1d035ab9c91fe999e29c9eb24aa6c7103d7baa6bf2f72d\n" ] } ], @@ -92,6 +93,9 @@ "outputs": [], "source": [ "import time\n", + "import warnings\n", + "\n", + "warnings.filterwarnings(\"ignore\")\n", "\n", "from langchain_community.document_loaders.text import TextLoader\n", "from langchain_community.vectorstores import VDMS\n", @@ -290,7 +294,7 @@ "source": [ "# add data\n", "collection_name = \"my_collection_faiss_L2\"\n", - "db = VDMS.from_documents(\n", + "db_FaissFlat = VDMS.from_documents(\n", " docs,\n", " client=vdms_client,\n", " ids=ids,\n", @@ -301,7 +305,7 @@ "# Query (No metadata filtering)\n", "k = 3\n", "query = \"What did the president say about Ketanji Brown Jackson\"\n", - "returned_docs = db.similarity_search(query, k=k, filter=None)\n", + "returned_docs = db_FaissFlat.similarity_search(query, k=k, filter=None)\n", "print_results(returned_docs, score=False)" ] }, @@ -392,25 +396,24 @@ "k = 3\n", "constraints = {\"page_number\": [\">\", 30], \"president_included\": [\"==\", True]}\n", "query = \"What did the president say about Ketanji Brown Jackson\"\n", - "returned_docs = db.similarity_search(query, k=k, filter=constraints)\n", + "returned_docs = db_FaissFlat.similarity_search(query, k=k, filter=constraints)\n", "print_results(returned_docs, score=False)" ] }, { "cell_type": "markdown", - "id": "a5984766", + "id": "92ab3370", "metadata": {}, "source": [ - "### Similarity Search using TileDBDense and Euclidean Distance\n", + "### Similarity Search using Faiss IVFFlat and Inner Product (IP) Distance\n", "\n", - "In this section, we add the documents to VDMS using TileDB Dense indexing and L2 as the distance metric for similarity search. We search for three documents (`k=3`) related to the query `What did the president say about Ketanji Brown Jackson` and also return the score along with the document.\n", - "\n" + "In this section, we add the documents to VDMS using Faiss IndexIVFFlat indexing and IP as the distance metric for similarity search. We search for three documents (`k=3`) related to the query `What did the president say about Ketanji Brown Jackson` and also return the score along with the document.\n" ] }, { "cell_type": "code", "execution_count": 8, - "id": "3001ba6e", + "id": "78f502cf", "metadata": {}, "outputs": [ { @@ -419,7 +422,7 @@ "text": [ "--------------------------------------------------\n", "\n", - "Score:\t1.2032090425491333\n", + "Score:\t1.2032090425\n", "\n", "Content:\n", "\tTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. 
\n", @@ -437,7 +440,7 @@ "\tsource:\t../../how_to/state_of_the_union.txt\n", "--------------------------------------------------\n", "\n", - "Score:\t1.495247483253479\n", + "Score:\t1.4952471256\n", "\n", "Content:\n", "\tAs Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accountable for the national experiment they’re conducting on our children for profit. \n", @@ -463,7 +466,7 @@ "\tsource:\t../../how_to/state_of_the_union.txt\n", "--------------------------------------------------\n", "\n", - "Score:\t1.5008409023284912\n", + "Score:\t1.5008399487\n", "\n", "Content:\n", "\tA former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n", @@ -489,36 +492,36 @@ } ], "source": [ - "db_tiledbD = VDMS.from_documents(\n", + "db_FaissIVFFlat = VDMS.from_documents(\n", " docs,\n", " client=vdms_client,\n", " ids=ids,\n", - " collection_name=\"my_collection_tiledbD_L2\",\n", + " collection_name=\"my_collection_FaissIVFFlat_IP\",\n", " embedding=embedding,\n", - " engine=\"TileDBDense\",\n", - " distance_strategy=\"L2\",\n", + " engine=\"FaissIVFFlat\",\n", + " distance_strategy=\"IP\",\n", ")\n", - "\n", + "# Query\n", "k = 3\n", "query = \"What did the president say about Ketanji Brown Jackson\"\n", - "docs_with_score = db_tiledbD.similarity_search_with_score(query, k=k, filter=None)\n", + "docs_with_score = db_FaissIVFFlat.similarity_search_with_score(query, k=k, filter=None)\n", "print_results(docs_with_score)" ] }, { "cell_type": "markdown", - "id": "92ab3370", + "id": "e66d9125", "metadata": {}, "source": [ - "### Similarity Search using Faiss IVFFlat and Euclidean Distance\n", + "### Similarity Search using FLINNG and IP Distance\n", "\n", - "In this section, we add the documents to VDMS using Faiss IndexIVFFlat indexing and L2 as the distance metric for similarity search. We search for three documents (`k=3`) related to the query `What did the president say about Ketanji Brown Jackson` and also return the score along with the document.\n" + "In this section, we add the documents to VDMS using Filters to Identify Near-Neighbor Groups (FLINNG) indexing and IP as the distance metric for similarity search. We search for three documents (`k=3`) related to the query `What did the president say about Ketanji Brown Jackson` and also return the score along with the document." ] }, { "cell_type": "code", "execution_count": 9, - "id": "78f502cf", + "id": "add81beb", "metadata": {}, "outputs": [ { @@ -527,7 +530,7 @@ "text": [ "--------------------------------------------------\n", "\n", - "Score:\t1.2032090425491333\n", + "Score:\t1.2032090425\n", "\n", "Content:\n", "\tTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", @@ -545,7 +548,7 @@ "\tsource:\t../../how_to/state_of_the_union.txt\n", "--------------------------------------------------\n", "\n", - "Score:\t1.495247483253479\n", + "Score:\t1.4952471256\n", "\n", "Content:\n", "\tAs Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accountable for the national experiment they’re conducting on our children for profit. 
\n", @@ -571,7 +574,7 @@ "\tsource:\t../../how_to/state_of_the_union.txt\n", "--------------------------------------------------\n", "\n", - "Score:\t1.5008409023284912\n", + "Score:\t1.5008399487\n", "\n", "Content:\n", "\tA former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n", @@ -597,19 +600,128 @@ } ], "source": [ - "db_FaissIVFFlat = VDMS.from_documents(\n", + "db_Flinng = VDMS.from_documents(\n", " docs,\n", " client=vdms_client,\n", " ids=ids,\n", - " collection_name=\"my_collection_FaissIVFFlat_L2\",\n", + " collection_name=\"my_collection_Flinng_IP\",\n", " embedding=embedding,\n", - " engine=\"FaissIVFFlat\",\n", - " distance_strategy=\"L2\",\n", + " engine=\"Flinng\",\n", + " distance_strategy=\"IP\",\n", ")\n", "# Query\n", "k = 3\n", "query = \"What did the president say about Ketanji Brown Jackson\"\n", - "docs_with_score = db_FaissIVFFlat.similarity_search_with_score(query, k=k, filter=None)\n", + "docs_with_score = db_Flinng.similarity_search_with_score(query, k=k, filter=None)\n", + "print_results(docs_with_score)" + ] + }, + { + "cell_type": "markdown", + "id": "a5984766", + "metadata": {}, + "source": [ + "### Similarity Search using TileDBDense and Euclidean Distance\n", + "\n", + "In this section, we add the documents to VDMS using TileDB Dense indexing and L2 as the distance metric for similarity search. We search for three documents (`k=3`) related to the query `What did the president say about Ketanji Brown Jackson` and also return the score along with the document.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "3001ba6e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "--------------------------------------------------\n", + "\n", + "Score:\t1.2032090425\n", + "\n", + "Content:\n", + "\tTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", + "\n", + "Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n", + "\n", + "One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n", + "\n", + "And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.\n", + "\n", + "Metadata:\n", + "\tid:\t32\n", + "\tpage_number:\t32\n", + "\tpresident_included:\tTrue\n", + "\tsource:\t../../how_to/state_of_the_union.txt\n", + "--------------------------------------------------\n", + "\n", + "Score:\t1.4952471256\n", + "\n", + "Content:\n", + "\tAs Frances Haugen, who is here with us tonight, has shown, we must hold social media platforms accountable for the national experiment they’re conducting on our children for profit. 
\n", + "\n", + "It’s time to strengthen privacy protections, ban targeted advertising to children, demand tech companies stop collecting personal data on our children. \n", + "\n", + "And let’s get all Americans the mental health services they need. More people they can turn to for help, and full parity between physical and mental health care. \n", + "\n", + "Third, support our veterans. \n", + "\n", + "Veterans are the best of us. \n", + "\n", + "I’ve always believed that we have a sacred obligation to equip all those we send to war and care for them and their families when they come home. \n", + "\n", + "My administration is providing assistance with job training and housing, and now helping lower-income veterans get VA care debt-free. \n", + "\n", + "Our troops in Iraq and Afghanistan faced many dangers.\n", + "\n", + "Metadata:\n", + "\tid:\t37\n", + "\tpage_number:\t37\n", + "\tpresident_included:\tFalse\n", + "\tsource:\t../../how_to/state_of_the_union.txt\n", + "--------------------------------------------------\n", + "\n", + "Score:\t1.5008399487\n", + "\n", + "Content:\n", + "\tA former top litigator in private practice. A former federal public defender. And from a family of public school educators and police officers. A consensus builder. Since she’s been nominated, she’s received a broad range of support—from the Fraternal Order of Police to former judges appointed by Democrats and Republicans. \n", + "\n", + "And if we are to advance liberty and justice, we need to secure the Border and fix the immigration system. \n", + "\n", + "We can do both. At our border, we’ve installed new technology like cutting-edge scanners to better detect drug smuggling. \n", + "\n", + "We’ve set up joint patrols with Mexico and Guatemala to catch more human traffickers. \n", + "\n", + "We’re putting in place dedicated immigration judges so families fleeing persecution and violence can have their cases heard faster. \n", + "\n", + "We’re securing commitments and supporting partners in South and Central America to host more refugees and secure their own borders.\n", + "\n", + "Metadata:\n", + "\tid:\t33\n", + "\tpage_number:\t33\n", + "\tpresident_included:\tFalse\n", + "\tsource:\t../../how_to/state_of_the_union.txt\n", + "--------------------------------------------------\n", + "\n" + ] + } + ], + "source": [ + "db_tiledbD = VDMS.from_documents(\n", + " docs,\n", + " client=vdms_client,\n", + " ids=ids,\n", + " collection_name=\"my_collection_tiledbD_L2\",\n", + " embedding=embedding,\n", + " engine=\"TileDBDense\",\n", + " distance_strategy=\"L2\",\n", + ")\n", + "\n", + "k = 3\n", + "query = \"What did the president say about Ketanji Brown Jackson\"\n", + "docs_with_score = db_tiledbD.similarity_search_with_score(query, k=k, filter=None)\n", "print_results(docs_with_score)" ] }, @@ -622,12 +734,12 @@ "\n", "While building toward a real application, you want to go beyond adding data, and also update and delete data.\n", "\n", - "Here is a basic example showing how to do so. First, we will update the metadata for the document most relevant to the query." + "Here is a basic example showing how to do so. First, we will update the metadata for the document most relevant to the query by adding a date. 
" ] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 11, "id": "81a02810", "metadata": {}, "outputs": [ @@ -638,7 +750,7 @@ "Original metadata: \n", "\t{'id': '32', 'page_number': 32, 'president_included': True, 'source': '../../how_to/state_of_the_union.txt'}\n", "new metadata: \n", - "\t{'id': '32', 'page_number': 32, 'president_included': True, 'source': '../../how_to/state_of_the_union.txt', 'new_value': 'hello world'}\n", + "\t{'id': '32', 'page_number': 32, 'president_included': True, 'source': '../../how_to/state_of_the_union.txt', 'last_date_read': {'_date': '2024-05-01T14:30:00'}}\n", "--------------------------------------------------\n", "\n", "UPDATED ENTRY (id=32):\n", @@ -655,8 +767,8 @@ "id:\n", "\t32\n", "\n", - "new_value:\n", - "\thello world\n", + "last_date_read:\n", + "\t2024-05-01T14:30:00+00:00\n", "\n", "page_number:\n", "\t32\n", @@ -672,19 +784,26 @@ } ], "source": [ - "doc = db.similarity_search(query)[0]\n", + "from datetime import datetime\n", + "\n", + "doc = db_FaissFlat.similarity_search(query)[0]\n", "print(f\"Original metadata: \\n\\t{doc.metadata}\")\n", "\n", - "# update the metadata for a document\n", - "doc.metadata[\"new_value\"] = \"hello world\"\n", + "# Update the metadata for a document by adding last datetime document read\n", + "datetime_str = datetime(2024, 5, 1, 14, 30, 0).isoformat()\n", + "doc.metadata[\"last_date_read\"] = {\"_date\": datetime_str}\n", "print(f\"new metadata: \\n\\t{doc.metadata}\")\n", "print(f\"{DELIMITER}\\n\")\n", "\n", "# Update document in VDMS\n", "id_to_update = doc.metadata[\"id\"]\n", - "db.update_document(collection_name, id_to_update, doc)\n", - "response, response_array = db.get(\n", - " collection_name, constraints={\"id\": [\"==\", id_to_update]}\n", + "db_FaissFlat.update_document(collection_name, id_to_update, doc)\n", + "response, response_array = db_FaissFlat.get(\n", + " collection_name,\n", + " constraints={\n", + " \"id\": [\"==\", id_to_update],\n", + " \"last_date_read\": [\">=\", {\"_date\": \"2024-05-01T00:00:00\"}],\n", + " },\n", ")\n", "\n", "# Display Results\n", @@ -702,7 +821,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 12, "id": "95537fe8", "metadata": {}, "outputs": [ @@ -716,11 +835,13 @@ } ], "source": [ - "print(\"Documents before deletion: \", db.count(collection_name))\n", + "print(\"Documents before deletion: \", db_FaissFlat.count(collection_name))\n", "\n", "id_to_remove = ids[-1]\n", - "db.delete(collection_name=collection_name, ids=[id_to_remove])\n", - "print(f\"Documents after deletion (id={id_to_remove}): {db.count(collection_name)}\")" + "db_FaissFlat.delete(collection_name=collection_name, ids=[id_to_remove])\n", + "print(\n", + " f\"Documents after deletion (id={id_to_remove}): {db_FaissFlat.count(collection_name)}\"\n", + ")" ] }, { @@ -739,7 +860,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 13, "id": "1db4d6ed", "metadata": {}, "outputs": [ @@ -758,7 +879,7 @@ "\n", "Metadata:\n", "\tid:\t32\n", - "\tnew_value:\thello world\n", + "\tlast_date_read:\t2024-05-01T14:30:00+00:00\n", "\tpage_number:\t32\n", "\tpresident_included:\tTrue\n", "\tsource:\t../../how_to/state_of_the_union.txt\n" @@ -767,7 +888,7 @@ ], "source": [ "embedding_vector = embedding.embed_query(query)\n", - "returned_docs = db.similarity_search_by_vector(embedding_vector)\n", + "returned_docs = db_FaissFlat.similarity_search_by_vector(embedding_vector)\n", "\n", "# Print Results\n", 
"print_document_details(returned_docs[0])" @@ -787,7 +908,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 14, "id": "2bc0313b", "metadata": {}, "outputs": [ @@ -795,7 +916,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Returned entry:\n", + "Deleted entry:\n", "\n", "blob:\n", "\tTrue\n", @@ -838,18 +959,18 @@ } ], "source": [ - "response, response_array = db.get(\n", + "response, response_array = db_FaissFlat.get(\n", " collection_name,\n", " limit=1,\n", " include=[\"metadata\", \"embeddings\"],\n", " constraints={\"id\": [\"==\", \"2\"]},\n", ")\n", "\n", - "print(\"Returned entry:\")\n", - "print_response([response[0][\"FindDescriptor\"][\"entities\"][0]])\n", - "\n", "# Delete id=2\n", - "db.delete(collection_name=collection_name, ids=[\"2\"])" + "db_FaissFlat.delete(collection_name=collection_name, ids=[\"2\"])\n", + "\n", + "print(\"Deleted entry:\")\n", + "print_response([response[0][\"FindDescriptor\"][\"entities\"][0]])" ] }, { @@ -869,7 +990,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 15, "id": "120f55eb", "metadata": {}, "outputs": [ @@ -888,7 +1009,7 @@ "\n", "Metadata:\n", "\tid:\t32\n", - "\tnew_value:\thello world\n", + "\tlast_date_read:\t2024-05-01T14:30:00+00:00\n", "\tpage_number:\t32\n", "\tpresident_included:\tTrue\n", "\tsource:\t../../how_to/state_of_the_union.txt\n" @@ -896,7 +1017,7 @@ } ], "source": [ - "retriever = db.as_retriever()\n", + "retriever = db_FaissFlat.as_retriever()\n", "relevant_docs = retriever.invoke(query)[0]\n", "\n", "print_document_details(relevant_docs)" @@ -914,7 +1035,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 16, "id": "f00be6d0", "metadata": {}, "outputs": [ @@ -933,7 +1054,7 @@ "\n", "Metadata:\n", "\tid:\t32\n", - "\tnew_value:\thello world\n", + "\tlast_date_read:\t2024-05-01T14:30:00+00:00\n", "\tpage_number:\t32\n", "\tpresident_included:\tTrue\n", "\tsource:\t../../how_to/state_of_the_union.txt\n" @@ -941,7 +1062,7 @@ } ], "source": [ - "retriever = db.as_retriever(search_type=\"mmr\")\n", + "retriever = db_FaissFlat.as_retriever(search_type=\"mmr\")\n", "relevant_docs = retriever.invoke(query)[0]\n", "\n", "print_document_details(relevant_docs)" @@ -957,7 +1078,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 17, "id": "ab911470", "metadata": {}, "outputs": [ @@ -967,7 +1088,7 @@ "text": [ "--------------------------------------------------\n", "\n", - "Score:\t1.2032092809677124\n", + "Score:\t1.2032091618\n", "\n", "Content:\n", "\tTonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n", @@ -980,13 +1101,13 @@ "\n", "Metadata:\n", "\tid:\t32\n", - "\tnew_value:\thello world\n", + "\tlast_date_read:\t2024-05-01T14:30:00+00:00\n", "\tpage_number:\t32\n", "\tpresident_included:\tTrue\n", "\tsource:\t../../how_to/state_of_the_union.txt\n", "--------------------------------------------------\n", "\n", - "Score:\t1.507053256034851\n", + "Score:\t1.50705266\n", "\n", "Content:\n", "\tBut cancer from prolonged exposure to burn pits ravaged Heath’s lungs and body. 
\n", @@ -1022,7 +1143,7 @@ } ], "source": [ - "mmr_resp = db.max_marginal_relevance_search_with_score(query, k=2, fetch_k=10)\n", + "mmr_resp = db_FaissFlat.max_marginal_relevance_search_with_score(query, k=2, fetch_k=10)\n", "print_results(mmr_resp)" ] }, @@ -1037,7 +1158,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 18, "id": "874e7af9", "metadata": {}, "outputs": [ @@ -1051,11 +1172,11 @@ } ], "source": [ - "print(\"Documents before deletion: \", db.count(collection_name))\n", + "print(\"Documents before deletion: \", db_FaissFlat.count(collection_name))\n", "\n", - "db.delete(collection_name=collection_name)\n", + "db_FaissFlat.delete(collection_name=collection_name)\n", "\n", - "print(\"Documents after deletion: \", db.count(collection_name))" + "print(\"Documents after deletion: \", db_FaissFlat.count(collection_name))" ] }, { @@ -1068,7 +1189,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 19, "id": "08931796", "metadata": {}, "outputs": [ @@ -1097,7 +1218,7 @@ { "cell_type": "code", "execution_count": null, - "id": "0386ea81", + "id": "a60725a6", "metadata": {}, "outputs": [], "source": [] @@ -1119,7 +1240,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.1" + "version": "3.11.9" } }, "nbformat": 4, diff --git a/docs/static/img/tool_call.png b/docs/static/img/tool_call.png new file mode 100644 index 0000000000000..77384d64fffe3 Binary files /dev/null and b/docs/static/img/tool_call.png differ diff --git a/docs/static/img/tool_calling_flow.png b/docs/static/img/tool_calling_flow.png new file mode 100644 index 0000000000000..6e8bf2d21c638 Binary files /dev/null and b/docs/static/img/tool_calling_flow.png differ diff --git a/docs/static/img/tool_invocation.png b/docs/static/img/tool_invocation.png new file mode 100644 index 0000000000000..d1e395fee1a76 Binary files /dev/null and b/docs/static/img/tool_invocation.png differ diff --git a/docs/static/img/tool_results.png b/docs/static/img/tool_results.png new file mode 100644 index 0000000000000..5d1dc222a9542 Binary files /dev/null and b/docs/static/img/tool_results.png differ diff --git a/libs/community/extended_testing_deps.txt b/libs/community/extended_testing_deps.txt index b3da4b7eeb51c..8e22d52cf6137 100644 --- a/libs/community/extended_testing_deps.txt +++ b/libs/community/extended_testing_deps.txt @@ -86,7 +86,7 @@ tree-sitter>=0.20.2,<0.21 tree-sitter-languages>=1.8.0,<2 upstash-redis>=1.1.0,<2 upstash-ratelimit>=1.1.0,<2 -vdms==0.0.20 +vdms>=0.0.20 xata>=1.0.0a7,<2 xmltodict>=0.13.0,<0.14 nanopq==0.2.1 diff --git a/libs/community/langchain_community/chat_models/__init__.py b/libs/community/langchain_community/chat_models/__init__.py index af25b60184d78..3d0d47878f419 100644 --- a/libs/community/langchain_community/chat_models/__init__.py +++ b/libs/community/langchain_community/chat_models/__init__.py @@ -51,6 +51,7 @@ from langchain_community.chat_models.deepinfra import ( ChatDeepInfra, ) + from langchain_community.chat_models.edenai import ChatEdenAI from langchain_community.chat_models.ernie import ( ErnieBotChat, ) @@ -182,6 +183,7 @@ "ChatOctoAI", "ChatDatabricks", "ChatDeepInfra", + "ChatEdenAI", "ChatEverlyAI", "ChatFireworks", "ChatFriendli", @@ -237,6 +239,7 @@ "ChatDatabricks": "langchain_community.chat_models.databricks", "ChatDeepInfra": "langchain_community.chat_models.deepinfra", "ChatEverlyAI": "langchain_community.chat_models.everlyai", + "ChatEdenAI": "langchain_community.chat_models.edenai", 
"ChatFireworks": "langchain_community.chat_models.fireworks", "ChatFriendli": "langchain_community.chat_models.friendli", "ChatGooglePalm": "langchain_community.chat_models.google_palm", diff --git a/libs/community/langchain_community/chat_models/edenai.py b/libs/community/langchain_community/chat_models/edenai.py index 3cf1f16eeaf99..384e80d72d0c2 100644 --- a/libs/community/langchain_community/chat_models/edenai.py +++ b/libs/community/langchain_community/chat_models/edenai.py @@ -122,8 +122,8 @@ def _format_edenai_messages(messages: List[BaseMessage]) -> Dict[str, Any]: system = None formatted_messages = [] - human_messages = filter(lambda msg: isinstance(msg, HumanMessage), messages) - last_human_message = list(human_messages)[-1] if human_messages else "" + human_messages = list(filter(lambda msg: isinstance(msg, HumanMessage), messages)) + last_human_message = human_messages[-1] if human_messages else "" tool_results, other_messages = _extract_edenai_tool_results_from_messages(messages) for i, message in enumerate(other_messages): diff --git a/libs/community/langchain_community/document_loaders/__init__.py b/libs/community/langchain_community/document_loaders/__init__.py index e03f769312786..eb059d6fbeceb 100644 --- a/libs/community/langchain_community/document_loaders/__init__.py +++ b/libs/community/langchain_community/document_loaders/__init__.py @@ -411,6 +411,9 @@ from langchain_community.document_loaders.scrapfly import ( ScrapflyLoader, ) + from langchain_community.document_loaders.scrapingant import ( + ScrapingAntLoader, + ) from langchain_community.document_loaders.sharepoint import ( SharePointLoader, ) @@ -666,6 +669,7 @@ "S3DirectoryLoader": "langchain_community.document_loaders.s3_directory", "S3FileLoader": "langchain_community.document_loaders.s3_file", "ScrapflyLoader": "langchain_community.document_loaders.scrapfly", + "ScrapingAntLoader": "langchain_community.document_loaders.scrapingant", "SQLDatabaseLoader": "langchain_community.document_loaders.sql_database", "SRTLoader": "langchain_community.document_loaders.srt", "SeleniumURLLoader": "langchain_community.document_loaders.url_selenium", @@ -870,6 +874,7 @@ def __getattr__(name: str) -> Any: "S3DirectoryLoader", "S3FileLoader", "ScrapflyLoader", + "ScrapingAntLoader", "SQLDatabaseLoader", "SRTLoader", "SeleniumURLLoader", diff --git a/libs/community/langchain_community/document_loaders/scrapingant.py b/libs/community/langchain_community/document_loaders/scrapingant.py new file mode 100644 index 0000000000000..43b3bfd417271 --- /dev/null +++ b/libs/community/langchain_community/document_loaders/scrapingant.py @@ -0,0 +1,66 @@ +"""ScrapingAnt Web Extractor.""" + +import logging +from typing import Iterator, List, Optional + +from langchain_core.document_loaders import BaseLoader +from langchain_core.documents import Document +from langchain_core.utils import get_from_env + +logger = logging.getLogger(__file__) + + +class ScrapingAntLoader(BaseLoader): + """Turn an url to LLM accessible markdown with `ScrapingAnt`. + + For further details, visit: https://docs.scrapingant.com/python-client + """ + + def __init__( + self, + urls: List[str], + *, + api_key: Optional[str] = None, + scrape_config: Optional[dict] = None, + continue_on_failure: bool = True, + ) -> None: + """Initialize client. + + Args: + urls: List of urls to scrape. + api_key: The ScrapingAnt API key. If not specified must have env var + SCRAPINGANT_API_KEY set. 
+ scrape_config: The scraping config from ScrapingAntClient.markdown_request + continue_on_failure: Whether to continue if scraping an url fails. + """ + try: + from scrapingant_client import ScrapingAntClient + except ImportError: + raise ImportError( + "`scrapingant-client` package not found," + " run `pip install scrapingant-client`" + ) + if not urls: + raise ValueError("URLs must be provided.") + api_key = api_key or get_from_env("api_key", "SCRAPINGANT_API_KEY") + self.client = ScrapingAntClient(token=api_key) + self.urls = urls + self.scrape_config = scrape_config + self.continue_on_failure = continue_on_failure + + def lazy_load(self) -> Iterator[Document]: + """Fetch data from ScrapingAnt.""" + + scrape_config = self.scrape_config if self.scrape_config is not None else {} + for url in self.urls: + try: + result = self.client.markdown_request(url=url, **scrape_config) + yield Document( + page_content=result.markdown, + metadata={"url": result.url}, + ) + except Exception as e: + if self.continue_on_failure: + logger.error(f"Error fetching data from {url}, exception: {e}") + else: + raise e diff --git a/libs/community/langchain_community/embeddings/baichuan.py b/libs/community/langchain_community/embeddings/baichuan.py index 21175fb901521..abd23e8b028f2 100644 --- a/libs/community/langchain_community/embeddings/baichuan.py +++ b/libs/community/langchain_community/embeddings/baichuan.py @@ -25,19 +25,34 @@ class BaichuanTextEmbeddings(BaseModel, Embeddings): """Baichuan Text Embedding models. - To use, you should set the environment variable ``BAICHUAN_API_KEY`` to - your API key or pass it as a named parameter to the constructor. + Setup: + To use, you should set the environment variable ``BAICHUAN_API_KEY`` to + your API key or pass it as a named parameter to the constructor. - Example: + .. code-block:: bash + + export BAICHUAN_API_KEY="your-api-key" + + Instantiate: .. code-block:: python from langchain_community.embeddings import BaichuanTextEmbeddings - baichuan = BaichuanTextEmbeddings(baichuan_api_key="my-api-key") - """ + embeddings = BaichuanTextEmbeddings() + + Embed: + .. code-block:: python + + # embed the documents + vectors = embeddings.embed_documents([text1, text2, ...]) + + # embed the query + vectors = embeddings.embed_query(text) + """ # noqa: E501 session: Any #: :meta private: model_name: str = Field(default="Baichuan-Text-Embedding", alias="model") + """The model used to embed the documents.""" baichuan_api_key: Optional[SecretStr] = Field(default=None, alias="api_key") """Automatically inferred from env var `BAICHUAN_API_KEY` if not provided.""" chunk_size: int = 16 diff --git a/libs/community/langchain_community/llms/ollama.py b/libs/community/langchain_community/llms/ollama.py index 01b1ce37e1894..73bf0d8fba644 100644 --- a/libs/community/langchain_community/llms/ollama.py +++ b/libs/community/langchain_community/llms/ollama.py @@ -1,5 +1,18 @@ +from __future__ import annotations + import json -from typing import Any, AsyncIterator, Dict, Iterator, List, Mapping, Optional, Union +from typing import ( + Any, + AsyncIterator, + Callable, + Dict, + Iterator, + List, + Mapping, + Optional, + Tuple, + Union, +) import aiohttp import requests @@ -132,6 +145,10 @@ class _OllamaCommon(BaseLanguageModel): tokens for authentication. """ + auth: Union[Callable, Tuple, None] = None + """Additional auth tuple or callable to enable Basic/Digest/Custom HTTP Auth. 
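+    For example, a tuple such as ("username", "password") is treated by requests as HTTP Basic Auth.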
+ Expects the same format, type and values as requests.request auth parameter.""" + @property def _default_params(self) -> Dict[str, Any]: """Get the default parameters for calling Ollama.""" @@ -237,6 +254,7 @@ def _create_stream( "Content-Type": "application/json", **(self.headers if isinstance(self.headers, dict) else {}), }, + auth=self.auth, json=request_payload, stream=True, timeout=self.timeout, @@ -300,6 +318,7 @@ async def _acreate_stream( "Content-Type": "application/json", **(self.headers if isinstance(self.headers, dict) else {}), }, + auth=self.auth, json=request_payload, timeout=self.timeout, ) as response: diff --git a/libs/community/langchain_community/tools/edenai/audio_speech_to_text.py b/libs/community/langchain_community/tools/edenai/audio_speech_to_text.py index ba2978399ad53..dc772ece593ec 100644 --- a/libs/community/langchain_community/tools/edenai/audio_speech_to_text.py +++ b/libs/community/langchain_community/tools/edenai/audio_speech_to_text.py @@ -3,17 +3,21 @@ import json import logging import time -from typing import List, Optional +from typing import List, Optional, Type import requests from langchain_core.callbacks import CallbackManagerForToolRun -from langchain_core.pydantic_v1 import validator +from langchain_core.pydantic_v1 import BaseModel, Field, HttpUrl, validator from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class SpeechToTextInput(BaseModel): + query: HttpUrl = Field(description="url of the audio to analyze") + + class EdenAiSpeechToTextTool(EdenaiTool): """Tool that queries the Eden AI Speech To Text API. @@ -23,7 +27,6 @@ class EdenAiSpeechToTextTool(EdenaiTool): To use, you should have the environment variable ``EDENAI_API_KEY`` set with your API token. You can find your token here: https://app.edenai.run/admin/account/settings - """ edenai_api_key: Optional[str] = None @@ -34,6 +37,7 @@ class EdenAiSpeechToTextTool(EdenaiTool): "Useful for when you have to convert audio to text." "Input should be a url to an audio file." ) + args_schema: Type[BaseModel] = SpeechToTextInput is_async: bool = True language: Optional[str] = "en" diff --git a/libs/community/langchain_community/tools/edenai/audio_text_to_speech.py b/libs/community/langchain_community/tools/edenai/audio_text_to_speech.py index 421d06f5b00b4..e78cf35419b1e 100644 --- a/libs/community/langchain_community/tools/edenai/audio_text_to_speech.py +++ b/libs/community/langchain_community/tools/edenai/audio_text_to_speech.py @@ -1,17 +1,21 @@ from __future__ import annotations import logging -from typing import Dict, List, Literal, Optional +from typing import Dict, List, Literal, Optional, Type import requests from langchain_core.callbacks import CallbackManagerForToolRun -from langchain_core.pydantic_v1 import Field, root_validator, validator +from langchain_core.pydantic_v1 import BaseModel, Field, root_validator, validator from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class TextToSpeechInput(BaseModel): + query: str = Field(description="text to generate audio from") + + class EdenAiTextToSpeechTool(EdenaiTool): """Tool that queries the Eden AI Text to speech API. 
for api reference check edenai documentation: @@ -30,6 +34,7 @@ class EdenAiTextToSpeechTool(EdenaiTool): """the output is a string representing the URL of the audio file, or the path to the downloaded wav file """ ) + args_schema: Type[BaseModel] = TextToSpeechInput language: Optional[str] = "en" """ diff --git a/libs/community/langchain_community/tools/edenai/image_explicitcontent.py b/libs/community/langchain_community/tools/edenai/image_explicitcontent.py index 3dac6622aa149..8ca1d7739cb7f 100644 --- a/libs/community/langchain_community/tools/edenai/image_explicitcontent.py +++ b/libs/community/langchain_community/tools/edenai/image_explicitcontent.py @@ -1,15 +1,20 @@ from __future__ import annotations import logging -from typing import Optional +from typing import Optional, Type from langchain_core.callbacks import CallbackManagerForToolRun +from langchain_core.pydantic_v1 import BaseModel, Field, HttpUrl from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class ExplicitImageInput(BaseModel): + query: HttpUrl = Field(description="url of the image to analyze") + + class EdenAiExplicitImageTool(EdenaiTool): """Tool that queries the Eden AI Explicit image detection. @@ -33,6 +38,7 @@ class EdenAiExplicitImageTool(EdenaiTool): pornography, violence, gore content, etc.""" "Input should be the string url of the image ." ) + args_schema: Type[BaseModel] = ExplicitImageInput combine_available: bool = True feature: str = "image" diff --git a/libs/community/langchain_community/tools/edenai/image_objectdetection.py b/libs/community/langchain_community/tools/edenai/image_objectdetection.py index 03b9fc36e58a3..1098e8e37f6da 100644 --- a/libs/community/langchain_community/tools/edenai/image_objectdetection.py +++ b/libs/community/langchain_community/tools/edenai/image_objectdetection.py @@ -1,15 +1,20 @@ from __future__ import annotations import logging -from typing import Optional +from typing import Optional, Type from langchain_core.callbacks import CallbackManagerForToolRun +from langchain_core.pydantic_v1 import BaseModel, Field, HttpUrl from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class ObjectDetectionInput(BaseModel): + query: HttpUrl = Field(description="url of the image to analyze") + + class EdenAiObjectDetectionTool(EdenaiTool): """Tool that queries the Eden AI Object detection API. @@ -30,6 +35,7 @@ class EdenAiObjectDetectionTool(EdenaiTool): (with bounding boxes) objects in an image """ "Input should be the string url of the image to identify." 
) + args_schema: Type[BaseModel] = ObjectDetectionInput show_positions: bool = False diff --git a/libs/community/langchain_community/tools/edenai/ocr_identityparser.py b/libs/community/langchain_community/tools/edenai/ocr_identityparser.py index 75352312e5875..2e208dbb54340 100644 --- a/libs/community/langchain_community/tools/edenai/ocr_identityparser.py +++ b/libs/community/langchain_community/tools/edenai/ocr_identityparser.py @@ -1,15 +1,20 @@ from __future__ import annotations import logging -from typing import Optional +from typing import Optional, Type from langchain_core.callbacks import CallbackManagerForToolRun +from langchain_core.pydantic_v1 import BaseModel, Field, HttpUrl from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class IDParsingInput(BaseModel): + query: HttpUrl = Field(description="url of the document to parse") + + class EdenAiParsingIDTool(EdenaiTool): """Tool that queries the Eden AI Identity parsing API. @@ -29,6 +34,7 @@ class EdenAiParsingIDTool(EdenaiTool): "Useful for when you have to extract information from an ID Document " "Input should be the string url of the document to parse." ) + args_schema: Type[BaseModel] = IDParsingInput feature: str = "ocr" subfeature: str = "identity_parser" diff --git a/libs/community/langchain_community/tools/edenai/ocr_invoiceparser.py b/libs/community/langchain_community/tools/edenai/ocr_invoiceparser.py index 4413beedf7bc1..75c8425154aa2 100644 --- a/libs/community/langchain_community/tools/edenai/ocr_invoiceparser.py +++ b/libs/community/langchain_community/tools/edenai/ocr_invoiceparser.py @@ -1,15 +1,20 @@ from __future__ import annotations import logging -from typing import Optional +from typing import Optional, Type from langchain_core.callbacks import CallbackManagerForToolRun +from langchain_core.pydantic_v1 import BaseModel, Field, HttpUrl from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class InvoiceParsingInput(BaseModel): + query: HttpUrl = Field(description="url of the document to parse") + + class EdenAiParsingInvoiceTool(EdenaiTool): """Tool that queries the Eden AI Invoice parsing API. @@ -23,7 +28,6 @@ class EdenAiParsingInvoiceTool(EdenaiTool): """ name: str = "edenai_invoice_parsing" - description: str = ( "A wrapper around edenai Services invoice parsing. " """Useful for when you have to extract information from @@ -33,6 +37,7 @@ class EdenAiParsingInvoiceTool(EdenaiTool): in a structured format to automate the invoice processing """ "Input should be the string url of the document to parse." 
) + args_schema: Type[BaseModel] = InvoiceParsingInput language: Optional[str] = None """ diff --git a/libs/community/langchain_community/tools/edenai/text_moderation.py b/libs/community/langchain_community/tools/edenai/text_moderation.py index 2486287fba14c..9aed36f0b7343 100644 --- a/libs/community/langchain_community/tools/edenai/text_moderation.py +++ b/libs/community/langchain_community/tools/edenai/text_moderation.py @@ -1,15 +1,20 @@ from __future__ import annotations import logging -from typing import Optional +from typing import Optional, Type from langchain_core.callbacks import CallbackManagerForToolRun +from langchain_core.pydantic_v1 import BaseModel, Field from langchain_community.tools.edenai.edenai_base_tool import EdenaiTool logger = logging.getLogger(__name__) +class TextModerationInput(BaseModel): + query: str = Field(description="Text to moderate") + + class EdenAiTextModerationTool(EdenaiTool): """Tool that queries the Eden AI Explicit text detection. @@ -23,7 +28,6 @@ class EdenAiTextModerationTool(EdenaiTool): """ name: str = "edenai_explicit_content_detection_text" - description: str = ( "A wrapper around edenai Services explicit content detection for text. " """Useful for when you have to scan text for offensive, @@ -44,6 +48,7 @@ class EdenAiTextModerationTool(EdenaiTool): """ "Input should be a string." ) + args_schema: Type[BaseModel] = TextModerationInput language: str diff --git a/libs/community/langchain_community/utilities/requests.py b/libs/community/langchain_community/utilities/requests.py index ba8756eefb3e6..58c7b63a35cf5 100644 --- a/libs/community/langchain_community/utilities/requests.py +++ b/libs/community/langchain_community/utilities/requests.py @@ -84,7 +84,7 @@ async def _arequest( url, headers=self.headers, auth=self.auth, - verify=self.verify, + verify_ssl=self.verify, **kwargs, ) as response: yield response @@ -94,7 +94,7 @@ async def _arequest( url, headers=self.headers, auth=self.auth, - verify=self.verify, + verify_ssl=self.verify, **kwargs, ) as response: yield response diff --git a/libs/community/langchain_community/vectorstores/vdms.py b/libs/community/langchain_community/vectorstores/vdms.py index 6c3bf4183eb6c..face7b1215719 100644 --- a/libs/community/langchain_community/vectorstores/vdms.py +++ b/libs/community/langchain_community/vectorstores/vdms.py @@ -2,6 +2,7 @@ import base64 import logging +import os import uuid from copy import deepcopy from typing import ( @@ -76,6 +77,41 @@ def _len_check_if_sized(x: Any, y: Any, x_name: str, y_name: str) -> None: return +def _results_to_docs(results: Any) -> List[Document]: + return [doc for doc, _ in _results_to_docs_and_scores(results)] + + +def _results_to_docs_and_scores(results: Any) -> List[Tuple[Document, float]]: + final_res: List[Any] = [] + try: + responses, blobs = results[0] + if ( + len(responses) > 0 + and "FindDescriptor" in responses[0] + and "entities" in responses[0]["FindDescriptor"] + ): + result_entities = responses[0]["FindDescriptor"]["entities"] + # result_blobs = blobs + for ent in result_entities: + distance = round(ent["_distance"], 10) + txt_contents = ent["content"] + for p in INVALID_DOC_METADATA_KEYS: + if p in ent: + del ent[p] + props = { + mkey: mval + for mkey, mval in ent.items() + if mval not in INVALID_METADATA_VALUE + } + + final_res.append( + (Document(page_content=txt_contents, metadata=props), distance) + ) + except Exception as e: + logger.warn(f"No results returned. 
Error while parsing results: {e}") + return final_res + + def VDMS_Client(host: str = "localhost", port: int = 55555) -> vdms.vdms: """VDMS client for the VDMS server. @@ -122,7 +158,7 @@ class VDMS(VectorStore): Example: .. code-block:: python - from langchain_community.embeddings import HuggingFaceEmbeddings + from langchain_huggingface import HuggingFaceEmbeddings from langchain_community.vectorstores.vdms import VDMS, VDMS_Client vectorstore = VDMS( @@ -143,19 +179,20 @@ def __init__( distance_strategy: DISTANCE_METRICS = "L2", engine: ENGINES = "FaissFlat", relevance_score_fn: Optional[Callable[[float], float]] = None, + embedding_dimensions: Optional[int] = None, ) -> None: # Check required parameters self._client = client self.similarity_search_engine = engine self.distance_strategy = distance_strategy self.embedding = embedding - self._check_required_inputs(collection_name) + self._check_required_inputs(collection_name, embedding_dimensions) # Update other parameters self.override_relevance_score_fn = relevance_score_fn # Initialize collection - self._collection_name = self.__add_set( + self._collection_name = self.add_set( collection_name, engine=self.similarity_search_engine, metric=self.distance_strategy, @@ -173,6 +210,14 @@ def _embed_documents(self, texts: List[str]) -> List[List[float]]: p_str += " to be an Embeddings object" raise ValueError(p_str) + def _embed_video(self, paths: List[str], **kwargs: Any) -> List[List[float]]: + if self.embedding is not None and hasattr(self.embedding, "embed_video"): + return self.embedding.embed_video(paths=paths, **kwargs) + else: + raise ValueError( + "Must provide `embedding` which has attribute `embed_video`" + ) + def _embed_image(self, uris: List[str]) -> List[List[float]]: if self.embedding is not None and hasattr(self.embedding, "embed_image"): return self.embedding.embed_image(uris=uris) @@ -225,10 +270,10 @@ def _similarity_search_with_relevance_scores( if self.override_relevance_score_fn is None: kwargs["normalize_distance"] = True docs_and_scores = self.similarity_search_with_score( - query, - k, - fetch_k, - filter, + query=query, + k=k, + fetch_k=fetch_k, + filter=filter, **kwargs, ) @@ -242,7 +287,7 @@ def _similarity_search_with_relevance_scores( ) return docs_and_rel_scores - def __add( + def add( self, collection_name: str, texts: List[str], @@ -275,7 +320,7 @@ def __add( return inserted_ids - def __add_set( + def add_set( self, collection_name: str, engine: ENGINES = "FaissFlat", @@ -333,6 +378,12 @@ def __delete( all_queries.append(query) response, response_array = self.__run_vdms_query(all_queries, all_blobs) + + # Update/store indices after deletion + query = _add_descriptorset( + "FindDescriptorSet", collection_name, storeIndex=True + ) + responseSet, _ = self.__run_vdms_query([query], all_blobs) return "FindDescriptor" in response[0] def __get_add_query( @@ -365,7 +416,7 @@ def __get_add_query( if metadata: props.update(metadata) - if document: + if document not in [None, ""]: props["content"] = document for k in props.keys(): @@ -515,7 +566,7 @@ def add_images( Args: uris: List of paths to the images to add to the vectorstore. - metadatas: Optional list of metadatas associated with the texts. + metadatas: Optional list of metadatas associated with the images. ids: Optional list of unique IDs. batch_size (int): Number of concurrent requests to send to the server. 
add_path: Bool to add image path as metadata @@ -545,7 +596,7 @@ def add_images( else: metadatas = [_validate_vdms_properties(m) for m in metadatas] - self.__from( + self.add_from( texts=b64_texts, embeddings=embeddings, ids=ids, @@ -555,6 +606,62 @@ def add_images( ) return ids + def add_videos( + self, + paths: List[str], + texts: Optional[List[str]] = None, + metadatas: Optional[List[dict]] = None, + ids: Optional[List[str]] = None, + batch_size: int = 1, + add_path: Optional[bool] = True, + **kwargs: Any, + ) -> List[str]: + """Run videos through the embeddings and add to the vectorstore. + + Videos are added as embeddings (AddDescriptor) instead of separate + entity (AddVideo) within VDMS to leverage similarity search capability + + Args: + paths: List of paths to the videos to add to the vectorstore. + metadatas: Optional list of text associated with the videos. + metadatas: Optional list of metadatas associated with the videos. + ids: Optional list of unique IDs. + batch_size (int): Number of concurrent requests to send to the server. + add_path: Bool to add video path as metadata + + Returns: + List of ids from adding videos into the vectorstore. + """ + if texts is None: + texts = ["" for _ in paths] + + if add_path and metadatas: + for midx, path in enumerate(paths): + metadatas[midx]["video_path"] = path + elif add_path: + metadatas = [] + for path in paths: + metadatas.append({"video_path": path}) + + # Populate IDs + ids = ids if ids is not None else [str(uuid.uuid4()) for _ in paths] + + # Set embeddings + embeddings = self._embed_video(paths=paths, **kwargs) + + if metadatas is None: + metadatas = [{} for _ in paths] + + self.add_from( + texts=texts, + embeddings=embeddings, + ids=ids, + metadatas=metadatas, + batch_size=batch_size, + **kwargs, + ) + return ids + def add_texts( self, texts: Iterable[str], @@ -586,7 +693,7 @@ def add_texts( else: metadatas = [_validate_vdms_properties(m) for m in metadatas] - inserted_ids = self.__from( + inserted_ids = self.add_from( texts=texts, embeddings=embeddings, ids=ids, @@ -596,7 +703,7 @@ def add_texts( ) return inserted_ids - def __from( + def add_from( self, texts: List[str], embeddings: List[List[float]], @@ -617,7 +724,7 @@ def __from( if metadatas: batch_metadatas = metadatas[start_idx:end_idx] - result = self.__add( + result = self.add( self._collection_name, embeddings=batch_embedding_vectors, texts=batch_texts, @@ -633,7 +740,9 @@ def __from( ) return inserted_ids - def _check_required_inputs(self, collection_name: str) -> None: + def _check_required_inputs( + self, collection_name: str, embedding_dimensions: Union[int, None] + ) -> None: # Check connection to client if not self._client.is_connected(): raise ValueError( @@ -656,7 +765,29 @@ def _check_required_inputs(self, collection_name: str) -> None: if self.embedding is None: raise ValueError("Must provide embedding function") - self.embedding_dimension = len(self._embed_query("This is a sample sentence.")) + if embedding_dimensions is not None: + self.embedding_dimension = embedding_dimensions + elif self.embedding is not None and hasattr(self.embedding, "embed_query"): + self.embedding_dimension = len( + self._embed_query("This is a sample sentence.") + ) + elif self.embedding is not None and ( + hasattr(self.embedding, "embed_image") + or hasattr(self.embedding, "embed_video") + ): + if hasattr(self.embedding, "model"): + try: + self.embedding_dimension = ( + self.embedding.model.token_embedding.embedding_dim + ) + except ValueError: + raise ValueError( + "Embedding 
dimension needed. Please define embedding_dimensions" + ) + else: + raise ValueError( + "Embedding dimension needed. Please define embedding_dimensions" + ) # Check for properties current_props = self.__get_properties(collection_name) @@ -727,7 +858,7 @@ def get_k_candidates( ) response, response_array = self.__run_vdms_query([query], all_blobs) - if normalize: + if normalize and command_str in response[0]: max_dist = response[0][command_str]["entities"][-1]["_distance"] return response, response_array, max_dist @@ -769,14 +900,21 @@ def get_descriptor_response( results=results, ) response, response_array = self.__run_vdms_query([query]) - ids_of_interest = [ - ent["id"] for ent in response[0][command_str]["entities"] - ] + if command_str in response[0] and response[0][command_str]["returned"] > 0: + ids_of_interest = [ + ent["id"] for ent in response[0][command_str]["entities"] + ] + else: + return [], [] # (2) Find top fetch_k results response, response_array, max_dist = self.get_k_candidates( setname, fetch_k, results, all_blobs, normalize=normalize_distance ) + if command_str not in response[0] or ( + command_str in response[0] and response[0][command_str]["returned"] == 0 + ): + return [], [] # (3) Intersection of (1) & (2) using ids new_entities: List[Dict] = [] @@ -792,7 +930,7 @@ def get_descriptor_response( print(p_str) # noqa: T201 if normalize_distance: - max_dist = 1.0 if max_dist == 0 else max_dist + max_dist = 1.0 if max_dist in [0, np.inf] else max_dist for ent_idx, ent in enumerate(response[0][command_str]["entities"]): ent["_distance"] = ent["_distance"] / max_dist response[0][command_str]["entities"][ent_idx]["_distance"] = ent[ @@ -946,7 +1084,7 @@ def max_marginal_relevance_search( among selected documents. Args: - query: Text to look up documents similar to. + query (str): Query to look up. Text or path for image or video. k: Number of Documents to return. Defaults to 4. fetch_k: Number of Documents to fetch to pass to MMR algorithm. lambda_mult: Number between 0 and 1 that determines the degree @@ -963,7 +1101,20 @@ def max_marginal_relevance_search( "For MMR search, you must specify an embedding function on" "creation." ) - embedding_vector: List[float] = self._embed_query(query) + # embedding_vector: List[float] = self._embed_query(query) + embedding_vector: List[float] + if not os.path.isfile(query) and hasattr(self.embedding, "embed_query"): + embedding_vector = self._embed_query(query) + elif os.path.isfile(query) and hasattr(self.embedding, "embed_image"): + embedding_vector = self._embed_image(uris=[query])[0] + elif os.path.isfile(query) and hasattr(self.embedding, "embed_video"): + embedding_vector = self._embed_video(paths=[query])[0] + else: + error_msg = f"Could not generate embedding for query '{query}'." + error_msg += "If using path for image or video, verify embedding model " + error_msg += "has callable functions 'embed_image' or 'embed_video'." 
+ raise ValueError(error_msg) + docs = self.max_marginal_relevance_search_by_vector( embedding_vector, k, @@ -1006,19 +1157,27 @@ def max_marginal_relevance_search_by_vector( include=["metadatas", "documents", "distances", "embeddings"], ) - embedding_list = [list(_bytes2embedding(result)) for result in results[0][1]] + if len(results[0][1]) == 0: + # No results returned + return [] + else: + embedding_list = [ + list(_bytes2embedding(result)) for result in results[0][1] + ] - mmr_selected = maximal_marginal_relevance( - np.array(embedding, dtype=np.float32), - embedding_list, - k=k, - lambda_mult=lambda_mult, - ) + mmr_selected = maximal_marginal_relevance( + np.array(embedding, dtype=np.float32), + embedding_list, + k=k, + lambda_mult=lambda_mult, + ) - candidates = _results_to_docs(results) + candidates = _results_to_docs(results) - selected_results = [r for i, r in enumerate(candidates) if i in mmr_selected] - return selected_results + selected_results = [ + r for i, r in enumerate(candidates) if i in mmr_selected + ] + return selected_results def max_marginal_relevance_search_with_score( self, @@ -1034,7 +1193,7 @@ def max_marginal_relevance_search_with_score( among selected documents. Args: - query: Text to look up documents similar to. + query (str): Query to look up. Text or path for image or video. k: Number of Documents to return. Defaults to 4. fetch_k: Number of Documents to fetch to pass to MMR algorithm. lambda_mult: Number between 0 and 1 that determines the degree @@ -1051,7 +1210,18 @@ def max_marginal_relevance_search_with_score( "For MMR search, you must specify an embedding function on" "creation." ) - embedding = self._embed_query(query) + if not os.path.isfile(query) and hasattr(self.embedding, "embed_query"): + embedding = self._embed_query(query) + elif os.path.isfile(query) and hasattr(self.embedding, "embed_image"): + embedding = self._embed_image(uris=[query])[0] + elif os.path.isfile(query) and hasattr(self.embedding, "embed_video"): + embedding = self._embed_video(paths=[query])[0] + else: + error_msg = f"Could not generate embedding for query '{query}'." + error_msg += "If using path for image or video, verify embedding model " + error_msg += "has callable functions 'embed_image' or 'embed_video'." + raise ValueError(error_msg) + docs = self.max_marginal_relevance_search_with_score_by_vector( embedding, k, @@ -1094,21 +1264,27 @@ def max_marginal_relevance_search_with_score_by_vector( include=["metadatas", "documents", "distances", "embeddings"], ) - embedding_list = [list(_bytes2embedding(result)) for result in results[0][1]] + if len(results[0][1]) == 0: + # No results returned + return [] + else: + embedding_list = [ + list(_bytes2embedding(result)) for result in results[0][1] + ] - mmr_selected = maximal_marginal_relevance( - np.array(embedding, dtype=np.float32), - embedding_list, - k=k, - lambda_mult=lambda_mult, - ) + mmr_selected = maximal_marginal_relevance( + np.array(embedding, dtype=np.float32), + embedding_list, + k=k, + lambda_mult=lambda_mult, + ) - candidates = _results_to_docs_and_scores(results) + candidates = _results_to_docs_and_scores(results) - selected_results = [ - (r, s) for i, (r, s) in enumerate(candidates) if i in mmr_selected - ] - return selected_results + selected_results = [ + (r, s) for i, (r, s) in enumerate(candidates) if i in mmr_selected + ] + return selected_results def query_collection_embeddings( self, @@ -1162,7 +1338,7 @@ def similarity_search( """Run similarity search with VDMS. 
Args: - query (str): Query text to search for. + query (str): Query to look up. Text or path for image or video. k (int): Number of results to return. Defaults to 3. fetch_k (int): Number of candidates to fetch for knn (>= k). filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None. @@ -1171,7 +1347,7 @@ def similarity_search( List[Document]: List of documents most similar to the query text. """ docs_and_scores = self.similarity_search_with_score( - query, k, fetch_k, filter=filter, **kwargs + query, k=k, fetch_k=fetch_k, filter=filter, **kwargs ) return [doc for doc, _ in docs_and_scores] @@ -1213,7 +1389,7 @@ def similarity_search_with_score( """Run similarity search with VDMS with distance. Args: - query (str): Query text to search for. + query (str): Query to look up. Text or path for image or video. k (int): Number of results to return. Defaults to 3. fetch_k (int): Number of candidates to fetch for knn (>= k). filter (Optional[Dict[str, str]]): Filter by metadata. Defaults to None. @@ -1226,7 +1402,18 @@ def similarity_search_with_score( if self.embedding is None: raise ValueError("Must provide embedding function") else: - query_embedding: List[float] = self._embed_query(query) + if not os.path.isfile(query) and hasattr(self.embedding, "embed_query"): + query_embedding: List[float] = self._embed_query(query) + elif os.path.isfile(query) and hasattr(self.embedding, "embed_image"): + query_embedding = self._embed_image(uris=[query])[0] + elif os.path.isfile(query) and hasattr(self.embedding, "embed_video"): + query_embedding = self._embed_video(paths=[query])[0] + else: + error_msg = f"Could not generate embedding for query '{query}'." + error_msg += "If using path for image or video, verify embedding model " + error_msg += "has callable functions 'embed_image' or 'embed_video'." + raise ValueError(error_msg) + results = self.query_collection_embeddings( query_embeddings=[query_embedding], n_results=k, @@ -1256,10 +1443,10 @@ def similarity_search_with_score_by_vector( Returns: List[Tuple[Document, float]]: List of documents most similar to - the query text and cosine distance in float for each. - Lower score represents more similarity. + the query text. Lower score represents more similarity. 
""" - kwargs["normalize_distance"] = True + + # kwargs["normalize_distance"] = True results = self.query_collection_embeddings( query_embeddings=[embedding], @@ -1308,37 +1495,6 @@ def update_documents( # VDMS UTILITY -def _results_to_docs(results: Any) -> List[Document]: - return [doc for doc, _ in _results_to_docs_and_scores(results)] - - -def _results_to_docs_and_scores(results: Any) -> List[Tuple[Document, float]]: - final_res: List[Any] = [] - responses, blobs = results[0] - if ( - "FindDescriptor" in responses[0] - and "entities" in responses[0]["FindDescriptor"] - ): - result_entities = responses[0]["FindDescriptor"]["entities"] - # result_blobs = blobs - for ent in result_entities: - distance = ent["_distance"] - txt_contents = ent["content"] - for p in INVALID_DOC_METADATA_KEYS: - if p in ent: - del ent[p] - props = { - mkey: mval - for mkey, mval in ent.items() - if mval not in INVALID_METADATA_VALUE - } - - final_res.append( - (Document(page_content=txt_contents, metadata=props), distance) - ) - return final_res - - def _add_descriptor( command_str: str, setname: str, diff --git a/libs/community/poetry.lock b/libs/community/poetry.lock index 20115d54cd1e6..3ecb8378e0fa4 100644 --- a/libs/community/poetry.lock +++ b/libs/community/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand. +# This file is automatically @generated by Poetry 1.8.3 and should not be changed by hand. [[package]] name = "aiohttp" @@ -2117,7 +2117,7 @@ files = [ [[package]] name = "langchain" -version = "0.2.10" +version = "0.2.11" description = "Building applications with LLMs through composability" optional = false python-versions = ">=3.8.1,<4.0" @@ -2127,7 +2127,7 @@ develop = true [package.dependencies] aiohttp = "^3.8.3" async-timeout = {version = "^4.0.0", markers = "python_version < \"3.11\""} -langchain-core = "^0.2.22" +langchain-core = "^0.2.23" langchain-text-splitters = "^0.2.0" langsmith = "^0.1.17" numpy = [ @@ -3819,7 +3819,6 @@ files = [ {file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"}, {file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"}, {file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"}, - {file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"}, {file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6c22bec3fbe2524cde73d7ada88f6566758a8f7227bfbf93a408a9d86bcc12a0"}, {file = "PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"}, {file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"}, @@ -5358,13 +5357,13 @@ tests = ["Werkzeug (==2.0.3)", "aiohttp", "boto3", "httplib2", "httpx", "pytest" [[package]] name = "vdms" -version = "0.0.20" +version = "0.0.21" description = "VDMS Client Module" optional = false -python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*, <4" +python-versions = "!=3.0.*,!=3.1.*,!=3.2.*,<4,>=2.6" files = [ - {file = "vdms-0.0.20-py3-none-any.whl", hash = "sha256:7b81127f2981f2dabdcc5880ad7eb4bc2c7833a25aaf79a7b1a560e86bf7b5ec"}, - {file = 
"vdms-0.0.20.tar.gz", hash = "sha256:746c21a96e420b9b034495537b42d70f2326b020a1c6907677f7851a926e8605"}, + {file = "vdms-0.0.21-py3-none-any.whl", hash = "sha256:18e785cd7ec66c3a6c5921a6a93fe2ca22d97f45f40dccb9ff0c954675139daf"}, + {file = "vdms-0.0.21.tar.gz", hash = "sha256:bbb62d3f1a5cdab6b6bd41950942880cc431729313742870eb255a23c5f0381f"}, ] [package.dependencies] @@ -5759,4 +5758,4 @@ test = ["big-O", "importlib-resources", "jaraco.functools", "jaraco.itertools", [metadata] lock-version = "2.0" python-versions = ">=3.8.1,<4.0" -content-hash = "14d60e1f61fa9c0ba69cb4e227e4af3de395a8dd4a53b121fe488e7b9f75ea66" +content-hash = "324e10fe59335abccbd422d9ee8ae771714edf72078a750b99c87ba853bd617c" diff --git a/libs/community/pyproject.toml b/libs/community/pyproject.toml index 75572d99db54a..b468b9192cb04 100644 --- a/libs/community/pyproject.toml +++ b/libs/community/pyproject.toml @@ -102,7 +102,7 @@ cassio = "^0.1.6" tiktoken = ">=0.3.2,<0.6.0" anthropic = "^0.3.11" fireworks-ai = "^0.9.0" -vdms = "^0.0.20" +vdms = ">=0.0.20" exllamav2 = "^0.0.18" [tool.poetry.group.lint.dependencies] diff --git a/libs/community/tests/integration_tests/vectorstores/test_vdms.py b/libs/community/tests/integration_tests/vectorstores/test_vdms.py index bce7ff431ba34..a453a0fc20df3 100644 --- a/libs/community/tests/integration_tests/vectorstores/test_vdms.py +++ b/libs/community/tests/integration_tests/vectorstores/test_vdms.py @@ -20,6 +20,7 @@ import vdms logging.basicConfig(level=logging.DEBUG) +embedding_function = FakeEmbeddings() # The connection string matches the default settings in the docker-compose file @@ -28,6 +29,7 @@ # cd [root]/docker # docker compose up -d vdms @pytest.fixture +@pytest.mark.enable_socket def vdms_client() -> vdms.vdms: return VDMS_Client( host=os.getenv("VDMS_DBHOST", "localhost"), @@ -36,19 +38,19 @@ def vdms_client() -> vdms.vdms: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_init_from_client(vdms_client: vdms.vdms) -> None: - embedding_function = FakeEmbeddings() _ = VDMS( # type: ignore[call-arg] - embedding_function=embedding_function, + embedding=embedding_function, client=vdms_client, ) @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_from_texts_with_metadatas(vdms_client: vdms.vdms) -> None: """Test end to end construction and search.""" collection_name = "test_from_texts_with_metadatas" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_from_texts_with_metadatas_{i}" for i in range(len(texts))] metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)] @@ -67,10 +69,10 @@ def test_from_texts_with_metadatas(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_from_texts_with_metadatas_with_scores(vdms_client: vdms.vdms) -> None: """Test end to end construction and scored search.""" collection_name = "test_from_texts_with_metadatas_with_scores" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_from_texts_with_metadatas_with_scores_{i}" for i in range(len(texts))] metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)] @@ -82,19 +84,19 @@ def test_from_texts_with_metadatas_with_scores(vdms_client: vdms.vdms) -> None: collection_name=collection_name, client=vdms_client, ) - output = docsearch.similarity_search_with_score("foo", k=1) + output = docsearch.similarity_search_with_score("foo", k=1, fetch_k=1) assert output == [ (Document(page_content="foo", metadata={"page": "1", "id": ids[0]}), 0.0) ] 
@pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_from_texts_with_metadatas_with_scores_using_vector( vdms_client: vdms.vdms, ) -> None: """Test end to end construction and scored search, using embedding vector.""" collection_name = "test_from_texts_with_metadatas_with_scores_using_vector" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_from_texts_with_metadatas_{i}" for i in range(len(texts))] metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)] @@ -113,10 +115,10 @@ def test_from_texts_with_metadatas_with_scores_using_vector( @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_search_filter(vdms_client: vdms.vdms) -> None: """Test end to end construction and search with metadata filtering.""" collection_name = "test_search_filter" - embedding_function = FakeEmbeddings() texts = ["far", "bar", "baz"] ids = [f"test_search_filter_{i}" for i in range(len(texts))] metadatas = [{"first_letter": "{}".format(text[0])} for text in texts] @@ -144,10 +146,10 @@ def test_search_filter(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_search_filter_with_scores(vdms_client: vdms.vdms) -> None: """Test end to end construction and scored search with metadata filtering.""" collection_name = "test_search_filter_with_scores" - embedding_function = FakeEmbeddings() texts = ["far", "bar", "baz"] ids = [f"test_search_filter_with_scores_{i}" for i in range(len(texts))] metadatas = [{"first_letter": "{}".format(text[0])} for text in texts] @@ -185,10 +187,10 @@ def test_search_filter_with_scores(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_mmr(vdms_client: vdms.vdms) -> None: """Test end to end construction and search.""" collection_name = "test_mmr" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_mmr_{i}" for i in range(len(texts))] docsearch = VDMS.from_texts( @@ -203,10 +205,10 @@ def test_mmr(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_mmr_by_vector(vdms_client: vdms.vdms) -> None: """Test end to end construction and search.""" collection_name = "test_mmr_by_vector" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_mmr_by_vector_{i}" for i in range(len(texts))] docsearch = VDMS.from_texts( @@ -222,10 +224,10 @@ def test_mmr_by_vector(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_with_include_parameter(vdms_client: vdms.vdms) -> None: """Test end to end construction and include parameter.""" collection_name = "test_with_include_parameter" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] docsearch = VDMS.from_texts( texts=texts, @@ -233,19 +235,23 @@ def test_with_include_parameter(vdms_client: vdms.vdms) -> None: collection_name=collection_name, client=vdms_client, ) + response, response_array = docsearch.get(collection_name, include=["embeddings"]) - assert response_array != [] + for emb in embedding_function.embed_documents(texts): + assert embedding2bytes(emb) in response_array + response, response_array = docsearch.get(collection_name) assert response_array == [] @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_update_document(vdms_client: vdms.vdms) -> None: """Test the update_document function in the VDMS class.""" collection_name = "test_update_document" # Make a consistent embedding - embedding_function 
= ConsistentFakeEmbeddings() + const_embedding_function = ConsistentFakeEmbeddings() # Initial document content and id initial_content = "foo" @@ -259,10 +265,10 @@ def test_update_document(vdms_client: vdms.vdms) -> None: client=vdms_client, collection_name=collection_name, documents=[original_doc], - embedding=embedding_function, + embedding=const_embedding_function, ids=[document_id], ) - response, old_embedding = docsearch.get( + old_response, old_embedding = docsearch.get( collection_name, constraints={"id": ["==", document_id]}, include=["metadata", "embeddings"], @@ -281,17 +287,15 @@ def test_update_document(vdms_client: vdms.vdms) -> None: ) # Perform a similarity search with the updated content - output = docsearch.similarity_search(updated_content, k=1) + output = docsearch.similarity_search(updated_content, k=3)[0] # Assert that the updated document is returned by the search - assert output == [ - Document( - page_content=updated_content, metadata={"page": "1", "id": document_id} - ) - ] + assert output == Document( + page_content=updated_content, metadata={"page": "1", "id": document_id} + ) # Assert that the new embedding is correct - response, new_embedding = docsearch.get( + new_response, new_embedding = docsearch.get( collection_name, constraints={"id": ["==", document_id]}, include=["metadata", "embeddings"], @@ -299,16 +303,21 @@ def test_update_document(vdms_client: vdms.vdms) -> None: # new_embedding = response_array[0] assert new_embedding[0] == embedding2bytes( - embedding_function.embed_documents([updated_content])[0] + const_embedding_function.embed_documents([updated_content])[0] ) assert new_embedding != old_embedding + assert ( + new_response[0]["FindDescriptor"]["entities"][0]["content"] + != old_response[0]["FindDescriptor"]["entities"][0]["content"] + ) + @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_with_relevance_score(vdms_client: vdms.vdms) -> None: """Test to make sure the relevance score is scaled to 0-1.""" collection_name = "test_with_relevance_score" - embedding_function = FakeEmbeddings() texts = ["foo", "bar", "baz"] ids = [f"test_relevance_scores_{i}" for i in range(len(texts))] metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)] @@ -320,7 +329,7 @@ def test_with_relevance_score(vdms_client: vdms.vdms) -> None: collection_name=collection_name, client=vdms_client, ) - output = docsearch.similarity_search_with_relevance_scores("foo", k=3) + output = docsearch._similarity_search_with_relevance_scores("foo", k=3) assert output == [ (Document(page_content="foo", metadata={"page": "1", "id": ids[0]}), 0.0), (Document(page_content="bar", metadata={"page": "2", "id": ids[1]}), 0.25), @@ -329,24 +338,24 @@ def test_with_relevance_score(vdms_client: vdms.vdms) -> None: @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_add_documents_no_metadata(vdms_client: vdms.vdms) -> None: collection_name = "test_add_documents_no_metadata" - embedding_function = FakeEmbeddings() db = VDMS( # type: ignore[call-arg] collection_name=collection_name, - embedding_function=embedding_function, + embedding=embedding_function, client=vdms_client, ) db.add_documents([Document(page_content="foo")]) @pytest.mark.requires("vdms") +@pytest.mark.enable_socket def test_add_documents_mixed_metadata(vdms_client: vdms.vdms) -> None: collection_name = "test_add_documents_mixed_metadata" - embedding_function = FakeEmbeddings() db = VDMS( # type: ignore[call-arg] collection_name=collection_name, - embedding_function=embedding_function, + 
embedding=embedding_function, client=vdms_client, ) diff --git a/libs/community/tests/unit_tests/chat_models/test_imports.py b/libs/community/tests/unit_tests/chat_models/test_imports.py index 3c9b5e22547f0..4c46c7203f42e 100644 --- a/libs/community/tests/unit_tests/chat_models/test_imports.py +++ b/libs/community/tests/unit_tests/chat_models/test_imports.py @@ -11,6 +11,7 @@ "ChatDatabricks", "ChatDeepInfra", "ChatEverlyAI", + "ChatEdenAI", "ChatFireworks", "ChatFriendli", "ChatGooglePalm", diff --git a/libs/community/tests/unit_tests/document_loaders/test_imports.py b/libs/community/tests/unit_tests/document_loaders/test_imports.py index 5cd9ce3d40430..fbf624f537a1d 100644 --- a/libs/community/tests/unit_tests/document_loaders/test_imports.py +++ b/libs/community/tests/unit_tests/document_loaders/test_imports.py @@ -142,6 +142,7 @@ "S3DirectoryLoader", "S3FileLoader", "ScrapflyLoader", + "ScrapingAntLoader", "SQLDatabaseLoader", "SRTLoader", "SeleniumURLLoader", diff --git a/libs/community/tests/unit_tests/llms/test_ollama.py b/libs/community/tests/unit_tests/llms/test_ollama.py index 6d6ce632a8f2e..04da4d137eea7 100644 --- a/libs/community/tests/unit_tests/llms/test_ollama.py +++ b/libs/community/tests/unit_tests/llms/test_ollama.py @@ -31,7 +31,7 @@ def test_pass_headers_if_provided(monkeypatch: MonkeyPatch) -> None: timeout=300, ) - def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-def] + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] assert url == "https://ollama-hostname:8000/api/generate" assert headers == { "Content-Type": "application/json", @@ -49,10 +49,35 @@ def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-d llm.invoke("Test prompt") +def test_pass_auth_if_provided(monkeypatch: MonkeyPatch) -> None: + llm = Ollama( + base_url="https://ollama-hostname:8000", + model="foo", + auth=("Test-User", "Test-Password"), + timeout=300, + ) + + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] + assert url == "https://ollama-hostname:8000/api/generate" + assert headers == { + "Content-Type": "application/json", + } + assert json is not None + assert stream is True + assert timeout == 300 + assert auth == ("Test-User", "Test-Password") + + return mock_response_stream() + + monkeypatch.setattr(requests, "post", mock_post) + + llm.invoke("Test prompt") + + def test_handle_if_headers_not_provided(monkeypatch: MonkeyPatch) -> None: llm = Ollama(base_url="https://ollama-hostname:8000", model="foo", timeout=300) - def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-def] + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] assert url == "https://ollama-hostname:8000/api/generate" assert headers == { "Content-Type": "application/json", @@ -72,7 +97,7 @@ def test_handle_kwargs_top_level_parameters(monkeypatch: MonkeyPatch) -> None: """Test that top level params are sent to the endpoint as top level params""" llm = Ollama(base_url="https://ollama-hostname:8000", model="foo", timeout=300) - def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-def] + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] assert url == "https://ollama-hostname:8000/api/generate" assert headers == { "Content-Type": "application/json", @@ -120,7 +145,7 @@ def test_handle_kwargs_with_unknown_param(monkeypatch: MonkeyPatch) -> None: """ llm = 
Ollama(base_url="https://ollama-hostname:8000", model="foo", timeout=300) - def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-def] + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] assert url == "https://ollama-hostname:8000/api/generate" assert headers == { "Content-Type": "application/json", @@ -169,7 +194,7 @@ def test_handle_kwargs_with_options(monkeypatch: MonkeyPatch) -> None: """ llm = Ollama(base_url="https://ollama-hostname:8000", model="foo", timeout=300) - def mock_post(url, headers, json, stream, timeout): # type: ignore[no-untyped-def] + def mock_post(url, headers, json, stream, timeout, auth): # type: ignore[no-untyped-def] assert url == "https://ollama-hostname:8000/api/generate" assert headers == { "Content-Type": "application/json", diff --git a/libs/core/langchain_core/language_models/chat_models.py b/libs/core/langchain_core/language_models/chat_models.py index d6f2cd7bef8f2..c2485dbe432fe 100644 --- a/libs/core/langchain_core/language_models/chat_models.py +++ b/libs/core/langchain_core/language_models/chat_models.py @@ -60,6 +60,7 @@ Field, root_validator, ) +from langchain_core.rate_limiters import BaseRateLimiter from langchain_core.runnables import RunnableMap, RunnablePassthrough from langchain_core.runnables.config import ensure_config, run_in_executor from langchain_core.tracers._streaming import _StreamingCallbackHandler @@ -210,6 +211,9 @@ class BaseChatModel(BaseLanguageModel[BaseMessage], ABC): callback_manager: Optional[BaseCallbackManager] = Field(default=None, exclude=True) """[DEPRECATED] Callback manager to add to the run trace.""" + rate_limiter: Optional[BaseRateLimiter] = Field(default=None, exclude=True) + """An optional rate limiter to use for limiting the number of requests.""" + @root_validator(pre=True) def raise_deprecation(cls, values: Dict) -> Dict: """Raise deprecation warning if callback_manager is used. @@ -341,6 +345,10 @@ def stream( batch_size=1, ) generation: Optional[ChatGenerationChunk] = None + + if self.rate_limiter: + self.rate_limiter.acquire(blocking=True) + try: for chunk in self._stream(messages, stop=stop, **kwargs): if chunk.message.id is None: @@ -412,6 +420,9 @@ async def astream( batch_size=1, ) + if self.rate_limiter: + self.rate_limiter.acquire(blocking=True) + generation: Optional[ChatGenerationChunk] = None try: async for chunk in self._astream( @@ -742,6 +753,13 @@ def _generate_with_cache( raise ValueError( "Asked to cache, but no cache found at `langchain.cache`." ) + + # Apply the rate limiter after checking the cache, since + # we usually don't want to rate limit cache lookups, but + # we do want to rate limit API requests. + if self.rate_limiter: + self.rate_limiter.acquire(blocking=True) + # If stream is not explicitly set, check if implicitly requested by # astream_events() or astream_log(). Bail out if _stream not implemented if type(self)._stream != BaseChatModel._stream and kwargs.pop( @@ -822,6 +840,13 @@ async def _agenerate_with_cache( raise ValueError( "Asked to cache, but no cache found at `langchain.cache`." ) + + # Apply the rate limiter after checking the cache, since + # we usually don't want to rate limit cache lookups, but + # we do want to rate limit API requests. + if self.rate_limiter: + self.rate_limiter.acquire(blocking=True) + # If stream is not explicitly set, check if implicitly requested by # astream_events() or astream_log(). 
Bail out if _astream not implemented if ( diff --git a/libs/core/langchain_core/rate_limiters.py b/libs/core/langchain_core/rate_limiters.py new file mode 100644 index 0000000000000..02a8853532982 --- /dev/null +++ b/libs/core/langchain_core/rate_limiters.py @@ -0,0 +1,251 @@ +"""Interface for a rate limiter and an in-memory rate limiter.""" + +from __future__ import annotations + +import abc +import asyncio +import threading +import time +from typing import ( + Optional, +) + +from langchain_core._api import beta + + +@beta(message="Introduced in 0.2.24. API subject to change.") +class BaseRateLimiter(abc.ABC): + """Base class for rate limiters. + + Usage of the base limiter is through the acquire and aacquire methods depending + on whether running in a sync or async context. + + Implementations are free to add a timeout parameter to their initialize method + to allow users to specify a timeout for acquiring the necessary tokens when + using a blocking call. + + Current limitations: + + - Rate limiting information is not surfaced in tracing or callbacks. This means + that the total time it takes to invoke a chat model will encompass both + the time spent waiting for tokens and the time spent making the request. + + + .. versionadded:: 0.2.24 + """ + + @abc.abstractmethod + def acquire(self, *, blocking: bool = True) -> bool: + """Attempt to acquire the necessary tokens for the rate limiter. + + This method blocks until the required tokens are available if `blocking` + is set to True. + + If `blocking` is set to False, the method will immediately return the result + of the attempt to acquire the tokens. + + Args: + blocking: If True, the method will block until the tokens are available. + If False, the method will return immediately with the result of + the attempt. Defaults to True. + + Returns: + True if the tokens were successfully acquired, False otherwise. + """ + + @abc.abstractmethod + async def aacquire(self, *, blocking: bool = True) -> bool: + """Attempt to acquire the necessary tokens for the rate limiter. + + This method blocks until the required tokens are available if `blocking` + is set to True. + + If `blocking` is set to False, the method will immediately return the result + of the attempt to acquire the tokens. + + Args: + blocking: If True, the method will block until the tokens are available. + If False, the method will return immediately with the result of + the attempt. Defaults to True. + + Returns: + True if the tokens were successfully acquired, False otherwise. + """ + + +@beta(message="Introduced in 0.2.24. API subject to change.") +class InMemoryRateLimiter(BaseRateLimiter): + """An in memory rate limiter based on a token bucket algorithm. + + This is an in memory rate limiter, so it cannot rate limit across + different processes. + + The rate limiter only allows time-based rate limiting and does not + take into account any information about the input or the output, so it + cannot be used to rate limit based on the size of the request. + + It is thread safe and can be used in either a sync or async context. + + The in memory rate limiter is based on a token bucket. The bucket is filled + with tokens at a given rate. Each request consumes a token. If there are + not enough tokens in the bucket, the request is blocked until there are + enough tokens. + + These *tokens* have NOTHING to do with LLM tokens. They are just + a way to keep track of how many requests can be made at a given time. 
+ + Current limitations: + + - The rate limiter is not designed to work across different processes. It is + an in-memory rate limiter, but it is thread safe. + - The rate limiter only supports time-based rate limiting. It does not take + into account the size of the request or any other factors. + + Example: + + .. code-block:: python + + from langchain_core import InMemoryRateLimiter + + from langchain_core.runnables import RunnableLambda, InMemoryRateLimiter + + rate_limiter = InMemoryRateLimiter( + requests_per_second=100, check_every_n_seconds=0.1, max_bucket_size=10 + ) + + def foo(x: int) -> int: + return x + + foo_ = RunnableLambda(foo) + chain = rate_limiter | foo_ + assert chain.invoke(1) == 1 + + .. versionadded:: 0.2.24 + """ + + def __init__( + self, + *, + requests_per_second: float = 1, + check_every_n_seconds: float = 0.1, + max_bucket_size: float = 1, + ) -> None: + """A rate limiter based on a token bucket. + + These *tokens* have NOTHING to do with LLM tokens. They are just + a way to keep track of how many requests can be made at a given time. + + This rate limiter is designed to work in a threaded environment. + + It works by filling up a bucket with tokens at a given rate. Each + request consumes a given number of tokens. If there are not enough + tokens in the bucket, the request is blocked until there are enough + tokens. + + Args: + requests_per_second: The number of tokens to add per second to the bucket. + Must be at least 1. The tokens represent "credit" that can be used + to make requests. + check_every_n_seconds: check whether the tokens are available + every this many seconds. Can be a float to represent + fractions of a second. + max_bucket_size: The maximum number of tokens that can be in the bucket. + This is used to prevent bursts of requests. + """ + # Number of requests that we can make per second. + self.requests_per_second = requests_per_second + # Number of tokens in the bucket. + self.available_tokens = 0.0 + self.max_bucket_size = max_bucket_size + # A lock to ensure that tokens can only be consumed by one thread + # at a given time. + self._consume_lock = threading.Lock() + # The last time we tried to consume tokens. + self.last: Optional[float] = None + self.check_every_n_seconds = check_every_n_seconds + + def _consume(self) -> bool: + """Try to consume a token. + + Returns: + True means that the tokens were consumed, and the caller can proceed to + make the request. A False means that the tokens were not consumed, and + the caller should try again later. + """ + with self._consume_lock: + now = time.time() + + # initialize on first call to avoid a burst + if self.last is None: + self.last = now + + elapsed = now - self.last + + if elapsed * self.requests_per_second >= 1: + self.available_tokens += elapsed * self.requests_per_second + self.last = now + + # Make sure that we don't exceed the bucket size. + # This is used to prevent bursts of requests. + self.available_tokens = min(self.available_tokens, self.max_bucket_size) + + # As long as we have at least one token, we can proceed. + if self.available_tokens >= 1: + self.available_tokens -= 1 + return True + + return False + + def acquire(self, *, blocking: bool = True) -> bool: + """Attempt to acquire a token from the rate limiter. + + This method blocks until the required tokens are available if `blocking` + is set to True. + + If `blocking` is set to False, the method will immediately return the result + of the attempt to acquire the tokens. 
+ + Args: + blocking: If True, the method will block until the tokens are available. + If False, the method will return immediately with the result of + the attempt. Defaults to True. + + Returns: + True if the tokens were successfully acquired, False otherwise. + """ + if not blocking: + return self._consume() + + while not self._consume(): + time.sleep(self.check_every_n_seconds) + return True + + async def aacquire(self, *, blocking: bool = True) -> bool: + """Attempt to acquire a token from the rate limiter. Async version. + + This method blocks until the required tokens are available if `blocking` + is set to True. + + If `blocking` is set to False, the method will immediately return the result + of the attempt to acquire the tokens. + + Args: + blocking: If True, the method will block until the tokens are available. + If False, the method will return immediately with the result of + the attempt. Defaults to True. + + Returns: + True if the tokens were successfully acquired, False otherwise. + """ + if not blocking: + return self._consume() + + while not self._consume(): + await asyncio.sleep(self.check_every_n_seconds) + return True + + +__all__ = [ + "BaseRateLimiter", + "InMemoryRateLimiter", +] diff --git a/libs/core/langchain_core/runnables/graph_png.py b/libs/core/langchain_core/runnables/graph_png.py index ae3edaa7da088..9cd48b459492c 100644 --- a/libs/core/langchain_core/runnables/graph_png.py +++ b/libs/core/langchain_core/runnables/graph_png.py @@ -174,7 +174,9 @@ def add_edges(self, viz: Any, graph: Graph) -> None: graph: The graph to draw. """ for start, end, data, cond in graph.edges: - self.add_edge(viz, start, end, str(data), cond) + self.add_edge( + viz, start, end, str(data) if data is not None else None, cond + ) def update_styles(self, viz: Any, graph: Graph) -> None: """Update the styles of the entrypoint and END nodes. 
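
Because the new `rate_limiter` field on `BaseChatModel` (added in the chat_models.py hunk above) is acquired on every non-cached generate, stream, and async call, a small end-to-end sketch may help reviewers see the intended wiring. This is illustrative only and follows the pattern used in the rate-limiting unit tests added later in this patch; `GenericFakeChatModel` is a test helper, used here so the snippet runs without network access, and the parameter values are arbitrary.

```python
# Sketch only: GenericFakeChatModel is a fake model used for tests, so this
# runs offline; the limiter parameters are illustrative, not recommendations.
from langchain_core.language_models import GenericFakeChatModel
from langchain_core.rate_limiters import InMemoryRateLimiter

rate_limiter = InMemoryRateLimiter(
    requests_per_second=2,      # token-bucket refill rate
    check_every_n_seconds=0.1,  # how often a blocked acquire() re-checks
    max_bucket_size=2,          # cap on accumulated burst credit
)

model = GenericFakeChatModel(
    messages=iter(["hello", "world", "!"]),
    rate_limiter=rate_limiter,
)

# Each invoke() first calls rate_limiter.acquire(blocking=True), so these
# three requests are spaced out to roughly requests_per_second.
for _ in range(3):
    print(model.invoke("ping").content)
```
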
diff --git a/libs/core/langchain_core/tools.py b/libs/core/langchain_core/tools.py index 85db1d3b92eb4..633faa7b343e9 100644 --- a/libs/core/langchain_core/tools.py +++ b/libs/core/langchain_core/tools.py @@ -20,6 +20,7 @@ from __future__ import annotations import asyncio +import copy import functools import inspect import json @@ -1481,8 +1482,9 @@ def _prep_run_args( ) -> Tuple[Union[str, Dict], Dict]: config = ensure_config(config) if _is_tool_call(input): - tool_call_id: Optional[str] = cast(ToolCall, input)["id"] - tool_input: Union[str, dict] = cast(ToolCall, input)["args"] + input_copy = copy.deepcopy(input) + tool_call_id: Optional[str] = cast(ToolCall, input_copy)["id"] + tool_input: Union[str, dict] = cast(ToolCall, input_copy)["args"] else: tool_call_id = None tool_input = cast(Union[str, dict], input) diff --git a/libs/core/langchain_core/vectorstores/in_memory.py b/libs/core/langchain_core/vectorstores/in_memory.py index e284d0b509ec6..3ac363f22daf9 100644 --- a/libs/core/langchain_core/vectorstores/in_memory.py +++ b/libs/core/langchain_core/vectorstores/in_memory.py @@ -8,7 +8,6 @@ Any, Callable, Dict, - Iterable, List, Optional, Sequence, @@ -74,6 +73,27 @@ def upsert(self, items: Sequence[Document], /, **kwargs: Any) -> UpsertResponse: "failed": [], } + async def aupsert( + self, items: Sequence[Document], /, **kwargs: Any + ) -> UpsertResponse: + vectors = await self.embedding.aembed_documents( + [item.page_content for item in items] + ) + ids = [] + for item, vector in zip(items, vectors): + doc_id = item.id if item.id else str(uuid.uuid4()) + ids.append(doc_id) + self.store[doc_id] = { + "id": doc_id, + "vector": vector, + "text": item.page_content, + "metadata": item.metadata, + } + return { + "succeeded": ids, + "failed": [], + } + def get_by_ids(self, ids: Sequence[str], /) -> List[Document]: """Get documents by their ids. 
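
The `aupsert` method added to `InMemoryVectorStore` above gives the in-memory store a native async write path instead of delegating to the sync implementation. A brief usage sketch, not part of the patch, is shown below; `DeterministicFakeEmbedding` keeps it runnable offline, and the document ids and texts are arbitrary.

```python
# Sketch only: exercises the new async upsert added above; ids and texts are
# arbitrary, and DeterministicFakeEmbedding avoids any model downloads.
import asyncio

from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore


async def main() -> None:
    store = InMemoryVectorStore(embedding=DeterministicFakeEmbedding(size=6))
    # aupsert embeds page_content via aembed_documents and stores
    # id/vector/text/metadata entries, mirroring the sync upsert.
    response = await store.aupsert(
        [
            Document(page_content="foo", id="1"),
            Document(page_content="bar", id="2"),
        ]
    )
    print(response["succeeded"])                    # ['1', '2']
    print(store.get_by_ids(["1"])[0].page_content)  # foo


asyncio.run(main())
```
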
@@ -108,14 +128,6 @@ async def aget_by_ids(self, ids: Sequence[str], /) -> List[Document]: """ return self.get_by_ids(ids) - async def aadd_texts( - self, - texts: Iterable[str], - metadatas: Optional[List[dict]] = None, - **kwargs: Any, - ) -> List[str]: - return self.add_texts(texts, metadatas, **kwargs) - def _similarity_search_with_score_by_vector( self, embedding: List[float], @@ -172,7 +184,13 @@ def similarity_search_with_score( async def asimilarity_search_with_score( self, query: str, k: int = 4, **kwargs: Any ) -> List[Tuple[Document, float]]: - return self.similarity_search_with_score(query, k, **kwargs) + embedding = await self.embedding.aembed_query(query) + docs = self.similarity_search_with_score_by_vector( + embedding, + k, + **kwargs, + ) + return docs def similarity_search_by_vector( self, @@ -200,7 +218,10 @@ def similarity_search( async def asimilarity_search( self, query: str, k: int = 4, **kwargs: Any ) -> List[Document]: - return self.similarity_search(query, k, **kwargs) + return [ + doc + for doc, _ in await self.asimilarity_search_with_score(query, k, **kwargs) + ] def max_marginal_relevance_search_by_vector( self, @@ -249,6 +270,23 @@ def max_marginal_relevance_search( **kwargs, ) + async def amax_marginal_relevance_search( + self, + query: str, + k: int = 4, + fetch_k: int = 20, + lambda_mult: float = 0.5, + **kwargs: Any, + ) -> List[Document]: + embedding_vector = await self.embedding.aembed_query(query) + return self.max_marginal_relevance_search_by_vector( + embedding_vector, + k, + fetch_k, + lambda_mult=lambda_mult, + **kwargs, + ) + @classmethod def from_texts( cls, @@ -271,7 +309,11 @@ async def afrom_texts( metadatas: Optional[List[dict]] = None, **kwargs: Any, ) -> "InMemoryVectorStore": - return cls.from_texts(texts, embedding, metadatas, **kwargs) + store = cls( + embedding=embedding, + ) + await store.aadd_texts(texts=texts, metadatas=metadatas, **kwargs) + return store @classmethod def load( diff --git a/libs/core/tests/unit_tests/language_models/chat_models/test_rate_limiting.py b/libs/core/tests/unit_tests/language_models/chat_models/test_rate_limiting.py new file mode 100644 index 0000000000000..b15b202f4848e --- /dev/null +++ b/libs/core/tests/unit_tests/language_models/chat_models/test_rate_limiting.py @@ -0,0 +1,258 @@ +import time + +from langchain_core.caches import InMemoryCache +from langchain_core.language_models import GenericFakeChatModel +from langchain_core.rate_limiters import InMemoryRateLimiter + + +def test_rate_limit_invoke() -> None: + """Add rate limiter.""" + + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=200, check_every_n_seconds=0.01, max_bucket_size=10 + ), + ) + tic = time.time() + model.invoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert 0.01 < toc - tic < 0.02 + + tic = time.time() + model.invoke("foo") + toc = time.time() + # The second time we call the model, we should have 1 extra token + # to proceed immediately. + assert toc - tic < 0.005 + + # The third time we call the model, we need to wait again for a token + tic = time.time() + model.invoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. 
+ assert 0.01 < toc - tic < 0.02 + + +async def test_rate_limit_ainvoke() -> None: + """Add rate limiter.""" + + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=20, check_every_n_seconds=0.1, max_bucket_size=10 + ), + ) + tic = time.time() + await model.ainvoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert 0.1 < toc - tic < 0.2 + + tic = time.time() + await model.ainvoke("foo") + toc = time.time() + # The second time we call the model, we should have 1 extra token + # to proceed immediately. + assert toc - tic < 0.01 + + # The third time we call the model, we need to wait again for a token + tic = time.time() + await model.ainvoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert 0.1 < toc - tic < 0.2 + + +def test_rate_limit_batch() -> None: + """Test that batch and stream calls work with rate limiters.""" + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=200, check_every_n_seconds=0.01, max_bucket_size=10 + ), + ) + # Need 2 tokens to proceed + time_to_fill = 2 / 200.0 + tic = time.time() + model.batch(["foo", "foo"]) + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert time_to_fill < toc - tic < time_to_fill + 0.01 + + +async def test_rate_limit_abatch() -> None: + """Test that batch and stream calls work with rate limiters.""" + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=200, check_every_n_seconds=0.01, max_bucket_size=10 + ), + ) + # Need 2 tokens to proceed + time_to_fill = 2 / 200.0 + tic = time.time() + await model.abatch(["foo", "foo"]) + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. 
+ assert time_to_fill < toc - tic < time_to_fill + 0.01 + + +def test_rate_limit_stream() -> None: + """Test rate limit by stream.""" + model = GenericFakeChatModel( + messages=iter(["hello world", "hello world", "hello world"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=200, check_every_n_seconds=0.01, max_bucket_size=10 + ), + ) + # Check astream + tic = time.time() + response = list(model.stream("foo")) + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + assert 0.01 < toc - tic < 0.02 # Slightly smaller than check every n seconds + + # Second time around we should have 1 token left + tic = time.time() + response = list(model.stream("foo")) + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + assert toc - tic < 0.005 # Slightly smaller than check every n seconds + + # Third time around we should have 0 tokens left + tic = time.time() + response = list(model.stream("foo")) + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + assert 0.01 < toc - tic < 0.02 # Slightly smaller than check every n seconds + + +async def test_rate_limit_astream() -> None: + """Test rate limiting astream.""" + rate_limiter = InMemoryRateLimiter( + requests_per_second=20, check_every_n_seconds=0.1, max_bucket_size=10 + ) + model = GenericFakeChatModel( + messages=iter(["hello world", "hello world", "hello world"]), + rate_limiter=rate_limiter, + ) + # Check astream + tic = time.time() + response = [chunk async for chunk in model.astream("foo")] + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + assert 0.1 < toc - tic < 0.2 + + # Second time around we should have 1 token left + tic = time.time() + response = [chunk async for chunk in model.astream("foo")] + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + assert toc - tic < 0.01 # Slightly smaller than check every n seconds + + # Third time around we should have 0 tokens left + tic = time.time() + response = [chunk async for chunk in model.astream("foo")] + assert [msg.content for msg in response] == ["hello", " ", "world"] + toc = time.time() + assert 0.1 < toc - tic < 0.2 + + +def test_rate_limit_skips_cache() -> None: + """Test that rate limiting does not rate limit cache look ups.""" + cache = InMemoryCache() + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=100, check_every_n_seconds=0.01, max_bucket_size=1 + ), + cache=cache, + ) + + tic = time.time() + model.invoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert 0.01 < toc - tic < 0.02 + + for _ in range(2): + # Cache hits + tic = time.time() + model.invoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. 
+ assert toc - tic < 0.005 + + # Test verifies that there's only a single key + # Test also verifies that rate_limiter information is not part of the + # cache key + assert list(cache._cache) == [ + ( + '[{"lc": 1, "type": "constructor", "id": ["langchain", "schema", ' + '"messages", ' + '"HumanMessage"], "kwargs": {"content": "foo", "type": "human"}}]', + "[('_type', 'generic-fake-chat-model'), ('stop', None)]", + ) + ] + + +class SerializableModel(GenericFakeChatModel): + @classmethod + def is_lc_serializable(cls) -> bool: + return True + + +def test_serialization_with_rate_limiter() -> None: + """Test model serialization with rate limiter.""" + from langchain_core.load import dumps + + model = SerializableModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=100, check_every_n_seconds=0.01, max_bucket_size=1 + ), + ) + serialized_model = dumps(model) + assert InMemoryRateLimiter.__name__ not in serialized_model + + +async def test_rate_limit_skips_cache_async() -> None: + """Test that rate limiting does not rate limit cache look ups.""" + cache = InMemoryCache() + model = GenericFakeChatModel( + messages=iter(["hello", "world", "!"]), + rate_limiter=InMemoryRateLimiter( + requests_per_second=100, check_every_n_seconds=0.01, max_bucket_size=1 + ), + cache=cache, + ) + + tic = time.time() + await model.ainvoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert 0.01 < toc - tic < 0.02 + + for _ in range(2): + # Cache hits + tic = time.time() + await model.ainvoke("foo") + toc = time.time() + # Should be larger than check every n seconds since the token bucket starts + # with 0 tokens. + assert toc - tic < 0.005 diff --git a/libs/core/tests/unit_tests/rate_limiters/__init__.py b/libs/core/tests/unit_tests/rate_limiters/__init__.py new file mode 100644 index 0000000000000..e69de29bb2d1d diff --git a/libs/core/tests/unit_tests/rate_limiters/test_in_memory_rate_limiter.py b/libs/core/tests/unit_tests/rate_limiters/test_in_memory_rate_limiter.py new file mode 100644 index 0000000000000..914b9d9426271 --- /dev/null +++ b/libs/core/tests/unit_tests/rate_limiters/test_in_memory_rate_limiter.py @@ -0,0 +1,110 @@ +"""Test rate limiter.""" + +import time + +import pytest +from freezegun import freeze_time + +from langchain_core.rate_limiters import InMemoryRateLimiter + + +@pytest.fixture +def rate_limiter() -> InMemoryRateLimiter: + """Return an instance of InMemoryRateLimiter.""" + return InMemoryRateLimiter( + requests_per_second=2, check_every_n_seconds=0.1, max_bucket_size=2 + ) + + +def test_initial_state(rate_limiter: InMemoryRateLimiter) -> None: + """Test the initial state of the rate limiter.""" + assert rate_limiter.available_tokens == 0.0 + + +def test_sync_wait(rate_limiter: InMemoryRateLimiter) -> None: + with freeze_time("2023-01-01 00:00:00") as frozen_time: + rate_limiter.last = time.time() + assert not rate_limiter.acquire(blocking=False) + frozen_time.tick(0.1) # Increment by 0.1 seconds + assert rate_limiter.available_tokens == 0 + assert not rate_limiter.acquire(blocking=False) + frozen_time.tick(0.1) # Increment by 0.1 seconds + assert rate_limiter.available_tokens == 0 + assert not rate_limiter.acquire(blocking=False) + frozen_time.tick(1.8) + assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 1.0 + assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 0 + frozen_time.tick(2.1) + 
assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 1 + frozen_time.tick(0.9) + assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 1 + + # Check max bucket size + frozen_time.tick(100) + assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 1 + + +async def test_async_wait(rate_limiter: InMemoryRateLimiter) -> None: + with freeze_time("2023-01-01 00:00:00") as frozen_time: + rate_limiter.last = time.time() + assert not await rate_limiter.aacquire(blocking=False) + frozen_time.tick(0.1) # Increment by 0.1 seconds + assert rate_limiter.available_tokens == 0 + assert not await rate_limiter.aacquire(blocking=False) + frozen_time.tick(0.1) # Increment by 0.1 seconds + assert rate_limiter.available_tokens == 0 + assert not await rate_limiter.aacquire(blocking=False) + frozen_time.tick(1.8) + assert await rate_limiter.aacquire(blocking=False) + assert rate_limiter.available_tokens == 1.0 + assert await rate_limiter.aacquire(blocking=False) + assert rate_limiter.available_tokens == 0 + frozen_time.tick(2.1) + assert await rate_limiter.aacquire(blocking=False) + assert rate_limiter.available_tokens == 1 + frozen_time.tick(0.9) + assert await rate_limiter.aacquire(blocking=False) + assert rate_limiter.available_tokens == 1 + + +def test_sync_wait_max_bucket_size() -> None: + with freeze_time("2023-01-01 00:00:00") as frozen_time: + rate_limiter = InMemoryRateLimiter( + requests_per_second=2, check_every_n_seconds=0.1, max_bucket_size=500 + ) + rate_limiter.last = time.time() + frozen_time.tick(100) # Increment by 100 seconds + assert rate_limiter.acquire(blocking=False) + # After 100 seconds we manage to refill the bucket with 200 tokens + # After consuming 1 token, we should have 199 tokens left + assert rate_limiter.available_tokens == 199.0 + frozen_time.tick(10000) + assert rate_limiter.acquire(blocking=False) + assert rate_limiter.available_tokens == 499.0 + # Assert that sync wait can proceed without blocking + # since we have enough tokens + rate_limiter.acquire(blocking=True) + + +async def test_async_wait_max_bucket_size() -> None: + with freeze_time("2023-01-01 00:00:00") as frozen_time: + rate_limiter = InMemoryRateLimiter( + requests_per_second=2, check_every_n_seconds=0.1, max_bucket_size=500 + ) + rate_limiter.last = time.time() + frozen_time.tick(100) # Increment by 100 seconds + assert await rate_limiter.aacquire(blocking=False) + # After 100 seconds we manage to refill the bucket with 200 tokens + # After consuming 1 token, we should have 199 tokens left + assert rate_limiter.available_tokens == 199.0 + frozen_time.tick(10000) + assert await rate_limiter.aacquire(blocking=False) + assert rate_limiter.available_tokens == 499.0 + # Assert that sync wait can proceed without blocking + # since we have enough tokens + await rate_limiter.aacquire(blocking=True) diff --git a/libs/core/tests/unit_tests/test_tools.py b/libs/core/tests/unit_tests/test_tools.py index 7f0fccbd050ef..489feeb7dca55 100644 --- a/libs/core/tests/unit_tests/test_tools.py +++ b/libs/core/tests/unit_tests/test_tools.py @@ -977,6 +977,16 @@ async def _arun(self, bar: Any, bar_config: RunnableConfig, **kwargs: Any) -> An def test_tool_pass_config(tool: BaseTool) -> None: assert tool.invoke({"bar": "baz"}, {"configurable": {"foo": "not-bar"}}) == "baz" + # Test tool calls + tool_call = { + "name": tool.name, + "args": {"bar": "baz"}, + "id": "abc123", + "type": "tool_call", + } + _ = tool.invoke(tool_call, 
{"configurable": {"foo": "not-bar"}}) + assert tool_call["args"] == {"bar": "baz"} + @pytest.mark.parametrize( "tool", [foo, afoo, simple_foo, asimple_foo, FooBase(), AFooBase()] diff --git a/libs/core/tests/unit_tests/vectorstores/test_in_memory.py b/libs/core/tests/unit_tests/vectorstores/test_in_memory.py index 057d5321f4e1e..8a3a3b6407d49 100644 --- a/libs/core/tests/unit_tests/vectorstores/test_in_memory.py +++ b/libs/core/tests/unit_tests/vectorstores/test_in_memory.py @@ -1,4 +1,5 @@ from pathlib import Path +from unittest.mock import AsyncMock, Mock import pytest from langchain_standard_tests.integration_tests.vectorstores import ( @@ -24,25 +25,39 @@ async def vectorstore(self) -> InMemoryVectorStore: return InMemoryVectorStore(embedding=self.get_embeddings()) -async def test_inmemory() -> None: - """Test end to end construction and search.""" +async def test_inmemory_similarity_search() -> None: + """Test end to end similarity search.""" store = await InMemoryVectorStore.afrom_texts( - ["foo", "bar", "baz"], DeterministicFakeEmbedding(size=6) + ["foo", "bar", "baz"], DeterministicFakeEmbedding(size=3) ) - output = await store.asimilarity_search("foo", k=1) + + # Check sync version + output = store.similarity_search("foo", k=1) assert output == [Document(page_content="foo", id=AnyStr())] + # Check async version output = await store.asimilarity_search("bar", k=2) assert output == [ Document(page_content="bar", id=AnyStr()), Document(page_content="baz", id=AnyStr()), ] - output2 = await store.asimilarity_search_with_score("bar", k=2) - assert output2[0][1] > output2[1][1] + +async def test_inmemory_similarity_search_with_score() -> None: + """Test end to end similarity search with score""" + store = await InMemoryVectorStore.afrom_texts( + ["foo", "bar", "baz"], DeterministicFakeEmbedding(size=3) + ) + + output = store.similarity_search_with_score("foo", k=1) + assert output[0][0].page_content == "foo" + + output = await store.asimilarity_search_with_score("bar", k=2) + assert output[0][1] > output[1][1] async def test_add_by_ids() -> None: + """Test add texts with ids.""" vectorstore = InMemoryVectorStore(embedding=DeterministicFakeEmbedding(size=6)) # Check sync version @@ -50,17 +65,25 @@ async def test_add_by_ids() -> None: assert ids1 == ["1", "2", "3"] assert sorted(vectorstore.store.keys()) == ["1", "2", "3"] + # Check async version ids2 = await vectorstore.aadd_texts(["foo", "bar", "baz"], ids=["4", "5", "6"]) assert ids2 == ["4", "5", "6"] assert sorted(vectorstore.store.keys()) == ["1", "2", "3", "4", "5", "6"] async def test_inmemory_mmr() -> None: + """Test MMR search""" texts = ["foo", "foo", "fou", "foy"] docsearch = await InMemoryVectorStore.afrom_texts( texts, DeterministicFakeEmbedding(size=6) ) # make sure we can k > docstore size + output = docsearch.max_marginal_relevance_search("foo", k=10, lambda_mult=0.1) + assert len(output) == len(texts) + assert output[0] == Document(page_content="foo", id=AnyStr()) + assert output[1] == Document(page_content="foy", id=AnyStr()) + + # Check async version output = await docsearch.amax_marginal_relevance_search( "foo", k=10, lambda_mult=0.1 ) @@ -85,13 +108,91 @@ async def test_inmemory_dump_load(tmp_path: Path) -> None: async def test_inmemory_filter() -> None: - """Test end to end construction and search.""" + """Test end to end construction and search with filter.""" store = await InMemoryVectorStore.afrom_texts( ["foo", "bar"], DeterministicFakeEmbedding(size=6), [{"id": 1}, {"id": 2}], ) + + # Check sync version + output 
= store.similarity_search("fee", filter=lambda doc: doc.metadata["id"] == 1) + assert output == [Document(page_content="foo", metadata={"id": 1}, id=AnyStr())] + + # filter with not stored document id output = await store.asimilarity_search( - "baz", filter=lambda doc: doc.metadata["id"] == 1 + "baz", filter=lambda doc: doc.metadata["id"] == 3 ) - assert output == [Document(page_content="foo", metadata={"id": 1}, id=AnyStr())] + assert output == [] + + +async def test_inmemory_upsert() -> None: + """Test upsert documents.""" + embedding = DeterministicFakeEmbedding(size=2) + store = InMemoryVectorStore(embedding=embedding) + + # Check sync version + store.upsert([Document(page_content="foo", id="1")]) + assert sorted(store.store.keys()) == ["1"] + + # Check async version + await store.aupsert([Document(page_content="bar", id="2")]) + assert sorted(store.store.keys()) == ["1", "2"] + + # update existing document + await store.aupsert( + [Document(page_content="baz", id="2", metadata={"metadata": "value"})] + ) + item = store.store["2"] + + baz_vector = embedding.embed_query("baz") + assert item == { + "id": "2", + "text": "baz", + "vector": baz_vector, + "metadata": {"metadata": "value"}, + } + + +async def test_inmemory_get_by_ids() -> None: + """Test get by ids.""" + + store = InMemoryVectorStore(embedding=DeterministicFakeEmbedding(size=3)) + + store.upsert( + [ + Document(page_content="foo", id="1", metadata={"metadata": "value"}), + Document(page_content="bar", id="2"), + Document(page_content="baz", id="3"), + ], + ) + + # Check sync version + output = store.get_by_ids(["1", "2"]) + assert output == [ + Document(page_content="foo", id="1", metadata={"metadata": "value"}), + Document(page_content="bar", id="2"), + ] + + # Check async version + output = await store.aget_by_ids(["1", "3", "5"]) + assert output == [ + Document(page_content="foo", id="1", metadata={"metadata": "value"}), + Document(page_content="baz", id="3"), + ] + + +async def test_inmemory_call_embeddings_async() -> None: + embeddings_mock = Mock( + wraps=DeterministicFakeEmbedding(size=3), + aembed_documents=AsyncMock(), + aembed_query=AsyncMock(), + ) + store = InMemoryVectorStore(embedding=embeddings_mock) + + await store.aadd_texts("foo") + await store.asimilarity_search("foo", k=1) + + # Ensure the async embedding function is called + assert embeddings_mock.aembed_documents.await_count == 1 + assert embeddings_mock.aembed_query.await_count == 1 diff --git a/libs/experimental/langchain_experimental/llms/ollama_functions.py b/libs/experimental/langchain_experimental/llms/ollama_functions.py index ba68a7f3eaa6a..abf6aa753dc9f 100644 --- a/libs/experimental/langchain_experimental/llms/ollama_functions.py +++ b/libs/experimental/langchain_experimental/llms/ollama_functions.py @@ -15,6 +15,7 @@ ) from langchain_community.chat_models.ollama import ChatOllama +from langchain_core._api import deprecated from langchain_core.callbacks import ( AsyncCallbackManagerForLLMRun, CallbackManagerForLLMRun, @@ -132,6 +133,9 @@ def parse_response(message: BaseMessage) -> str: raise ValueError(f"`message` is not an instance of `AIMessage`: {message}") +@deprecated( # type: ignore[arg-type] + since="0.0.64", removal="0.4.0", alternative_import="langchain_ollama.ChatOllama" +) class OllamaFunctions(ChatOllama): """Function chat model that uses Ollama API.""" diff --git a/libs/experimental/poetry.lock b/libs/experimental/poetry.lock index 2a24215163bc4..059120264b786 100644 --- a/libs/experimental/poetry.lock +++ 
b/libs/experimental/poetry.lock @@ -1,4 +1,4 @@ -# This file is automatically @generated by Poetry 1.8.2 and should not be changed by hand. +# This file is automatically @generated by Poetry 1.6.1 and should not be changed by hand. [[package]] name = "aiohttp" @@ -1461,7 +1461,7 @@ files = [ [[package]] name = "langchain" -version = "0.2.6" +version = "0.2.11" description = "Building applications with LLMs through composability" optional = false python-versions = ">=3.8.1,<4.0" @@ -1471,7 +1471,7 @@ develop = true [package.dependencies] aiohttp = "^3.8.3" async-timeout = {version = "^4.0.0", markers = "python_version < \"3.11\""} -langchain-core = "^0.2.10" +langchain-core = "^0.2.23" langchain-text-splitters = "^0.2.0" langsmith = "^0.1.17" numpy = [ @@ -1490,7 +1490,7 @@ url = "../langchain" [[package]] name = "langchain-community" -version = "0.2.6" +version = "0.2.10" description = "Community contributed LangChain integrations." optional = false python-versions = ">=3.8.1,<4.0" @@ -1500,8 +1500,8 @@ develop = true [package.dependencies] aiohttp = "^3.8.3" dataclasses-json = ">= 0.5.7, < 0.7" -langchain = "^0.2.6" -langchain-core = "^0.2.10" +langchain = "^0.2.9" +langchain-core = "^0.2.23" langsmith = "^0.1.0" numpy = [ {version = ">=1,<2", markers = "python_version < \"3.12\""}, @@ -1518,7 +1518,7 @@ url = "../community" [[package]] name = "langchain-core" -version = "0.2.11" +version = "0.2.23" description = "Building applications with LLMs through composability" optional = false python-versions = ">=3.8.1,<4.0" @@ -1542,7 +1542,7 @@ url = "../core" [[package]] name = "langchain-openai" -version = "0.1.13" +version = "0.1.18" description = "An integration package connecting OpenAI and LangChain" optional = false python-versions = ">=3.8.1,<4.0" @@ -1550,7 +1550,7 @@ files = [] develop = true [package.dependencies] -langchain-core = ">=0.2.2,<0.3" +langchain-core = "^0.2.20" openai = "^1.32.0" tiktoken = ">=0.7,<1" @@ -2622,7 +2622,6 @@ files = [ {file = "PyYAML-6.0.1-cp311-cp311-win_amd64.whl", hash = "sha256:bf07ee2fef7014951eeb99f56f39c9bb4af143d8aa3c21b1677805985307da34"}, {file = "PyYAML-6.0.1-cp312-cp312-macosx_10_9_x86_64.whl", hash = "sha256:855fb52b0dc35af121542a76b9a84f8d1cd886ea97c84703eaa6d88e37a2ad28"}, {file = "PyYAML-6.0.1-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:40df9b996c2b73138957fe23a16a4f0ba614f4c0efce1e9406a184b6d07fa3a9"}, - {file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a08c6f0fe150303c1c6b71ebcd7213c2858041a7e01975da3a99aed1e7a378ef"}, {file = "PyYAML-6.0.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6c22bec3fbe2524cde73d7ada88f6566758a8f7227bfbf93a408a9d86bcc12a0"}, {file = "PyYAML-6.0.1-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:8d4e9c88387b0f5c7d5f281e55304de64cf7f9c0021a3525bd3b1c542da3b0e4"}, {file = "PyYAML-6.0.1-cp312-cp312-win32.whl", hash = "sha256:d483d2cdf104e7c9fa60c544d92981f12ad66a457afae824d146093b8c294c54"}, @@ -3704,4 +3703,4 @@ test = ["big-O", "importlib-resources", "jaraco.functools", "jaraco.itertools", [metadata] lock-version = "2.0" python-versions = ">=3.8.1,<4.0" -content-hash = "1c7a8eae7e62464f7bd2eb9bda374c221ec0eb5c286aa61526c0d76db9326aed" +content-hash = "dec43275b986de6578d6cd1236d077a7c74441d172319762d452d6b4a5db094c" diff --git a/libs/experimental/pyproject.toml b/libs/experimental/pyproject.toml index cdf6f660287a6..0084b16640852 100644 --- a/libs/experimental/pyproject.toml +++ b/libs/experimental/pyproject.toml @@ -22,7 
+22,7 @@ exclude = [ "notebooks", "examples", "example_data",] [tool.poetry.dependencies] python = ">=3.8.1,<4.0" -langchain-core = "^0.2.10" +langchain-core = "^0.2.23" langchain-community = "^0.2.6" [tool.ruff.lint] diff --git a/libs/experimental/tests/unit_tests/test_imports.py b/libs/experimental/tests/unit_tests/test_imports.py index 8db98f5c2de8b..7da7cb3f8fc1a 100644 --- a/libs/experimental/tests/unit_tests/test_imports.py +++ b/libs/experimental/tests/unit_tests/test_imports.py @@ -1,15 +1,46 @@ -import glob import importlib from pathlib import Path +PKG_ROOT = Path(__file__).parent.parent.parent +PKG_CODE = PKG_ROOT / "langchain_experimental" + def test_importable_all() -> None: - for path in glob.glob("../experimental/langchain_experimental/*"): - relative_path = Path(path).parts[-1] + """Test that all modules in langchain_experimental are importable.""" + failures = [] + found_at_least_one = False + for path in PKG_CODE.rglob("*.py"): + relative_path = str(Path(path).relative_to(PKG_CODE)).replace("/", ".") if relative_path.endswith(".typed"): continue - module_name = relative_path.split(".")[0] - module = importlib.import_module("langchain_experimental." + module_name) + if relative_path.endswith("/__init__.py"): + # Then strip __init__.py + s = "/__init__.py" + module_name = relative_path[: -len(s)] + else: # just strip .py + module_name = relative_path[:-3] + + if not module_name: + continue + try: + module = importlib.import_module("langchain_experimental." + module_name) + except ImportError: + failures.append("langchain_experimental." + module_name) + continue + all_ = getattr(module, "__all__", []) for cls_ in all_: - getattr(module, cls_) + try: + getattr(module, cls_) + except AttributeError: + failures.append(f"{module_name}.{cls_}") + + found_at_least_one = True + + if failures: + raise AssertionError( + "The following modules or classes could not be imported: " + + ", ".join(failures) + ) + + assert found_at_least_one is True diff --git a/libs/partners/groq/langchain_groq/chat_models.py b/libs/partners/groq/langchain_groq/chat_models.py index 661ca4b5919ed..c7bed62bb803e 100644 --- a/libs/partners/groq/langchain_groq/chat_models.py +++ b/libs/partners/groq/langchain_groq/chat_models.py @@ -95,13 +95,6 @@ class ChatGroq(BaseChatModel): Any parameters that are valid to be passed to the groq.create call can be passed in, even if not explicitly saved on this class. - Example: - .. code-block:: python - - from langchain_groq import ChatGroq - - model = ChatGroq(model_name="mixtral-8x7b-32768") - Setup: Install ``langchain-groq`` and set environment variable ``GROQ_API_KEY``. @@ -143,12 +136,12 @@ class ChatGroq(BaseChatModel): from langchain_groq import ChatGroq - model = ChatGroq( + llm = ChatGroq( model="mixtral-8x7b-32768", temperature=0.0, max_retries=2, # other params... - ) + ) Invoke: .. code-block:: python @@ -158,7 +151,7 @@ class ChatGroq(BaseChatModel): sentence to French."), ("human", "I love programming."), ] - model.invoke(messages) + llm.invoke(messages) .. code-block:: python @@ -175,7 +168,7 @@ class ChatGroq(BaseChatModel): Stream: .. code-block:: python - for chunk in model.stream(messages): + for chunk in llm.stream(messages): print(chunk) .. code-block:: python @@ -192,7 +185,7 @@ class ChatGroq(BaseChatModel): .. code-block:: python - stream = model.stream(messages) + stream = llm.stream(messages) full = next(stream) for chunk in stream: full += chunk @@ -215,7 +208,7 @@ class ChatGroq(BaseChatModel): Async: .. 
code-block:: python - await model.ainvoke(messages) + await llm.ainvoke(messages) .. code-block:: python @@ -247,7 +240,7 @@ class GetPopulation(BaseModel): location: str = Field(..., description="The city and state, e.g. San Francisco, CA") - model_with_tools = model.bind_tools([GetWeather, GetPopulation]) + model_with_tools = llm.bind_tools([GetWeather, GetPopulation]) ai_msg = model_with_tools.invoke("What is the population of NY?") ai_msg.tool_calls @@ -274,7 +267,7 @@ class Joke(BaseModel): rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10") - structured_model = model.with_structured_output(Joke) + structured_model = llm.with_structured_output(Joke) structured_model.invoke("Tell me a joke about cats") .. code-block:: python @@ -287,7 +280,7 @@ class Joke(BaseModel): Response metadata .. code-block:: python - ai_msg = model.invoke(messages) + ai_msg = llm.invoke(messages) ai_msg.response_metadata .. code-block:: python diff --git a/libs/partners/mistralai/langchain_mistralai/chat_models.py b/libs/partners/mistralai/langchain_mistralai/chat_models.py index 493ba81c9dd59..9b3fb816bbd60 100644 --- a/libs/partners/mistralai/langchain_mistralai/chat_models.py +++ b/libs/partners/mistralai/langchain_mistralai/chat_models.py @@ -1,7 +1,9 @@ from __future__ import annotations +import hashlib import json import logging +import re import uuid from operator import itemgetter from typing import ( @@ -77,6 +79,9 @@ logger = logging.getLogger(__name__) +# Mistral enforces a specific pattern for tool call IDs +TOOL_CALL_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]{9}$") + def _create_retry_decorator( llm: ChatMistralAI, @@ -92,6 +97,39 @@ def _create_retry_decorator( ) +def _is_valid_mistral_tool_call_id(tool_call_id: str) -> bool: + """Check if tool call ID is nine character string consisting of a-z, A-Z, 0-9""" + return bool(TOOL_CALL_ID_PATTERN.match(tool_call_id)) + + +def _base62_encode(num: int) -> str: + """Encodes a number in base62 and ensures result is of a specified length.""" + base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" + if num == 0: + return base62[0] + arr = [] + base = len(base62) + while num: + num, rem = divmod(num, base) + arr.append(base62[rem]) + arr.reverse() + return "".join(arr) + + +def _convert_tool_call_id_to_mistral_compatible(tool_call_id: str) -> str: + """Convert a tool call ID to a Mistral-compatible format""" + if _is_valid_mistral_tool_call_id(tool_call_id): + return tool_call_id + else: + hash_bytes = hashlib.sha256(tool_call_id.encode()).digest() + hash_int = int.from_bytes(hash_bytes, byteorder="big") + base62_str = _base62_encode(hash_int) + if len(base62_str) >= 9: + return base62_str[:9] + else: + return base62_str.rjust(9, "0") + + def _convert_mistral_chat_message_to_message( _message: Dict, ) -> BaseMessage: @@ -246,7 +284,7 @@ def _format_tool_call_for_mistral(tool_call: ToolCall) -> dict: } } if _id := tool_call.get("id"): - result["id"] = _id + result["id"] = _convert_tool_call_id_to_mistral_compatible(_id) return result @@ -260,7 +298,7 @@ def _format_invalid_tool_call_for_mistral(invalid_tool_call: InvalidToolCall) -> } } if _id := invalid_tool_call.get("id"): - result["id"] = _id + result["id"] = _convert_tool_call_id_to_mistral_compatible(_id) return result diff --git a/libs/partners/mistralai/tests/unit_tests/test_chat_models.py b/libs/partners/mistralai/tests/unit_tests/test_chat_models.py index 011bb0fa4ff88..35b21af8a156e 100644 --- a/libs/partners/mistralai/tests/unit_tests/test_chat_models.py 
+++ b/libs/partners/mistralai/tests/unit_tests/test_chat_models.py @@ -21,6 +21,8 @@ ChatMistralAI, _convert_message_to_mistral_chat_message, _convert_mistral_chat_message_to_message, + _convert_tool_call_id_to_mistral_compatible, + _is_valid_mistral_tool_call_id, ) os.environ["MISTRAL_API_KEY"] = "foo" @@ -128,7 +130,7 @@ async def test_astream_with_callback() -> None: def test__convert_dict_to_message_tool_call() -> None: raw_tool_call = { - "id": "abc123", + "id": "ssAbar4Dr", "function": { "arguments": '{"name": "Sally", "hair_color": "green"}', "name": "GenerateUsername", @@ -143,7 +145,7 @@ def test__convert_dict_to_message_tool_call() -> None: ToolCall( name="GenerateUsername", args={"name": "Sally", "hair_color": "green"}, - id="abc123", + id="ssAbar4Dr", type="tool_call", ) ], @@ -154,14 +156,14 @@ def test__convert_dict_to_message_tool_call() -> None: # Test malformed tool call raw_tool_calls = [ { - "id": "def456", + "id": "pL5rEGzxe", "function": { "arguments": '{"name": "Sally", "hair_color": "green"}', "name": "GenerateUsername", }, }, { - "id": "abc123", + "id": "ssAbar4Dr", "function": { "arguments": "oops", "name": "GenerateUsername", @@ -178,7 +180,7 @@ def test__convert_dict_to_message_tool_call() -> None: name="GenerateUsername", args="oops", error="Function GenerateUsername arguments:\n\noops\n\nare not valid JSON. Received JSONDecodeError Expecting value: line 1 column 1 (char 0)", # noqa: E501 - id="abc123", + id="ssAbar4Dr", type="invalid_tool_call", ), ], @@ -186,7 +188,7 @@ def test__convert_dict_to_message_tool_call() -> None: ToolCall( name="GenerateUsername", args={"name": "Sally", "hair_color": "green"}, - id="def456", + id="pL5rEGzxe", type="tool_call", ), ], @@ -201,3 +203,18 @@ def token_encoder(text: str) -> List[int]: llm = ChatMistralAI(custom_get_token_ids=token_encoder) assert llm.get_token_ids("foo") == [1, 2, 3] + + +def test_tool_id_conversion() -> None: + assert _is_valid_mistral_tool_call_id("ssAbar4Dr") + assert not _is_valid_mistral_tool_call_id("abc123") + assert not _is_valid_mistral_tool_call_id("call_JIIjI55tTipFFzpcP8re3BpM") + + result_map = { + "ssAbar4Dr": "ssAbar4Dr", + "abc123": "pL5rEGzxe", + "call_JIIjI55tTipFFzpcP8re3BpM": "8kxAQvoED", + } + for input_id, expected_output in result_map.items(): + assert _convert_tool_call_id_to_mistral_compatible(input_id) == expected_output + assert _is_valid_mistral_tool_call_id(expected_output) diff --git a/libs/partners/together/langchain_together/chat_models.py b/libs/partners/together/langchain_together/chat_models.py index a3c5d604a6217..ee3473eba4010 100644 --- a/libs/partners/together/langchain_together/chat_models.py +++ b/libs/partners/together/langchain_together/chat_models.py @@ -19,19 +19,255 @@ class ChatTogether(BaseChatOpenAI): - """ChatTogether chat model. + r"""ChatTogether chat model. - To use, you should have the environment variable `TOGETHER_API_KEY` - set with your API key or pass it as a named parameter to the constructor. + Setup: + Install ``langchain-together`` and set environment variable ``TOGETHER_API_KEY``. - Example: + .. code-block:: bash + + pip install -U langchain-together + export TOGETHER_API_KEY="your-api-key" + + + Key init args — completion params: + model: str + Name of model to use. + temperature: float + Sampling temperature. + max_tokens: Optional[int] + Max number of tokens to generate. + logprobs: Optional[bool] + Whether to return logprobs. 
+
+ Key init args — client params:
+ timeout: Union[float, Tuple[float, float], Any, None]
+ Timeout for requests.
+ max_retries: int
+ Max number of retries.
+ api_key: Optional[str]
+ Together API key. If not passed in, will be read from env var TOGETHER_API_KEY.
+
+ Instantiate:
+ .. code-block:: python
+
+ from langchain_together import ChatTogether
+
+ llm = ChatTogether(
+ model="meta-llama/Llama-3-70b-chat-hf",
+ temperature=0,
+ max_tokens=None,
+ timeout=None,
+ max_retries=2,
+ # api_key="...",
+ # other params...
+ )
+
+ Invoke:
+ .. code-block:: python
+
+ messages = [
+ (
+ "system",
+ "You are a helpful translator. Translate the user sentence to French.",
+ ),
+ ("human", "I love programming."),
+ ]
+ llm.invoke(messages)
+
+ .. code-block:: python
+
+ AIMessage(
+ content="J'adore la programmation.",
+ response_metadata={
+ 'token_usage': {'completion_tokens': 9, 'prompt_tokens': 32, 'total_tokens': 41},
+ 'model_name': 'meta-llama/Llama-3-70b-chat-hf',
+ 'system_fingerprint': None,
+ 'finish_reason': 'stop',
+ 'logprobs': None
+ },
+ id='run-168dceca-3b8b-4283-94e3-4c739dbc1525-0',
+ usage_metadata={'input_tokens': 32, 'output_tokens': 9, 'total_tokens': 41})
+
+ Stream:
+ .. code-block:: python
+
+ for chunk in llm.stream(messages):
+ print(chunk)
+
+ .. code-block:: python
+
+ content='J' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content="'" id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content='ad' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content='ore' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content=' la' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content=' programm' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content='ation' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content='.' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+ content='' response_metadata={'finish_reason': 'stop', 'model_name': 'meta-llama/Llama-3-70b-chat-hf'} id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
+
+
+ Async:
+ .. code-block:: python
+
+ await llm.ainvoke(messages)
+
+ # stream:
+ # async for chunk in (await llm.astream(messages))
+
+ # batch:
+ # await llm.abatch([messages])
+
+ .. code-block:: python
+
+ AIMessage(
+ content="J'adore la programmation.",
+ response_metadata={
+ 'token_usage': {'completion_tokens': 9, 'prompt_tokens': 32, 'total_tokens': 41},
+ 'model_name': 'meta-llama/Llama-3-70b-chat-hf',
+ 'system_fingerprint': None,
+ 'finish_reason': 'stop',
+ 'logprobs': None
+ },
+ id='run-09371a11-7f72-4c53-8e7c-9de5c238b34c-0',
+ usage_metadata={'input_tokens': 32, 'output_tokens': 9, 'total_tokens': 41})
+
+ Tool calling:
+ .. code-block:: python
+
+ from langchain_core.pydantic_v1 import BaseModel, Field
+
+ # Only certain models support tool calling; check the Together website to confirm compatibility
+ llm = ChatTogether(model="mistralai/Mixtral-8x7B-Instruct-v0.1")
+
+ class GetWeather(BaseModel):
+ '''Get the current weather in a given location'''
+
+ location: str = Field(
+ ..., description="The city and state, e.g. San Francisco, CA"
+ )
+
+ class GetPopulation(BaseModel):
+ '''Get the current population in a given location'''
+
+ location: str = Field(
+ ..., description="The city and state, e.g. San Francisco, CA"
+ )
+
+ llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
+ ai_msg = llm_with_tools.invoke(
+ "Which city is bigger: LA or NY?"
+ )
+ ai_msg.tool_calls
+
+
+ .. 
code-block:: python + + [ + { + 'name': 'GetPopulation', + 'args': {'location': 'NY'}, + 'id': 'call_m5tstyn2004pre9bfuxvom8x', + 'type': 'tool_call' + }, + { + 'name': 'GetPopulation', + 'args': {'location': 'LA'}, + 'id': 'call_0vjgq455gq1av5sp9eb1pw6a', + 'type': 'tool_call' + } + ] + + Structured output: + .. code-block:: python + + from typing import Optional + + from langchain_core.pydantic_v1 import BaseModel, Field + + + class Joke(BaseModel): + '''Joke to tell user.''' + + setup: str = Field(description="The setup of the joke") + punchline: str = Field(description="The punchline to the joke") + rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10") + + + structured_llm = llm.with_structured_output(Joke) + structured_llm.invoke("Tell me a joke about cats") + + .. code-block:: python + + Joke( + setup='Why was the cat sitting on the computer?', + punchline='To keep an eye on the mouse!', + rating=7 + ) + + JSON mode: + .. code-block:: python + + json_llm = llm.bind(response_format={"type": "json_object"}) + ai_msg = json_llm.invoke( + "Return a JSON object with key 'random_ints' and a value of 10 random ints in [0-99]" + ) + ai_msg.content + + .. code-block:: python + + ' {\\n"random_ints": [\\n13,\\n54,\\n78,\\n45,\\n67,\\n90,\\n11,\\n29,\\n84,\\n33\\n]\\n}' + + Token usage: + .. code-block:: python + + ai_msg = llm.invoke(messages) + ai_msg.usage_metadata + + .. code-block:: python + + {'input_tokens': 37, 'output_tokens': 6, 'total_tokens': 43} + + Logprobs: .. code-block:: python - from langchain_together import ChatTogether + logprobs_llm = llm.bind(logprobs=True) + messages=[("human","Say Hello World! Do not return anything else.")] + ai_msg = logprobs_llm.invoke(messages) + ai_msg.response_metadata["logprobs"] + + .. code-block:: python + + { + 'content': None, + 'token_ids': [22557, 3304, 28808, 2], + 'tokens': [' Hello', ' World', '!', ''], + 'token_logprobs': [-4.7683716e-06, -5.9604645e-07, 0, -0.057373047] + } + + + Response metadata + .. code-block:: python + + ai_msg = llm.invoke(messages) + ai_msg.response_metadata + + .. code-block:: python + { + 'token_usage': { + 'completion_tokens': 4, + 'prompt_tokens': 19, + 'total_tokens': 23 + }, + 'model_name': 'mistralai/Mixtral-8x7B-Instruct-v0.1', + 'system_fingerprint': None, + 'finish_reason': 'eos', + 'logprobs': None + } - model = ChatTogether() - """ + """ # noqa: E501 @property def lc_secrets(self) -> Dict[str, str]:
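
The `langchain_mistralai` change above normalizes arbitrary tool-call IDs into the nine-character base62 form Mistral enforces. The sketch below mirrors that logic as a standalone illustration; the names `base62_encode` and `to_mistral_tool_call_id` are illustrative stand-ins for the private helpers added in `chat_models.py`, not public API.

```python
import hashlib
import re

# Mistral requires tool call IDs to be exactly nine base62 characters.
TOOL_CALL_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]{9}$")
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(num: int) -> str:
    """Encode a non-negative integer using the base62 alphabet above."""
    if num == 0:
        return BASE62[0]
    digits = []
    while num:
        num, rem = divmod(num, 62)
        digits.append(BASE62[rem])
    return "".join(reversed(digits))


def to_mistral_tool_call_id(tool_call_id: str) -> str:
    """Pass valid IDs through; otherwise hash the ID into nine base62 characters."""
    if TOOL_CALL_ID_PATTERN.match(tool_call_id):
        return tool_call_id
    hash_int = int.from_bytes(hashlib.sha256(tool_call_id.encode()).digest(), "big")
    encoded = base62_encode(hash_int)
    # A SHA-256 digest is large enough that nine characters are always available,
    # but pad defensively to preserve the nine-character invariant.
    return encoded[:9] if len(encoded) >= 9 else encoded.rjust(9, "0")


if __name__ == "__main__":
    for raw_id in ("ssAbar4Dr", "abc123", "call_JIIjI55tTipFFzpcP8re3BpM"):
        converted = to_mistral_tool_call_id(raw_id)
        print(raw_id, "->", converted)
        assert TOOL_CALL_ID_PATTERN.match(converted)
```

Per the unit tests in the diff, an already-valid ID such as `"ssAbar4Dr"` passes through unchanged, while `"abc123"` maps to `"pL5rEGzxe"` and `"call_JIIjI55tTipFFzpcP8re3BpM"` to `"8kxAQvoED"`.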