mistral[minor]: Added Retrying Mechanism in case of Request Rate Limit Error for MistralAIEmbeddings
#27818
@@ -4,6 +4,7 @@
 from typing import Iterable, List
 
 import httpx
+from httpx import Response
 from langchain_core.embeddings import Embeddings
 from langchain_core.utils import (
     secret_from_env,
@@ -15,6 +16,7 @@
     SecretStr,
     model_validator,
 )
+from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed
 from tokenizers import Tokenizer  # type: ignore
 from typing_extensions import Self
 
@@ -209,16 +211,27 @@ def embed_documents(self, texts: List[str]) -> List[List[float]]:
             List of embeddings, one for each text.
         """
         try:
-            batch_responses = (
-                self.client.post(
+            batch_responses = []
+
+            @retry(
+                retry=retry_if_exception_type(Exception),
+                wait=wait_fixed(30),  # Wait 30 seconds between retries
+                stop=stop_after_attempt(5),  # Stop after 5 attempts
+            )
+            def _embed_batch(batch: List[str]) -> Response:
+                response = self.client.post(
                     url="/embeddings",
                     json=dict(
                         model=self.model,
                         input=batch,
                     ),
                 )
-                for batch in self._get_batches(texts)
-            )
+                if response.status_code == 429:
Review comment: any reason not to use
+                    raise Exception("Requests rate limit exceeded")
Review comment: This code takes an exception that was good and informative and turns it into a broad exception of type Exception. This is usually not a good pattern for exception handling: the stack trace is partially lost, the exception type is less specific, etc.
+
+                return response
+
+            for batch in self._get_batches(texts):
+                batch_responses.append(_embed_batch(batch))
             return [
                 list(map(float, embedding_obj["embedding"]))
                 for response in batch_responses
Review comment: Generally, retries should not be implemented for 4xx errors (except 408 and 429). For example, a 403 should not be retried by default.
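To make the two review points concrete, here is a minimal, standard-library-only sketch of a retry loop that (a) retries only on 408 and 429 and (b) re-raises the original exception rather than wrapping it in a bare `Exception`. It is not the PR's code and does not use tenacity or httpx: `HTTPStatusError`, `is_retryable`, and `call_with_retry` are hypothetical names invented for illustration, and the defaults of 5 attempts and a 30-second wait simply mirror the values in the diff.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Status codes worth retrying, per the review comment: 408 and 429 only.
RETRYABLE_STATUS = frozenset({408, 429})


class HTTPStatusError(Exception):
    """Hypothetical stand-in for an HTTP client's status-error type."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(exc: BaseException) -> bool:
    """Return True only for HTTP errors whose status is in RETRYABLE_STATUS."""
    return isinstance(exc, HTTPStatusError) and exc.status_code in RETRYABLE_STATUS


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    wait_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable errors for up to max_attempts total calls."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Re-raise the original exception unchanged so its type and
            # traceback survive; only retryable errors get another attempt.
            if not is_retryable(exc) or attempt == max_attempts:
                raise
            sleep(wait_seconds)
    raise AssertionError("unreachable")  # the loop always returns or raises
```

With tenacity, the same idea could be expressed by passing a predicate like `is_retryable` to `retry_if_exception` instead of using `retry_if_exception_type(Exception)`; with httpx, `response.raise_for_status()` raises `httpx.HTTPStatusError`, which keeps the response (and its status code) attached to the exception.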