
mistral[minor]: Added Retrying Mechanism in case of Request Rate Limit Error for MistralAIEmbeddings #27818

Merged
merged 10 commits into langchain-ai:master on Dec 11, 2024

Conversation

keenborder786
Contributor

  • Description: In the event of a Rate Limit Error from the MistralAI server, parsing the response JSON raises a KeyError. To address this, a simple retry mechanism has been implemented to handle cases where the request limit is exceeded (a minimal sketch of the approach follows this list).
  • Issue: Cannot create MistralAI embeddings from pdf or urls #27790
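
A minimal sketch of the described approach (illustrative only, not the PR diff): retry the embeddings request when the server answers with HTTP 429. The RateLimitError class and embed_batch helper are hypothetical names, and tenacity and httpx are assumed as dependencies.

import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_fixed


class RateLimitError(Exception):
    """Raised when the embeddings endpoint returns HTTP 429."""


@retry(
    retry=retry_if_exception_type(RateLimitError),
    wait=wait_fixed(30),         # wait between attempts
    stop=stop_after_attempt(5),  # give up after a handful of attempts
)
def embed_batch(client: httpx.Client, model: str, batch: list[str]) -> dict:
    # client is assumed to be configured with the Mistral base URL and API key
    response = client.post("/embeddings", json={"model": model, "input": batch})
    if response.status_code == 429:
        raise RateLimitError("Requests rate limit exceeded")
    return response.json()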

@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Nov 1, 2024

@dosubot dosubot bot added Ɑ: embeddings Related to text embedding models module 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Nov 1, 2024
@keenborder786
Contributor Author

@eyurtsev

batch_responses = []

@retry(
retry=retry_if_exception_type(Exception),
Collaborator

Generally, retries should never be implemented on 4xx errors (except for 408 and 429); e.g., a 403 should not be retried by default.

  • What do we do in other parts of the code? Perhaps there's a better example that can be adopted?
  • What do other models do in the code base in terms of exposing the retry parameters so users can adjust? (e.g.,what if someone wants to have the first retry after 1 second rather than 30 seconds?)

Contributor Author

  • @eyurtsev it is a 429 error, therefore retrying makes sense.
  • I was not able to find any related example.
  • Yes, I can expose the wait and stop seconds as parameters (see the sketch after this list).
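
A sketch of what exposing those parameters could look like (the EmbeddingsClient wrapper and its field names are hypothetical, assuming tenacity's wait_exponential and stop_after_attempt):

import httpx
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential


class EmbeddingsClient:
    """Hypothetical wrapper; the parameter names are illustrative, not the PR's final API."""

    def __init__(self, max_retries: int = 5, retry_min_seconds: int = 1, retry_max_seconds: int = 30):
        self.max_retries = max_retries
        self.retry_min_seconds = retry_min_seconds
        self.retry_max_seconds = retry_max_seconds

    def with_retry(self, func):
        # Build the retry decorator from instance settings so a user can choose,
        # e.g., a 1-second first wait instead of 30 seconds.
        return retry(
            retry=retry_if_exception_type(httpx.TimeoutException),
            wait=wait_exponential(min=self.retry_min_seconds, max=self.retry_max_seconds),
            stop=stop_after_attempt(self.max_retries),
        )(func)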

    for batch in self._get_batches(texts)
)
if response.status_code == 429:
    raise Exception("Requests rate limit exceeded")
Collaborator

This code takes an exception that was good and informative and turns it into a broad Exception of type Exception -- this is usually not a good pattern for exception handling: the stack trace will be partially lost, the exception type is less specific, etc.

Contributor Author

  • There was no specific exception being raised to begin with, but I can change it from a general Exception.

Collaborator

@eyurtsev eyurtsev left a comment


Makes sense to add a retry mechanism. Added a few questions to see if we can improve how it's configured.

@keenborder786
Contributor Author

@eyurtsev please check now

@keenborder786
Contributor Author

@eyurtsev

@keenborder786
Contributor Author

@eyurtsev

@keenborder786
Contributor Author

@eyurtsev this is really important, please give feedback if needed

@keenborder786
Contributor Author

@eyurtsev

@keenborder786
Contributor Author

@eyurtsev

1 similar comment
@keenborder786
Contributor Author

@eyurtsev

url="/embeddings",
json=dict(
model=self.model,
input=batch,
),
)
for batch in self._get_batches(texts)
)
if response.status_code == 429:
Collaborator

Any reason not to use raise_for_status? We're trying not to drop the original exception, which might have useful information inside it.
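
For illustration, a minimal sketch of the raise_for_status alternative (embed_batch is a hypothetical helper): httpx.Response.raise_for_status raises an httpx.HTTPStatusError that keeps the request, response, and status code attached, so no information is dropped.

import httpx


def embed_batch(client: httpx.Client, model: str, batch: list[str]) -> dict:
    response = client.post("/embeddings", json={"model": model, "input": batch})
    response.raise_for_status()  # raises httpx.HTTPStatusError on any 4xx/5xx response
    return response.json()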

batch_responses = []

@retry(
retry=retry_if_exception_type(httpx.TimeoutException),
Collaborator

(No need to change if you don't want.) This is OK because it's probably the dominant failure mode.

But it's very common to retry 5xx errors as well, and 408 (request timeout).
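
An illustrative predicate along those lines (not part of this PR): retry timeouts plus 408, 429, and 5xx responses, using tenacity's retry_if_exception with a hypothetical embed_batch helper.

import httpx
from tenacity import retry, retry_if_exception, stop_after_attempt, wait_exponential


def _is_retryable(exc: BaseException) -> bool:
    # Retry network timeouts and the status codes that are conventionally safe to retry.
    if isinstance(exc, httpx.TimeoutException):
        return True
    if isinstance(exc, httpx.HTTPStatusError):
        code = exc.response.status_code
        return code in (408, 429) or 500 <= code < 600
    return False


@retry(
    retry=retry_if_exception(_is_retryable),
    wait=wait_exponential(min=1, max=30),
    stop=stop_after_attempt(5),
)
def embed_batch(client: httpx.Client, model: str, batch: list[str]) -> dict:
    response = client.post("/embeddings", json={"model": model, "input": batch})
    response.raise_for_status()
    return response.json()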

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Dec 11, 2024
@eyurtsev
Collaborator

@keenborder786 sorry for how long it took. Pushed a minor change to raise on error, so the original information isn't lost.

@eyurtsev eyurtsev changed the title Added Retrying Mechanism in case of Request Rate Limit Error for MistralAIEmbeddings mistral[minor]: Added Retrying Mechanism in case of Request Rate Limit Error for MistralAIEmbeddings Dec 11, 2024
@eyurtsev eyurtsev merged commit a37afbe into langchain-ai:master Dec 11, 2024
30 checks passed
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature
Ɑ: embeddings Related to text embedding models module
lgtm PR looks good. Use to confirm that a PR is ready for merging.
size:S This PR changes 10-29 lines, ignoring generated files.