
ChatNVIDIA request timeout #121

joshua-pappalardo opened this issue Dec 5, 2024 · 1 comment

joshua-pappalardo commented Dec 5, 2024

I am encountering an issue where ChatNVIDIA throws a 504 if the LLM has not responded within 30 seconds. It would be nice to have a parameter to influence how long to wait before timing out. I have searched for a way to set a global timeout on requests.Session, but it appears the only way to override the timeout is to pass it directly to the requests.Session.post method.

Here is the tail end of the relevant logs:

```
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 286, in invoke
gsac-api-1 |   self.generate_prompt(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 786, in generate_prompt
gsac-api-1 |   return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 643, in generate
gsac-api-1 |   raise e
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 633, in generate
gsac-api-1 |   self._generate_with_cache(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 851, in _generate_with_cache
gsac-api-1 |   result = self._generate(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/chat_models.py", line 382, in _generate
gsac-api-1 |   response = self._client.get_req(payload=payload, extra_headers=extra_headers)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 473, in get_req
gsac-api-1 |   response, session = self._post(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 369, in _post
gsac-api-1 |   self._try_raise(response)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 462, in _try_raise
gsac-api-1 |   raise Exception(f"{header}\n{body}") from None
gsac-api-1 | Exception: [504] Gateway Time-out
gsac-api-1 | {'_content': b"504 Gateway Time-out\nThe server didn't respond in time.\n\n", '_content_consumed': True, '_next': None, 'status_code': 504, 'headers': {'content-length': '92', 'cache-control': 'no-cache', 'content-type': 'text/html'}, 'raw': <urllib3.response.HTTPResponse object at 0xffff28f09240>, 'url': 'https://llm-atcpoc0026165.apps.ocp-glue01.pg.wwtatc.ai/v1/chat/completions', 'encoding': 'ISO-8859-1', 'history': [], 'reason': 'Gateway Time-out', 'cookies': <RequestsCookieJar[]>, 'elapsed': datetime.timedelta(seconds=30, microseconds=290451), 'request': <PreparedRequest [POST]>, 'connection': <requests.adapters.HTTPAdapter object at 0xffff28f0b280>}
```
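For reference, the "global timeout for requests.Session" mentioned above can be approximated by subclassing `requests.Session` so every call gets a default `timeout` unless one is passed explicitly. This is a hypothetical workaround sketch, not part of the library's API; also note the 504 in the trace is produced *by* the gateway after 30 s, so a larger client-side timeout alone will not make the gateway wait longer:

```python
# Hypothetical workaround: a Session subclass that injects a default
# timeout into every request unless the caller provides one.
import requests


class TimeoutSession(requests.Session):
    """requests.Session with a session-wide default timeout."""

    def __init__(self, timeout: float = 120.0):
        super().__init__()
        self._default_timeout = timeout

    def request(self, method, url, **kwargs):
        # Only apply the default when the caller did not set a timeout.
        kwargs.setdefault("timeout", self._default_timeout)
        return super().request(method, url, **kwargs)


# Usage sketch (hypothetical): pass this session wherever the client
# lets you supply one, e.g.
# session = TimeoutSession(timeout=120)
# session.post("https://.../v1/chat/completions", json=payload)
```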

mattf (Collaborator) commented Dec 6, 2024

@joshua-pappalardo do you have a gateway between your client and the LLM? Can you adjust that gateway's timeout to give the LLM more time to respond?
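The endpoint URL in the trace (`*.apps.ocp-glue01.pg.wwtatc.ai`) suggests the gateway may be an OpenShift router, whose HAProxy-backed routes default to a 30 s server-side timeout, matching the `elapsed` value in the log. If that is the gateway in question, the timeout can be raised per route via an annotation; the route name and the 120 s value below are assumptions for illustration:

```yaml
# Hypothetical: raise the server-side timeout on the OpenShift route
# fronting the LLM (route name and value are assumptions).
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: llm-atcpoc0026165
  annotations:
    haproxy.router.openshift.io/timeout: 120s
```

Equivalently, `oc annotate route llm-atcpoc0026165 haproxy.router.openshift.io/timeout=120s` applies the same change in place.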
