
ChatNVIDIA request timeout #121

joshua-pappalardo opened this issue Dec 5, 2024 · 1 comment

joshua-pappalardo commented Dec 5, 2024

I am encountering an issue where ChatNVIDIA throws a 504 if the LLM has not responded within 30 seconds. It would be nice to have a parameter to influence how long to wait before timing out. I have searched for a way to set a global timeout on requests.Session, but it appears the only way to override the timeout is to pass it directly to the requests.Session.post method.

Here is the tail end of the relevant logs:

```
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 286, in invoke
gsac-api-1 |   self.generate_prompt(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 786, in generate_prompt
gsac-api-1 |   return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 643, in generate
gsac-api-1 |   raise e
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 633, in generate
gsac-api-1 |   self._generate_with_cache(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py", line 851, in _generate_with_cache
gsac-api-1 |   result = self._generate(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/chat_models.py", line 382, in _generate
gsac-api-1 |   response = self._client.get_req(payload=payload, extra_headers=extra_headers)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 473, in get_req
gsac-api-1 |   response, session = self._post(
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 369, in _post
gsac-api-1 |   self._try_raise(response)
gsac-api-1 | File "/usr/local/lib/python3.10/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 462, in _try_raise
gsac-api-1 |   raise Exception(f"{header}\n{body}") from None
gsac-api-1 | Exception: [504] Gateway Time-out
gsac-api-1 | {'_content': b"504 Gateway Time-out\nThe server didn't respond in time.\n\n", '_content_consumed': True, '_next': None, 'status_code': 504, 'headers': {'content-length': '92', 'cache-control': 'no-cache', 'content-type': 'text/html'}, 'raw': <urllib3.response.HTTPResponse object at 0xffff28f09240>, 'url': 'https://llm-atcpoc0026165.apps.ocp-glue01.pg.wwtatc.ai/v1/chat/completions', 'encoding': 'ISO-8859-1', 'history': [], 'reason': 'Gateway Time-out', 'cookies': <RequestsCookieJar[]>, 'elapsed': datetime.timedelta(seconds=30, microseconds=290451), 'request': <PreparedRequest [POST]>, 'connection': <requests.adapters.HTTPAdapter object at 0xffff28f0b280>}
```
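For reference, the "global timeout for requests.Session" mentioned above can be approximated by subclassing `requests.Session` so every call gets a default `timeout` unless one is passed explicitly. This is a hypothetical workaround sketch, not part of the library's API; also note the 504 in the trace is produced *by* the gateway after 30 s, so a larger client-side timeout alone will not make the gateway wait longer:

```python
# Hypothetical workaround: a Session subclass that injects a default
# timeout into every request unless the caller provides one.
import requests


class TimeoutSession(requests.Session):
    """requests.Session with a session-wide default timeout."""

    def __init__(self, timeout: float = 120.0):
        super().__init__()
        self._default_timeout = timeout

    def request(self, method, url, **kwargs):
        # Only apply the default when the caller did not set a timeout.
        kwargs.setdefault("timeout", self._default_timeout)
        return super().request(method, url, **kwargs)


# Usage sketch (hypothetical): pass this session wherever the client
# lets you supply one, e.g.
# session = TimeoutSession(timeout=120)
# session.post("https://.../v1/chat/completions", json=payload)
```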

mattf (Collaborator) commented Dec 6, 2024

@joshua-pappalardo do you have a gateway between your client and the LLM? Can you adjust that gateway's timeout to give the LLM more time to respond?
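The endpoint URL in the trace (`*.apps.ocp-glue01.pg.wwtatc.ai`) suggests the gateway may be an OpenShift router, whose HAProxy-backed routes default to a 30 s server-side timeout, matching the `elapsed` value in the log. If that is the gateway in question, the timeout can be raised per route via an annotation; the route name and the 120 s value below are assumptions for illustration:

```yaml
# Hypothetical: raise the server-side timeout on the OpenShift route
# fronting the LLM (route name and value are assumptions).
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: llm-atcpoc0026165
  annotations:
    haproxy.router.openshift.io/timeout: 120s
```

Equivalently, `oc annotate route llm-atcpoc0026165 haproxy.router.openshift.io/timeout=120s` applies the same change in place.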
