This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

[Bug] Context window is too large #323

Open
coreation opened this issue Mar 15, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@coreation
Contributor

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I provide a window of context to the canopy chat, but it appears that canopy doesn't reduce that window sufficiently.
I'm using chat_engine.chat(messages=messages, stream=False, namespace="namespace"); are we expected to truncate the messages ourselves to a certain degree?

The output of the error:

INFO:     127.0.0.1:36750 - "POST /chat HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/path/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/path/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/path/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/path/python3.9/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/path/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/path/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/path/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/path/python3.9/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/path/python3.9/site-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/var/www/qanda/main.py", line 39, in process_message
    result = CustomCanopyChat.chat(message_history)
  File "/var/www/qanda/src/domains/chat/actions/CustomCanopyChat.py", line 36, in chat
    response = chat_engine.chat(messages=messages, stream=False, namespace="qanda-content")
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/chat_engine.py", line 201, in chat
    context = self._get_context(messages, namespace)
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/chat_engine.py", line 234, in _get_context
    queries = self._query_builder.generate(messages, self.max_prompt_tokens)
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/query_generator/function_calling.py", line 37, in generate
    messages = self._history_pruner.build(system_prompt=self._system_prompt,
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/history_pruner/raising.py", line 19, in build
    raise ValueError(f"The history require {token_count} tokens, "
ValueError: The history require 5129 tokens, which exceeds the calculated limit for history of 4054 tokens left for history out of 4054 tokens allowed in context window.

Expected Behavior

I expected the knowledge base or context builder to reduce the history of messages to an acceptable window.

Steps To Reproduce

  1. Set up canopy in any kind of configuration
  2. Add a ton of messages to the history
  3. Run the chat function on the ChatEngine class (a minimal reproduction sketch follows below)
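
A minimal reproduction sketch of these steps, assuming the default setup shown in the comments below; the index name and message contents are placeholders, and the import paths follow canopy 0.6, so verify them against your install:

# Reproduction sketch (placeholders: index name, message contents).
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine
from canopy.llm import OpenAILLM
from canopy.models.data_models import UserMessage

Tokenizer.initialize()

kb = KnowledgeBase(index_name="qanda-content")  # placeholder index name
kb.connect()
context_engine = ContextEngine(kb)
chat_engine = ChatEngine(context_engine=context_engine, llm=OpenAILLM())  # all defaults

# Build a history that is clearly larger than the default ~4k-token budget.
messages = [UserMessage(content="filler question " * 200) for _ in range(30)]

# With enough history this raises:
# ValueError: The history require ... tokens, which exceeds the calculated limit ...
response = chat_engine.chat(messages=messages, stream=False, namespace="qanda-content")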

Relevant log output

(Same traceback as shown under "Current Behavior" above.)

Environment

- **OS**: OSX 14.1.1
- **Language version**: Python 3.9.2
- **Canopy version**: 0.6.0

Additional Context

No response

coreation added the bug label on Mar 15, 2024
@igiloh-pinecone
Contributor

When you instantiate ChatEngine, there's a max_prompt_tokens parameter which specifies the "context window".
Did you change that value to match the LLM you've selected?
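
For reference, a minimal sketch of that, assuming an existing context_engine; the 8192 value is only an example and should be sized to the selected model's actual window:

# Sketch: raise the prompt-token budget to match the selected model's context window.
llm = OpenAILLM()
chat_engine = ChatEngine(
    context_engine=context_engine,
    llm=llm,
    max_prompt_tokens=8192,  # example value, not a recommendation
)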

@coreation
Contributor Author

Hi @igiloh-pinecone I think I'm running everything on defaults:

llm = OpenAILLM() # Defaults to GPT3.5
chat_engine = ChatEngine(context_engine=context_engine, llm=llm)

I see GPT-3.5-turbo now has a context window of 16k tokens; I think in the past it was 4k, but I could be wrong. Maybe that's what's causing the issue? I can enter values for the generated tokens, context tokens, and max prompt tokens, but I'd expect canopy to manage that on its own when using the defaults?
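
For completeness, a sketch of passing those budgets explicitly. Only max_prompt_tokens is confirmed by the comment above; the other parameter name is an assumption, so check the ChatEngine signature in the installed canopy version:

# Sketch only: max_generated_tokens is an assumed parameter name;
# verify it against ChatEngine's signature in your canopy version.
chat_engine = ChatEngine(
    context_engine=context_engine,
    llm=OpenAILLM(),
    max_prompt_tokens=8192,     # budget for system prompt + context + history
    max_generated_tokens=512,   # assumed name for the generation budget
)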

@coreation
Contributor Author

coreation commented Mar 26, 2024

Hey @igiloh-pinecone, any update on this by any chance? I keep running into the issue from time to time. The default value for ChatEngine is already 4096, so I don't think passing the same value as the default would make much difference?

@karthik-2512

karthik-2512 commented Jul 18, 2024

Hey @igiloh-pinecone I'm having the same issue of exceeding max tokens:
ValueError: The history require 4588 tokens, which exceeds the calculated limit for history of 4096 tokens left for history out of 4096 tokens allowed in context window.

I searched through the issues and came across this open one. I went through the source code, and I believe the issue is with RaisingHistoryPruner.build(...).

Within the ChatEngine._get_context function, this line is executed:
queries = self._query_builder.generate(messages, self.max_prompt_tokens)
The query generator uses RaisingHistoryPruner, and in its build function (unlike in RecentHistoryPruner) no truncation of the history happens:

    def build(self,
              chat_history: Messages,
              max_tokens: int,
              system_prompt: Optional[str] = None,
              context: Optional[Context] = None, ) -> Messages:
        max_tokens = self._max_tokens_history(max_tokens,
                                              system_prompt,
                                              context)
        token_count = self._tokenizer.messages_token_count(chat_history)
        if token_count > max_tokens:
            raise ValueError(f"The history require {token_count} tokens, "
                             f"which exceeds the calculated limit for history "
                             f"of {max_tokens} tokens left for"
                             f" history out of {max_tokens} tokens"
                             f" allowed in context window.")
        return chat_history

I would really appreciate it if this issue could be fixed soon! Right now, I'm manually truncating my chat history to prevent this from happening. Thanks!
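
A minimal sketch of that manual workaround, assuming canopy's Tokenizer has already been initialized; the 4054-token budget is taken from the error above and is an assumption that should match your own configuration:

from canopy.tokenizer import Tokenizer

def truncate_history(messages, max_tokens=4054):
    # Keep only the most recent messages that still fit in the token budget.
    # The default budget is the limit reported in the ValueError above.
    tokenizer = Tokenizer()
    kept = []
    for message in reversed(messages):
        if tokenizer.messages_token_count([message] + kept) > max_tokens:
            break
        kept.insert(0, message)
    return kept

# Usage: truncate before handing the history to the chat engine.
response = chat_engine.chat(
    messages=truncate_history(messages),
    stream=False,
    namespace="qanda-content",
)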
