This repository has been archived by the owner on Nov 13, 2024. It is now read-only.

[Bug] Context window is too large #323

Open
coreation opened this issue Mar 15, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@coreation
Contributor

Is this a new bug?

  • I believe this is a new bug
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I provide a window of context to the canopy chat, but it appears that canopy doesn't reduce that window sufficiently.
I'm using chat_engine.chat(messages=messages, stream=False, namespace="namespace"); are we expected to truncate the messages ourselves to a certain degree?

The output of the error:

INFO:     127.0.0.1:36750 - "POST /chat HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/path/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/path/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/fastapi/applications.py", line 292, in __call__
    await super().__call__(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/path/python3.9/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/path/python3.9/site-packages/starlette/middleware/cors.py", line 83, in __call__
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/path/python3.9/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/path/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
    raise e
  File "/path/python3.9/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/path/python3.9/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/path/python3.9/site-packages/fastapi/routing.py", line 273, in app
    raw_response = await run_endpoint_function(
  File "/path/python3.9/site-packages/fastapi/routing.py", line 190, in run_endpoint_function
    return await dependant.call(**values)
  File "/var/www/qanda/main.py", line 39, in process_message
    result = CustomCanopyChat.chat(message_history)
  File "/var/www/qanda/src/domains/chat/actions/CustomCanopyChat.py", line 36, in chat
    response = chat_engine.chat(messages=messages, stream=False, namespace="qanda-content")
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/chat_engine.py", line 201, in chat
    context = self._get_context(messages, namespace)
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/chat_engine.py", line 234, in _get_context
    queries = self._query_builder.generate(messages, self.max_prompt_tokens)
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/query_generator/function_calling.py", line 37, in generate
    messages = self._history_pruner.build(system_prompt=self._system_prompt,
  File "/usr/local/lib/python3.9/dist-packages/canopy/chat_engine/history_pruner/raising.py", line 19, in build
    raise ValueError(f"The history require {token_count} tokens, "
ValueError: The history require 5129 tokens, which exceeds the calculated limit for history of 4054 tokens left for history out of 4054 tokens allowed in context window.

Expected Behavior

I expected the knowledge base or context builder to reduce the history of messages to an acceptable window.

Steps To Reproduce

  1. Set up canopy in any kind of configuration
  2. Add a ton of messages to the history
  3. Run the chat function on the ChatEngine class (a minimal reproduction sketch follows below)
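
A minimal reproduction sketch of these steps, assuming the default setup shown in the comments below; the index name and message contents are placeholders, and the import paths follow canopy 0.6, so verify them against your install:

# Reproduction sketch (placeholders: index name, message contents).
from canopy.tokenizer import Tokenizer
from canopy.knowledge_base import KnowledgeBase
from canopy.context_engine import ContextEngine
from canopy.chat_engine import ChatEngine
from canopy.llm import OpenAILLM
from canopy.models.data_models import UserMessage

Tokenizer.initialize()

kb = KnowledgeBase(index_name="qanda-content")  # placeholder index name
kb.connect()
context_engine = ContextEngine(kb)
chat_engine = ChatEngine(context_engine=context_engine, llm=OpenAILLM())  # all defaults

# Build a history that is clearly larger than the default ~4k-token budget.
messages = [UserMessage(content="filler question " * 200) for _ in range(30)]

# With enough history this raises:
# ValueError: The history require ... tokens, which exceeds the calculated limit ...
response = chat_engine.chat(messages=messages, stream=False, namespace="qanda-content")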

Relevant log output

(Same traceback as shown under "Current Behavior" above.)

Environment

- **OS**: OSX 14.1.1
- **Language version**: Python 3.9.2
- **Canopy version**: 0.6.0

Additional Context

No response

coreation added the bug label on Mar 15, 2024
@igiloh-pinecone
Contributor

When you instantiate ChatEngine, there's a max_prompt_tokens parameter which specifies the "context window".
Did you change that value to match the LLM you've selected?
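
For reference, a minimal sketch of that, assuming an existing context_engine; the 8192 value is only an example and should be sized to the selected model's actual window:

# Sketch: raise the prompt-token budget to match the selected model's context window.
llm = OpenAILLM()
chat_engine = ChatEngine(
    context_engine=context_engine,
    llm=llm,
    max_prompt_tokens=8192,  # example value, not a recommendation
)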

@coreation
Contributor Author

Hi @igiloh-pinecone I think I'm running everything on defaults:

llm = OpenAILLM() # Defaults to GPT3.5
chat_engine = ChatEngine(context_engine=context_engine, llm=llm)

I see GPT-3.5-turbo now has a context window of 16k tokens; I think in the past it was 4k, but I could be wrong. Maybe that's what's causing the issue? I can enter values for the generated tokens, context tokens, and max prompt tokens, but I'd expect canopy to manage that on its own when using the defaults?
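
For completeness, a sketch of passing those budgets explicitly. Only max_prompt_tokens is confirmed by the comment above; the other parameter name is an assumption, so check the ChatEngine signature in the installed canopy version:

# Sketch only: max_generated_tokens is an assumed parameter name;
# verify it against ChatEngine's signature in your canopy version.
chat_engine = ChatEngine(
    context_engine=context_engine,
    llm=OpenAILLM(),
    max_prompt_tokens=8192,     # budget for system prompt + context + history
    max_generated_tokens=512,   # assumed name for the generation budget
)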

@coreation
Contributor Author

coreation commented Mar 26, 2024

Hey @igiloh-pinecone, any update on this by any chance? I keep running into the issue from time to time. The default value for ChatEngine is already 4096, so I don't think passing the same value as the default would make much difference?

@karthik-2512

karthik-2512 commented Jul 18, 2024

Hey @igiloh-pinecone I'm having the same issue of exceeding max tokens:
ValueError: The history require 4588 tokens, which exceeds the calculated limit for history of 4096 tokens left for history out of 4096 tokens allowed in context window.

I searched through the issues and came across this open one. I went through the source code, and I believe the issue is with RaisingHistoryPruner.build(...).

Within the ChatEngine._get_context function, this line is executed:
queries = self._query_builder.generate(messages, self.max_prompt_tokens)
The query generator uses RaisingHistoryPruner, and in its build function (unlike in RecentHistoryPruner) no truncation of the history happens:

    def build(self,
              chat_history: Messages,
              max_tokens: int,
              system_prompt: Optional[str] = None,
              context: Optional[Context] = None, ) -> Messages:
        max_tokens = self._max_tokens_history(max_tokens,
                                              system_prompt,
                                              context)
        token_count = self._tokenizer.messages_token_count(chat_history)
        if token_count > max_tokens:
            raise ValueError(f"The history require {token_count} tokens, "
                             f"which exceeds the calculated limit for history "
                             f"of {max_tokens} tokens left for"
                             f" history out of {max_tokens} tokens"
                             f" allowed in context window.")
        return chat_history

I would really appreciate it if this issue could be fixed soon! Right now, I'm manually truncating my chat history to prevent this from happening. Thanks!
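
A minimal sketch of that manual workaround, assuming canopy's Tokenizer has already been initialized; the 4054-token budget is taken from the error above and is an assumption that should match your own configuration:

from canopy.tokenizer import Tokenizer

def truncate_history(messages, max_tokens=4054):
    # Keep only the most recent messages that still fit in the token budget.
    # The default budget is the limit reported in the ValueError above.
    tokenizer = Tokenizer()
    kept = []
    for message in reversed(messages):
        if tokenizer.messages_token_count([message] + kept) > max_tokens:
            break
        kept.insert(0, message)
    return kept

# Usage: truncate before handing the history to the chat engine.
response = chat_engine.chat(
    messages=truncate_history(messages),
    stream=False,
    namespace="qanda-content",
)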
