[Bug]: Incorrect Output in ReAct Mode of LlamaIndex Chat Engine #17322

Open
whisper-bye opened this issue Dec 19, 2024 · 2 comments
Labels: bug (Something isn't working), triage (Issue needs to be triaged/prioritized)

Comments

whisper-bye commented Dec 19, 2024

Bug Description

When using the ReAct mode of the LlamaIndex chat engine, the output contains duplicated and extra characters that are not expected.

Version

0.12.5-0.12.7

Steps to Reproduce

from llama_index.core.llms import ChatMessage

# the last message is the one being sent; the earlier ones become chat history
message = messages[-1]["content"]
chat_history = [
    ChatMessage(role=m["role"], content=m["content"])
    for m in messages[:-1]
]

chat_engine = self.index.as_chat_engine()
streaming_response = await chat_engine.astream_chat(
    message=message, chat_history=chat_history
)

async for token in streaming_response.async_response_gen():
    print(token)
  1. Set up set_global_handler("simple").
  2. Send some questions.
  3. Then send the message 你好 ("Hello").
  4. Observe the expected log output:
    你好！请问你需要了解哪些信息呢？我可以帮助你查询相关信息。
    ("Hello! What information would you like to know? I can help you look up the relevant information.")
  5. Check the actual output in ReAct mode:
    你好！请问你好！请问你需要了解哪些信息呢？我可以帮助你查询相关信息。
    (note the duplicated "你好！请问" prefix)

All other chat modes work fine.
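
For reference, here is a roughly self-contained version of the snippet above (a sketch, with assumed setup: documents loaded from a local ./data directory and whatever LLM is configured globally):

import asyncio

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, set_global_handler
from llama_index.core.llms import ChatMessage

set_global_handler("simple")  # step 1: log the raw LLM inputs/outputs

async def main():
    # assumption: ./data contains a few documents to build the index from
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # "react" is the chat mode this issue is about
    chat_engine = index.as_chat_engine(chat_mode="react")

    # fabricated earlier turns standing in for "send some questions"
    chat_history = [
        ChatMessage(role="user", content="What documents do you have?"),
        ChatMessage(role="assistant", content="I have access to the indexed documents."),
    ]

    streaming_response = await chat_engine.astream_chat(
        message="你好", chat_history=chat_history
    )
    async for token in streaming_response.async_response_gen():
        print(token, end="", flush=True)

asyncio.run(main())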

Relevant Logs/Tracebacks


dosubot bot commented Dec 19, 2024

The issue you're experiencing with duplicated and extra characters in the ReAct mode of the LlamaIndex chat engine could be related to how the astream_chat method processes messages and generates responses. The astream_chat method uses an asynchronous generator to yield response tokens incrementally, which might lead to duplication if the message handling or memory update logic is not correctly implemented [1].

Here are a few things you might want to check or try:

  1. Message Handling: Ensure that the chat history and message handling logic correctly manage the state and do not inadvertently duplicate messages. The chat history should be accurately set and updated to prevent repeated processing of the same message.

  2. Response Generation: Verify that the response synthesizer and the asynchronous generator (async_response_gen) are correctly configured to handle streaming responses without duplicating tokens. The generator should yield each token only once [2].

  3. Version-Specific Changes: Since the issue occurs between versions 0.12.5 and 0.12.7, reviewing the changelogs or commits for these versions might provide insights into any changes that could affect message processing or response generation.

If these steps do not resolve the issue, further investigation into the specific implementation details of the ReAct mode in these versions might be necessary.
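
As a rough way to check point 2 above (a plain sketch against the public streaming API, not LlamaIndex internals), accumulate the streamed tokens and look for an immediately repeated prefix:

async def collect_and_check(streaming_response):
    # gather every streamed token in order
    tokens = []
    async for token in streaming_response.async_response_gen():
        tokens.append(token)
    text = "".join(tokens)
    # crude heuristic: a prefix repeated back-to-back (e.g. "你好！请问你好！请问…")
    # suggests some chunks were emitted twice
    for n in range(2, min(20, len(text) // 2) + 1):
        if text[:n] == text[n:2 * n]:
            print(f"possible duplicated prefix of length {n}: {text[:n]!r}")
    return text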


whisper-bye commented Dec 19, 2024

@logan-markewich

I think I roughly understand the issue. Since 你好 is a simple greeting, the LLM didn't follow the ReAct prompt template and instead directly output the final answer 你好！请问你需要了解哪些信息呢？我可以帮助你查询相关信息。

At https://github.com/run-llama/llama_index/blob/v0.12.6/llama-index-core/llama_index/core/agent/react/step.py#L750:

  1. In the OpenAI protocol-compatible streaming output, the content of the final chunk is empty.
  2. latest_chunk and missed_chunks_storage are modified in _infer_stream_chunk_is_final. At that point:

missed_chunks_storage is ['你好', '！', '请问']

latest_chunk is '你好！请问。。。' (i.e., it already contains the text of the missed chunks)

The missed chunks are then added back in front of latest_chunk, so the reassembled stream produces the strange output described in my issue (a short illustration follows the snippet below):

# add back the chunks that were missed
response_stream = self._async_add_back_chunk_to_stream(
    chunks=[*missed_chunks_storage, latest_chunk], chat_stream=chat_stream
)
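
To make the failure mode concrete, here is a plain-Python sketch (illustrative values taken from the observations above, not the actual LlamaIndex stream objects) of what the reassembled output ends up containing:

# values reported above: chunks consumed while deciding whether the output
# is a direct answer, plus the cumulative content of the latest chunk
missed_chunks = ["你好", "！", "请问"]
latest_chunk_text = "你好！请问"
rest_of_stream = "你需要了解哪些信息呢？我可以帮助你查询相关信息。"

# adding the missed chunks back in front of a chunk that already contains
# their text re-emits the prefix
output = "".join(missed_chunks) + latest_chunk_text + rest_of_stream
print(output)
# -> 你好！请问你好！请问你需要了解哪些信息呢？我可以帮助你查询相关信息。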
