-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(max): Conversational assistant #28154
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
This PR significantly restructures the assistant framework by replacing the Router and Summarizer nodes with Root and QueryExecutor nodes, creating a more conversational and capable AI assistant named Max.
- Introduced new
Root
node in/ee/hogai/root/nodes.py
that handles both conversation flow and insight type selection through tool calling - Added
QueryExecutor
node in/ee/hogai/query_executor/nodes.py
that processes query results without LLM summarization, returning control to Root for interpretation - Several test files still reference old architecture (e.g.
test_eval_router.py
using RouterMessage), requiring updates for consistency - Patch paths in
/ee/hogai/root/test/test_nodes.py
incorrectly reference old router module instead of new root module - Missing instrumentation for query planning/generation nodes as noted in PR description, potentially impacting debugging capabilities
26 file(s) reviewed, 17 comment(s)
Edit PR Review Bot Settings | Greptile
def add_query_executor(self, next_node: AssistantNodeName = AssistantNodeName.ROOT): | ||
builder = self._graph | ||
summarizer_node = SummarizerNode(self._team) | ||
builder.add_node(AssistantNodeName.SUMMARIZER, summarizer_node.run) | ||
builder.add_edge(AssistantNodeName.SUMMARIZER, next_node) | ||
query_executor_node = QueryExecutorNode(self._team) | ||
builder.add_node(AssistantNodeName.QUERY_EXECUTOR, query_executor_node.run) | ||
builder.add_edge(AssistantNodeName.QUERY_EXECUTOR, next_node) | ||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: add_query_executor() should validate that next_node is not END since the new architecture requires returning to ROOT for summarization
def add_query_executor(self, next_node: AssistantNodeName = AssistantNodeName.ROOT): | |
builder = self._graph | |
summarizer_node = SummarizerNode(self._team) | |
builder.add_node(AssistantNodeName.SUMMARIZER, summarizer_node.run) | |
builder.add_edge(AssistantNodeName.SUMMARIZER, next_node) | |
query_executor_node = QueryExecutorNode(self._team) | |
builder.add_node(AssistantNodeName.QUERY_EXECUTOR, query_executor_node.run) | |
builder.add_edge(AssistantNodeName.QUERY_EXECUTOR, next_node) | |
return self | |
def add_query_executor(self, next_node: AssistantNodeName = AssistantNodeName.ROOT): | |
if next_node == AssistantNodeName.END: | |
raise ValueError("Query executor must return to ROOT for summarization") | |
builder = self._graph | |
query_executor_node = QueryExecutorNode(self._team) | |
builder.add_node(AssistantNodeName.QUERY_EXECUTOR, query_executor_node.run) | |
builder.add_edge(AssistantNodeName.QUERY_EXECUTOR, next_node) | |
return self |
elif node_name in STREAMING_NODES and langchain_message.content: # Also checking content to ignore tools | ||
self._chunks += langchain_message # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Type ignore on line 268 should be replaced with proper type handling for AIMessageChunk concatenation
The current date and time is {{{utc_datetime_display}}} UTC, which is {{{project_datetime_display}}} in this project's timezone ({{{project_timezone}}}). | ||
It's expected that the data point for the current period can have a drop in value, as it's not complete yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider adding a more descriptive explanation of why data points might drop. For example, mention that data collection is still in progress for the current period.
Here's the query I came up with: | ||
```json | ||
{{{query}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The triple mustache syntax {{{query}}} will not escape HTML entities. Ensure that the query content is properly sanitized before insertion to prevent XSS vulnerabilities.
@@ -26,8 +26,8 @@ class TestSummarizerNode(ClickhouseTestMixin, APIBaseTest): | |||
|
|||
@patch("ee.hogai.summarizer.nodes.process_query_dict", side_effect=process_query_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The import path in the patch decorator still references 'summarizer.nodes' instead of 'query_executor.nodes'
@patch("ee.hogai.summarizer.nodes.process_query_dict", side_effect=process_query_dict) | |
@patch("ee.hogai.query_executor.nodes.process_query_dict", side_effect=process_query_dict) |
""" | ||
|
||
POST_QUERY_USER_PROMPT = """ | ||
Okay, so let's get back what I was a asking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: grammar error in prompt text
Okay, so let's get back what I was a asking. | |
Okay, so let's get back to what I was asking. |
You have access to data retrieval tools. When a question is data-related, proactively call a single tool to retrieve concrete results. | ||
If the user asked for a tweak to an earlier query, call that tool as well to apply necessary changes. | ||
When calling a tool, ALWAYS first tell the user you're doing so, very briefly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: no clear guidance on what constitutes a 'single tool' or how to handle cases where multiple data points are needed for a complete answer
""" | ||
|
||
|
||
ROOT_INSIGHT_DESCRIPTION_PROMPT = f""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: f-string is used but no string interpolation is present - this could cause issues if variables are added later
ROOT_INSIGHT_DESCRIPTION_PROMPT = f""" | |
ROOT_INSIGHT_DESCRIPTION_PROMPT = """ |
If information is missing or there is a potential data issue, retrieve a different new analysis instead of giving a subpar answer. | ||
Any time you're about to retrieve more data, SAY SO FIRST. And NEVER retrieve data more than 3 times in a row. | ||
Avoid generic advice. Take into account what you know about the product. Your answer needs to be super high-impact, no more than a few sentences. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: the '3 times in a row' limit needs clarification on timeframe and what constitutes a reset of this counter
@@ -56,7 +56,7 @@ class AssistantNodeName(StrEnum): | |||
MEMORY_ONBOARDING = "memory_onboarding" | |||
MEMORY_INITIALIZER = "memory_initializer" | |||
MEMORY_INITIALIZER_INTERRUPT = "memory_initializer_interrupt" | |||
ROUTER = "router" | |||
ROOT = "router" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Inconsistent naming - node is called ROOT but string value is still 'router'. This could cause confusion and should be updated to 'root' for consistency.
Size Change: +771 B (+0.07%) Total Size: 1.17 MB ℹ️ View Unchanged
|
Changes
This PR rearchitects the assistant into a more capable agent:
Router
node is replaced withRoot
.Root
is still capable of choosing a relevant insight type to create, via tool calling, but it's only part of its job. Most importantly,Root
is now what the user actually talks with at all times. Its prompt includes lots of personality nudges (lifted from @slshults's work on docs Max) as well as orientation towards data analysis.Summarizer
node is replaced withQueryExecutor
. LikeSummarizer
,QueryExecutor
takes the result of the precedingTrendsGenerator
/FunnelGenerator
/RetentionGenerator
step, but only adds the results to the message history instead of calling an LLM on top of them.Combined, the key architectural change is added recursion:
QueryExecutor
goes back toRoot
for either summarization of results or a decision to analyze deeper. This is currently rather simple, and we're still missing instrumentation on query planning/generation nodes to e.g. edit queries from the past – but this architecture should take us far, as we can easily fan out into multiple sub-questions from the root, to combine them when all are done.How did you test this code?
TODO