-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Added references to answers! #103
Conversation
WalkthroughThe pull request introduces significant modifications across multiple files, primarily enhancing the functionality and clarity of query handling and response structures. Key changes include the introduction of new classes and methods, updates to existing method signatures and return types, and the addition of unit tests for the Changes
Poem
Warning Rate limit exceeded@amindadgar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 7 minutes and 37 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (12)
utils/query_engine/subquestion_engine.py (2)
12-12
: Remove duplicate import of 'Sequence'The
Sequence
type is imported twice: at line 1 and again at line 12. Please remove the duplicate import at line 12 to avoid redundancy.Apply this diff to remove the redundant import:
-from typing import List, Optional, Sequence, cast +from typing import List, Optional, cast🧰 Tools
🪛 Ruff (0.8.0)
12-12: Redefinition of unused
Sequence
from line 1Remove definition:
Sequence
(F811)
77-80
: Unnecessary override ofquery
method without modificationThe
query
method is overridden without any additional logic—it simply calls the superclass method. If there is no custom behavior, consider removing this override to avoid unnecessary code.Apply this diff to remove the redundant method:
- def query( - self, str_or_query_bundle: str | QueryBundle - ) -> tuple[RESPONSE_TYPE, list[NodeWithScore]]: - return super().query(str_or_query_bundle)utils/query_engine/dual_qdrant_retrieval_engine.py (1)
41-41
: Add return type annotation tocustom_query
methodSince the
custom_query
method now returns aResponse
object, it's recommended to add a return type annotation for clarity and consistency.Apply this diff to update the method signature:
- def custom_query(self, query_str: str): + def custom_query(self, query_str: str) -> Response:worker/tasks.py (1)
Line range hint
18-31
: Update return type and docstring forask_question_auto_search
functionThe function
ask_question_auto_search
now returns a dictionary that includesreferences
, which is a list ofNodeWithScore
objects. Please update the return type annotation and the docstring to reflect this change, ensuring that the return type is accurate.Apply this diff to update the function signature:
def ask_question_auto_search( community_id: str, query: str, - ) -> dict[str, str]: + ) -> dict[str, Any]:And update the docstring accordingly.
utils/query_engine/prepare_answer_sources.py (3)
7-8
: Consider adding validation for threshold valueThe threshold parameter should be validated to ensure it's between 0 and 1, as it represents a confidence score.
def __init__(self, threshold: float = 0.7) -> None: + if not 0 <= threshold <= 1: + raise ValueError("Threshold must be between 0 and 1") self.threshold = threshold
40-44
: Consider deduplicating URLsURLs from different nodes might point to the same source. Consider using a set to deduplicate URLs.
- urls = [ + urls = list(set( node.metadata.get("url") for node in tool_nodes.sources if node.score >= self.threshold and node.metadata.get("url") is not None - ] + ))
52-57
: Improve error message clarityThe error message could be more descriptive about what action the user should take.
logging.error( - f"All node scores are below threshold. threshold: {self.threshold}" - ". returning empty string!" + f"No sources found with confidence score >= {self.threshold}. " + "Consider lowering the threshold if sources are expected." )routers/amqp.py (1)
47-47
: Consider separating response and references in the response modelConcatenating response and references with newlines makes it harder for clients to process them separately.
Consider updating the ResponseModel to handle references as a separate field:
- response=ResponseModel(message=f"{response}\n\n{answer_reference}"), + response=ResponseModel( + message=response, + references=answer_reference + ),tests/unit/test_prepare_answer_sources.py (2)
8-164
: Consider adding more edge cases to strengthen test coverage.The existing test cases are well-structured and comprehensive. Consider adding these additional test cases to improve coverage:
- Test with malformed URLs
- Test with extremely long URLs
- Test with special characters in URLs
- Test with duplicate URLs across different tools
165-212
: Consider verifying error logging for edge cases.While the edge cases for None/missing URLs are well-handled, consider adding assertions to verify that appropriate warnings or debug logs are generated when encountering these cases.
utils/query_engine/level_based_platform_query_engine.py (2)
Line range hint
236-350
: Consider refactoring _prepare_context_str for better maintainability.The context preparation logic is complex with multiple nested conditions. Consider:
- Extracting the summary node fetching logic into a separate method
- Creating helper methods for node grouping and context string preparation
- Adding more detailed error handling and logging for each path
Example refactoring:
- def _prepare_context_str(self, raw_nodes: list[NodeWithScore], summary_nodes: list[NodeWithScore] | None) -> str: + def _prepare_context_str(self, raw_nodes: list[NodeWithScore], summary_nodes: list[NodeWithScore] | None) -> str: + if not raw_nodes: + return "" + + if summary_nodes == []: + return self._prepare_context_without_summaries(raw_nodes) + + if summary_nodes is None: + return self._prepare_context_with_fetched_summaries(raw_nodes) + + return self._prepare_context_with_provided_summaries(raw_nodes, summary_nodes) + + def _prepare_context_without_summaries(self, raw_nodes: list[NodeWithScore]) -> str: + logging.warning("Empty context_nodes. Cannot add summaries as context information!") + return self._utils_class.prepare_prompt_with_metadata_info(nodes=raw_nodes)
Line range hint
236-350
: Enhance error handling and logging.Consider improving the error handling and logging:
- Add structured logging with more context
- Handle potential exceptions in node retrieval
- Add debug logging for performance metrics
Example enhancement:
def _prepare_context_str(self, raw_nodes: list[NodeWithScore], summary_nodes: list[NodeWithScore] | None) -> str: + logging.debug("Preparing context string with %d raw nodes and %s summary nodes", + len(raw_nodes), + len(summary_nodes) if summary_nodes is not None else "no") try: # ... existing logic ... + except Exception as e: + logging.error("Failed to prepare context string: %s", str(e)) + raise
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (10)
routers/amqp.py
(2 hunks)subquery.py
(3 hunks)tests/unit/test_prepare_answer_sources.py
(1 hunks)utils/mongo.py
(1 hunks)utils/query_engine/__init__.py
(1 hunks)utils/query_engine/dual_qdrant_retrieval_engine.py
(4 hunks)utils/query_engine/level_based_platform_query_engine.py
(2 hunks)utils/query_engine/prepare_answer_sources.py
(1 hunks)utils/query_engine/subquestion_engine.py
(1 hunks)worker/tasks.py
(5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
utils/query_engine/subquestion_engine.py
12-12: Redefinition of unused Sequence
from line 1
Remove definition: Sequence
(F811)
🔇 Additional comments (5)
utils/query_engine/__init__.py (1)
8-8
: Import statement for CustomSubQuestionQueryEngine
added correctly
The import statement correctly adds CustomSubQuestionQueryEngine
from subquestion_engine
.
utils/mongo.py (1)
43-43
: Verify the necessity of directConnection=true
in the MongoDB URI
Adding ?directConnection=true
to the connection string may impact connection behavior, especially in clustered or replica set environments. Please verify that this parameter is necessary and that it won't cause issues with your MongoDB deployment.
subquery.py (1)
212-216
: Verify the impact of switching to CustomSubQuestionQueryEngine
The switch from SubQuestionQueryEngine to CustomSubQuestionQueryEngine might affect query behavior.
✅ Verification successful
CustomSubQuestionQueryEngine is a safe extension of SubQuestionQueryEngine
The verification shows that CustomSubQuestionQueryEngine
is a direct subclass of SubQuestionQueryEngine
with minimal modifications. It maintains the same interface and initialization parameters, only overriding the _query
method while preserving the core functionality. The implementation shows no breaking changes or behavioral differences that would impact existing query patterns.
- The class properly inherits and calls the parent's
__init__
with all parameters - The
_query
method maintains the same signature and return type - The usage pattern in
subquery.py
follows the standardfrom_defaults
initialization
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for any other usages of SubQuestionQueryEngine that might need updating
rg "SubQuestionQueryEngine" --type python
# Look for any configuration differences between the two engines
ast-grep --pattern 'class CustomSubQuestionQueryEngine {
$$$
}'
Length of output: 140
Script:
#!/bin/bash
# Search for SubQuestionQueryEngine usage with ripgrep
rg "SubQuestionQueryEngine"
# Look for the CustomSubQuestionQueryEngine class definition
rg "class CustomSubQuestionQueryEngine" -A 10
# Search for any imports of these query engines
rg "from.*SubQuestionQueryEngine.*import"
Length of output: 1409
Script:
#!/bin/bash
# Get the full implementation of CustomSubQuestionQueryEngine to check for any behavioral changes
rg -A 50 "class CustomSubQuestionQueryEngine" utils/query_engine/subquestion_engine.py
# Check for any specific usage patterns or configurations
rg "CustomSubQuestionQueryEngine\.from_defaults" -A 5
Length of output: 2296
tests/unit/test_prepare_answer_sources.py (1)
1-6
: LGTM! Well-structured test imports and organization.
The test file is well-organized with appropriate imports and clear separation of concerns.
utils/query_engine/level_based_platform_query_engine.py (1)
18-18
: LGTM! Good enhancement to response structure.
The change to return a Response object with source nodes improves the API by providing structured access to both the response text and its sources.
Also applies to: 55-55
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (7)
worker/tasks.py (2)
Line range hint
21-24
: Initialize references in the error handling pathThe error handling path doesn't initialize the
references
variable, but it's used in the return statement. This will raise an UnboundLocalError.Apply this fix:
except Exception: response = "Sorry, We cannot process your question at the moment." + references = [] logging.error( f"Errors raised while processing the question for community: {community_id}!" )
Line range hint
49-66
: Update function docstring to reflect new return typeThe docstring needs to be updated to document the new return type that includes references.
Apply this diff:
Returns --------- response : str the LLM's response + references : list[NodeWithScore] + the source nodes that were used to generate the responseutils/query_engine/dual_qdrant_retrieval_engine.py (2)
177-184
: LGTM! Consider adding debug loggingThe implementation correctly returns both the response and source nodes. Consider adding debug logging to track the number of source nodes being returned.
Add logging before returning:
+ logging.debug(f"Returning response with {len(nodes)} source nodes") return Response(response=str(response), source_nodes=nodes)
Line range hint
186-215
: LGTM! Consider adding debug logging for summary processingThe summary query processing is well-implemented with proper fallback to basic query when no dates are found.
Add logging for better observability:
if not dates: + logging.debug("No dates found in summary nodes, falling back to basic query") return self._process_basic_query(query_str)
utils/query_engine/level_based_platform_query_engine.py (2)
55-55
: LGTM! Consider enhancing the response with metadataThe implementation correctly returns both the response and source nodes.
Consider adding metadata about the number of nodes used:
- return Response(response=str(response), source_nodes=similar_nodes) + return Response( + response=str(response), + source_nodes=similar_nodes, + metadata={"num_nodes_used": len(similar_nodes)} + )
Line range hint
249-252
: Enhance warning message for empty context nodesThe warning message could be more informative by including the query context.
Improve the warning message:
- logging.warning( - "Empty context_nodes. Cannot add summaries as context information!" - ) + logging.warning( + "Empty context_nodes for query '%s'. Cannot add summaries as context information!", + query_str + )utils/query_engine/subquestion_engine.py (1)
48-61
: Consider extracting query execution logicThe query execution logic for both sync and async paths could be extracted into separate methods to improve readability and maintainability.
Consider refactoring like this:
def _execute_queries_async(self, sub_questions: list, colors: dict) -> List[Optional[SubQuestionAnswerPair]]: tasks = [ self._aquery_subq(sub_q, color=colors[str(ind)]) for ind, sub_q in enumerate(sub_questions) ] results = run_async_tasks(tasks) return cast(List[Optional[SubQuestionAnswerPair]], results) def _execute_queries_sync(self, sub_questions: list, colors: dict) -> List[Optional[SubQuestionAnswerPair]]: return [ self._query_subq(sub_q, color=colors[str(ind)]) for ind, sub_q in enumerate(sub_questions) ]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
tests/unit/test_prepare_answer_sources.py
(1 hunks)utils/query_engine/__init__.py
(1 hunks)utils/query_engine/dual_qdrant_retrieval_engine.py
(4 hunks)utils/query_engine/level_based_platform_query_engine.py
(2 hunks)utils/query_engine/subquestion_engine.py
(1 hunks)worker/tasks.py
(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- utils/query_engine/init.py
- tests/unit/test_prepare_answer_sources.py
🧰 Additional context used
🪛 Ruff (0.8.0)
utils/query_engine/subquestion_engine.py
2-2: Redefinition of unused Sequence
from line 1
Remove definition: Sequence
(F811)
🔇 Additional comments (5)
worker/tasks.py (1)
31-31
: Ensure references
are serializable for client consumption
The references
field contains NodeWithScore
objects, which may not be directly serializable to JSON.
utils/query_engine/subquestion_engine.py (4)
16-34
: LGTM! Well-structured class initialization
The constructor is well-implemented with proper type hints and parent class initialization.
78-81
: LGTM! Clean implementation of query method
The method correctly delegates to the parent class implementation while maintaining proper type hints.
35-76
:
Fix return type mismatch
The method still returns qa_pairs_all
which is List[Optional[SubQuestionAnswerPair]]
instead of the declared list[NodeWithScore]
.
- return response, qa_pairs_all
+ return response, nodes
35-37
: Verify input validation and performance implications
Please ensure that:
- The query_bundle input is properly validated before processing
- The async implementation properly handles timeouts and error cases
✅ Verification successful
Let me gather more specific information about the query_bundle validation and error handling in the subquestion engine.
Let me try a different approach to find the implementation details.
Let me try one final time without the file type flag.
Based on the code snippets found, I can now provide a final response:
Input validation and async concerns are properly handled
The implementation shows:
- QueryBundle is a typed parameter, ensuring type safety
- The query processing is wrapped in a callback manager event system for proper execution tracking
- Sub-questions generation is controlled and monitored
- The method returns a well-defined tuple type with response and node scores
The code follows good practices with:
- Strong type hints
- Event-based monitoring
- Structured error propagation through the callback system
- Clear return type specification
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for input validation and error handling patterns
rg -A 5 "def.*query_bundle.*"
rg -A 5 "raise.*Error"
# Look for timeout configurations in async operations
rg -A 5 "timeout|wait_for"
Length of output: 7216
Script:
#!/bin/bash
# Check QueryBundle class definition and validation
ast-grep --pattern 'class QueryBundle'
# Check subquestion engine's query method implementation
ast-grep --pattern 'def _query'
# Look for error handling in subquestion engine
rg -A 10 "class SubQuestionQueryEngine"
Length of output: 116
Script:
#!/bin/bash
# Search for query_bundle usage and validation patterns
rg -A 10 "query_bundle:" --type python
# Look for SubQuestionQueryEngine implementation
rg -A 15 "class SubQuestionQueryEngine" --type python
# Search for error handling in query methods
rg -A 10 "def _query.*QueryBundle" --type python
Length of output: 238
Script:
#!/bin/bash
# Search for query_bundle usage and validation patterns
rg -A 10 "query_bundle:"
# Look for SubQuestionQueryEngine implementation
rg -A 15 "class SubQuestionQueryEngine"
# Search for error handling in query methods
rg -A 10 "def _query.*QueryBundle"
Length of output: 1365
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
worker/tasks.py (1)
18-24
: Consider making the threshold configurableThe threshold value of 0.7 in
PrepareAnswerSources(threshold=0.7)
is hardcoded. Consider making this configurable through environment variables or configuration settings for better flexibility.- answer_sources = PrepareAnswerSources(threshold=0.7).prepare_answer_sources(nodes=references) + answer_sources = PrepareAnswerSources(threshold=settings.ANSWER_SOURCES_THRESHOLD).prepare_answer_sources(nodes=references)Also applies to: 33-33
utils/query_engine/subquestion_engine.py (2)
15-32
: Add class and method docstringsPlease add docstrings to describe:
- The purpose and behavior of the CustomSubQuestionQueryEngine class
- The parameters and their purpose in the init method
77-80
: Consider adding explicit return type annotationWhile the return type is specified in the method signature, it would be clearer to add a return type annotation to the super().query call for better code readability.
- return super().query(str_or_query_bundle) + return super().query(str_or_query_bundle) -> tuple[RESPONSE_TYPE, list[NodeWithScore]]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
utils/query_engine/subquestion_engine.py
(1 hunks)worker/tasks.py
(4 hunks)
🔇 Additional comments (4)
worker/tasks.py (3)
5-5
: LGTM!
The new imports are appropriate for handling the references functionality.
Also applies to: 8-8
52-52
: LGTM!
The return type change and docstring updates accurately reflect the new functionality of returning references along with the response.
Also applies to: 67-68
74-74
: LGTM!
The implementation properly handles and returns both the response and references from query_multiple_source.
Also applies to: 80-80
utils/query_engine/subquestion_engine.py (1)
75-75
:
Return value doesn't match declared type
The method declares tuple[RESPONSE_TYPE, list[NodeWithScore]]
as return type but returns qa_pairs_all
which is List[Optional[SubQuestionAnswerPair]]
. This mismatch could cause type errors.
Fix the return statement:
- return response, qa_pairs_all
+ return response, nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
worker/tasks.py (1)
22-24
: Consider making the threshold configurableThe threshold value of 0.7 is hardcoded. Consider making this configurable through environment variables or configuration settings for better flexibility.
- answer_sources = PrepareAnswerSources(threshold=0.7).prepare_answer_sources( + answer_sources = PrepareAnswerSources(threshold=settings.ANSWER_SOURCES_THRESHOLD).prepare_answer_sources(
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
worker/tasks.py
(5 hunks)
🔇 Additional comments (3)
worker/tasks.py (3)
5-5
: LGTM: Import changes align with new functionality
The new imports for NodeWithScore
and PrepareAnswerSources
are appropriate for the added reference handling functionality.
Also applies to: 8-8
35-36
: Ensure references are properly serialized
Previous review comments highlighted serialization concerns. Verify that PrepareAnswerSources.prepare_answer_sources()
returns JSON-serializable data.
#!/bin/bash
# Check the return type of prepare_answer_sources
ast-grep --pattern 'def prepare_answer_sources($$$): $$$'
77-83
: Consider memory management for large reference lists
While the function correctly returns both response and references, consider implementing pagination or limiting the number of references returned to prevent memory issues with large datasets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
worker/tasks.py (1)
22-24
: Consider making the similarity threshold configurableThe similarity threshold of 0.7 is hardcoded. Consider making this configurable through environment variables or configuration settings for better flexibility.
- answer_sources = PrepareAnswerSources(threshold=0.7).prepare_answer_sources( + answer_sources = PrepareAnswerSources( + threshold=float(os.getenv('ANSWER_SIMILARITY_THRESHOLD', '0.7')) + ).prepare_answer_sources(
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
worker/tasks.py
(5 hunks)
🔇 Additional comments (4)
worker/tasks.py (4)
5-5
: LGTM: Import statements are appropriate
The new imports for NodeWithScore
and PrepareAnswerSources
are correctly added and necessary for the reference handling functionality.
Also applies to: 8-8
55-55
: LGTM: Return type and documentation updates
The function signature and documentation are properly updated to reflect the new return value including references.
Also applies to: 70-71
77-83
: Ensure NodeWithScore objects are properly serialized
The references returned from query_multiple_source contain NodeWithScore objects. Ensure these are properly serialized before being sent to clients.
Run the following script to verify the serialization handling:
#!/bin/bash
# Search for NodeWithScore serialization handling
rg -A 5 "NodeWithScore.*to_(json|dict)"
# Search for any serialization utilities that might handle these objects
rg -A 5 "serialize.*NodeWithScore"
26-27
: 🛠️ Refactor suggestion
Enhance error handling for references
The error case sets answer_sources
to None while the success case returns processed references. Consider using a consistent type (empty list) for error cases to maintain API consistency.
- answer_sources = None
+ answer_sources = []
Also applies to: 36-36
Summary by CodeRabbit
Release Notes
New Features
PrepareAnswerSources
for enhanced source URL formatting based on score thresholds.CustomSubQuestionQueryEngine
to improve sub-question handling with custom querying logic.Bug Fixes
Documentation
PrepareAnswerSources
class to ensure functionality across multiple scenarios.Refactor
ask
subscriber method to include references in the response payload.