Add notebook example for Batch API #77

boqiny · 2025-01-02T05:59:18Z

User description

Description

Refactor the code in python script to Jupyter notebook for better vis
Fix minor type in ReadME
Remove unnecessary .py file.

Related Issue

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Code refactoring
Performance improvement

How Has This Been Tested?

Screenshots (if applicable)

Checklist

My code follows the project's style guidelines
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Additional Notes

PR Type

Documentation, Enhancement

Description

Added two new Jupyter notebooks for Batch API examples:
- parse_batch_fetch.ipynb for fetching responses.
- parse_batch_upload.ipynb for uploading folders.
Updated the JSONL file path in parse_batch_fetch.py.
Added a TODO comment for status check in response processing.
Improved readability in the README.md documentation.

Changes walkthrough 📝

Relevant files

Enhancement

parse_batch_fetch.py `Update JSONL file path and add TODO comment` examples/parse_batch_fetch.py Updated the sample JSONL file path. Added a TODO comment for status check in response processing.	+2/-2

Documentation

README.md `Improve readability in documentation list` README.md Added a blank line for better readability in the list.	+1/-0
parse_batch_fetch.ipynb `Add notebook for batch API fetch example` examples/parse_batch_fetch.ipynb Added a new notebook for batch API fetch example. Demonstrated concurrent response processing and markdown retrieval. Included sample outputs and logging configuration.	+206/-0
parse_batch_upload.ipynb `Add notebook for batch API upload example` examples/parse_batch_upload.ipynb Added a new notebook for batch API upload example. Demonstrated folder upload and response saving. Included status checking and sample outputs.	+205/-0

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

github-actions · 2025-01-02T06:00:13Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Missing Status Check A TODO comment was added to include a status check in the `process_response` function, but it has not been implemented. This could lead to incomplete or incorrect processing of responses. if markdown: # TODO: add status check here Error Handling In the notebook, the `process_response` function logs errors but does not handle them in a way that ensures the program can recover or retry. This could lead to incomplete processing of responses. "def process_response(response):\n", " \"\"\"Process a single response by retrieving markdown content\"\"\"\n", " request_id = response[\"requestId\"]\n", " try:\n", " markdown = ap.batches.retrieve(request_id)\n", " if markdown:\n", " response[\"result\"] = [markdown.result[0] if markdown.result else \"\"]\n", " response[\"requestStatus\"] = \"COMPLETED\"\n", " response[\"completionTime\"] = markdown.completionTime\n", " except Exception as e:\n", " logger.error(f\"Error processing {request_id}: {str(e)}\")\n", " response[\"error\"] = [str(e)]\n", " return response"

github-actions · 2025-01-02T06:00:34Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Score
General	Add error handling for exceptions in threads during concurrent processing Add error handling for the concurrent processing block to ensure exceptions in threads do not crash the entire program. examples/parse_batch_fetch.ipynb [115-124] with ThreadPoolExecutor(max_workers=MAX_WORKER) as executor: future_to_response = { executor.submit(process_response, response): response for response in responses } updated_responses = [] for future in as_completed(future_to_response): - updated_response = future.result() - updated_responses.append(updated_response) + try: + updated_response = future.result() + updated_responses.append(updated_response) + except Exception as e: + logger.error(f"Error in thread execution: {str(e)}") Suggestion importance[1-10]: 10 Why: The suggestion introduces error handling for exceptions in threads, ensuring that the entire program does not crash due to errors in individual threads. This significantly enhances the robustness and fault tolerance of the concurrent processing block.	10
Possible issue	Validate the existence of the `response_file` before attempting to open it Add a validation step to check if the `response_file` exists before attempting to open it to avoid runtime errors. examples/parse_batch_fetch.ipynb [73-75] response_file = "./sample_data_20250102103047.jsonl" +if not os.path.exists(response_file): + raise FileNotFoundError(f"Response file {response_file} does not exist.") with open(response_file, "r") as f: responses = [json.loads(line) for line in f] Suggestion importance[1-10]: 9 Why: Adding a validation step to check if the `response_file` exists before opening it prevents runtime errors and ensures the program handles missing files gracefully. This is a critical improvement for reliability.	9
Possible issue	Ensure the `markdown` object is properly validated before processing Add a proper status check for the `markdown` object to ensure it contains valid and complete data before processing it further. examples/parse_batch_fetch.py [39-42] -if markdown: # TODO: add status check here - response["result"] = [markdown.result[0] if markdown.result else ""] +if markdown and hasattr(markdown, 'result') and markdown.result: + response["result"] = [markdown.result[0]] response["requestStatus"] = "COMPLETED" response["completionTime"] = markdown.completionTime Suggestion importance[1-10]: 8 Why: The suggestion adds a proper validation check for the `markdown` object, ensuring it contains valid and complete data before processing. This improves the robustness of the code and prevents potential runtime errors.	8

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

examples/parse_batch_fetch.ipynb:90

[nitpick] The nested if statement could be simplified for better readability.

if markdown and markdown.result:

examples/parse_batch_fetch.ipynb:43

Document the purpose of the MAX_WORKER variable for better understanding.

MAX_WORKER = 10

lingjiekong

LGTM with comments

lingjiekong · 2025-01-02T16:11:15Z

examples/parse_batch_fetch.py

Shall we remove these .py file and only use ipynb?

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

lingjiekong

can we merge upload and fetch into a single notebook to ease the use

add notebook

a70dc0c

boqiny requested review from Sdddell, goldmermaid and lingjiekong as code owners January 2, 2025 05:59

github-actions bot added the Review effort [1-5]: 3 label Jan 2, 2025

fix fetch status check

0fd7102

lingjiekong requested a review from Copilot January 2, 2025 16:10

Copilot AI reviewed Jan 2, 2025

View reviewed changes

lingjiekong approved these changes Jan 2, 2025

View reviewed changes

lingjiekong requested a review from Copilot January 2, 2025 16:12

Copilot AI reviewed Jan 2, 2025

View reviewed changes

delete dup

dabc771

lingjiekong approved these changes Jan 2, 2025

View reviewed changes

lingjiekong requested changes Jan 2, 2025

View reviewed changes

combine two notebooks

79f38cd

lingjiekong approved these changes Jan 2, 2025

View reviewed changes

lingjiekong merged commit ca4ac5f into main Jan 2, 2025
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add notebook example for Batch API #77

Add notebook example for Batch API #77

boqiny commented Jan 2, 2025 •

edited by lingjiekong

Loading

github-actions bot commented Jan 2, 2025

github-actions bot commented Jan 2, 2025

lingjiekong left a comment

lingjiekong Jan 2, 2025

boqiny Jan 2, 2025

lingjiekong left a comment

Add notebook example for Batch API #77

Add notebook example for Batch API #77

Conversation

boqiny commented Jan 2, 2025 • edited by lingjiekong Loading

User description

Description

Related Issue

Type of Change

How Has This Been Tested?

Screenshots (if applicable)

Checklist

Additional Notes

PR Type

Description

Changes walkthrough 📝

github-actions bot commented Jan 2, 2025

PR Reviewer Guide 🔍

github-actions bot commented Jan 2, 2025

PR Code Suggestions ✨

Choose a reason for hiding this comment

lingjiekong left a comment

Choose a reason for hiding this comment

lingjiekong Jan 2, 2025

Choose a reason for hiding this comment

boqiny Jan 2, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lingjiekong left a comment

Choose a reason for hiding this comment

boqiny commented Jan 2, 2025 •

edited by lingjiekong

Loading