Ray integration + Parallelism blog post #483

elijahbenizzy · 2024-12-18T23:22:47Z

Supports blog post.

TODO:

Add ray install target
Add ray docs for integration
Add ray tests (OK afterwards)

Changes

How I tested this

Notes

Checklist

PR has an informative and human-readable title (this will be pulled into the release notes)
Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

Important

Adds Ray integration for parallel execution and custom serialization methods to LocalTrackingClient and SQLitePersister.

Integration:
- Adds RayExecutor class in ray.py for parallel task execution using Ray.
- Implements submit() to submit tasks and shutdown() to stop Ray.
Serialization:
- Adds __setstate__() and __getstate__() methods to LocalTrackingClient in client.py for custom serialization.
- Updates SQLitePersister in persistence.py with custom __setstate__() and __getstate__() methods.
Examples:
- Adds example code for Ray integration in examples/ray/application.py.
Documentation:
- Adds Ray integration documentation in ray.rst and updates index.rst.
- Updates SQLitePersister documentation in persister.rst.
Misc:
- Adds Ray to dependencies in pyproject.toml.
- Fixes typo in load_state() docstring in client.py.

^{This description was created by}^{for 3d7c3e2. It will automatically update as commits are pushed.}

ellipsis-dev

❌ Changes requested. Reviewed everything up to 9e89a40 in 43 seconds

More details

Looked at 75 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. burr/integrations/ray.py:6

Draft comment:
RayExecutor should implement the map method to fully comply with the concurrent.futures.Executor interface.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is technically correct - Executor does have a map method in its interface. However, map() is not strictly required as it has a default implementation in the base class that uses submit(). The base implementation would work fine here since submit() is properly implemented. The comment is suggesting an optimization rather than fixing a bug.
The base class implementation of map() might not be as performant as a Ray-specific implementation. There could be Ray-specific optimizations possible.
While a custom map() implementation could be more efficient, the current code is functionally correct and the base implementation will work. The comment is more of an optional optimization suggestion.
Delete this comment. While technically accurate, implementing map() is optional since the base class provides a working implementation. The comment doesn't point out a clear issue that needs to be fixed.

2. burr/tracking/client.py:366

Draft comment:
Ensure that the file f is properly handled during serialization/deserialization to avoid runtime errors. Consider documenting this behavior clearly.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_JE0QkvvFdrl92wre

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

burr/integrations/ray.py

github-actions · 2024-12-18T23:24:11Z

A preview of 3d7c3e2 is uploaded and can be seen here:

✨ https://burr.dagworks.io/pull/483 ✨

Changes may take a few minutes to propagate. Since this is a preview of production, content with draft: true will not be rendered. The source is here: https://github.com/DAGWorks-Inc/burr/tree/gh-pages/pull/483/

ellipsis-dev

❌ Changes requested. Incremental review on c71f34c in 1 minute and 2 seconds

More details

Looked at 258 lines of code in 4 files
Skipped 1 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. burr/core/persistence.py:358

Draft comment:
Consider adding exception handling in __setstate__ to manage potential connection errors.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
SQLite connection errors are possible when reconnecting in setstate, and could cause issues during unpickling. However, the code follows standard SQLite connection patterns used elsewhere in the file (like in init) which don't have exception handling. Adding exception handling here but not elsewhere would be inconsistent. The comment also doesn't specify what to do with caught exceptions.
The comment identifies a real potential issue with database connections. Not handling exceptions could lead to silent failures during deserialization.
While connection errors are possible, this code follows the established pattern in the codebase. Adding exception handling only here would be inconsistent, and it's unclear what recovery action should be taken if connection fails.
Delete the comment. While it identifies a real concern, it would be inconsistent to add exception handling only here, and the comment doesn't provide clear guidance on how to handle exceptions.

2. docs/reference/persister.rst:40

Draft comment:

.. autoclass:: burr.core.persistence.SQLitePersister

Reason this comment was not posted:
Comment looked like it was already resolved.

3. examples/ray/application.py:98

Draft comment:

def user_input(state: State, max_drafts: int, poem_types: List[str], poem_subject: str) -> State:

Reason this comment was not posted:
Comment looked like it was already resolved.

4. examples/ray/application.py:108

Draft comment:
Ensure feedback is consistently treated as a string or list across the codebase to avoid type errors.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_55TXbeqaae7LtWvq

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-12-23T04:43:56Z

telemetry/ui/src/components/routes/app/StepList.tsx

@@ -553,7 +553,7 @@ const LinkSubTable = (props: {
                  />
                </TableCell>
              )}
-              <TableCell colSpan={1} />
+              {/* <TableCell colSpan={1} /> */}


Remove commented-out TableCell elements if they are not needed to keep the code clean.

Really, just for demo, unlikely to be used in production but good to have.

We need to add for the rest, this is just for the demo

ellipsis-dev

❌ Changes requested. Incremental review on f8ec214 in 43 seconds

More details

Looked at 257 lines of code in 3 files
Skipped 1 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. burr/integrations/ray.py:10

Draft comment:
Consider initializing Ray in the constructor if it's not already initialized. This will prevent runtime errors if the user forgets to initialize Ray before using the executor.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The commented out code suggests this was a deliberate design choice to NOT initialize Ray in the constructor. The current design puts the responsibility of Ray initialization on the user, which could be intentional to give users more control. The error message in submit() makes it clear what went wrong.
The comment might be pointing out a legitimate usability concern - requiring users to remember to initialize Ray could lead to runtime errors.
However, the commented-out code strongly suggests this was a conscious design decision, and the clear error message provides good feedback. Forcing initialization in the constructor could limit flexibility.
Delete the comment as it appears to be suggesting something that was explicitly decided against in the implementation.

2. examples/ray/application.py:168

Draft comment:
Using a relative path for the SQLite database can cause issues if the working directory changes. Consider using an absolute path or ensuring the working directory is consistent.
Reason this comment was not posted:
Confidence changes required: 50%
In application.py, the application function uses a relative path for the SQLite database, which might cause issues if the working directory changes. It's better to use an absolute path or ensure the working directory is consistent.

Workflow ID: wflow_p61wJymjgOCJ2F7q

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-12-23T04:55:08Z

burr/integrations/ray.py

+
+        ray_fn = ray.remote(fn)
+        object_ref = ray_fn.remote(*args, **kwargs)
+        future = object_ref.future()


Ray's ObjectRef does not have a future() method. Use ray.get(object_ref) to retrieve the result instead.

ellipsis-dev · 2024-12-23T04:55:09Z

examples/ray/application.py

+# full agent
+def _query_llm(prompt: str) -> str:
+    """Simple wrapper around the OpenAI API."""
+    client = openai.Client()


Creating a new OpenAI client for each call can be inefficient. Consider initializing the client once and reusing it across function calls.

ellipsis-dev

❌ Changes requested. Incremental review on 6603d8a in 55 seconds

More details

Looked at 273 lines of code in 4 files
Skipped 1 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. burr/integrations/ray.py:8

Draft comment:
The __init__ method is currently a placeholder. Consider initializing Ray here if needed, or remove the method if it's not required.
Reason this comment was not posted:
Confidence changes required: 50%
The RayExecutor class in ray.py is not fully implemented. The submit method attempts to use a non-existent future() method on object_ref. This will cause a runtime error.

2. examples/ray/application.py:17

Draft comment:
openai.Client() is incorrect. Use openai.ChatCompletion.create() instead.
Reason this comment was not posted:
Marked as duplicate.

3. examples/ray/application.py:19

Draft comment:
client.chat.completions.create should be openai.ChatCompletion.create.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment appears to be suggesting using an older style of the OpenAI API. The code is using the newer v1.0+ style of the OpenAI Python client which uses openai.Client(). The comment's suggestion would make the code use the deprecated v0.x style. The current implementation is actually the preferred modern approach.
Could there be a specific reason why the older API style would be preferred in this case? Maybe there are compatibility issues I'm not aware of?
No, the newer client style is the officially recommended approach by OpenAI, and there's no indication of any special requirements that would necessitate using the legacy API.
The comment should be deleted as it suggests changing correct, modern code to use a deprecated API style.

Workflow ID: wflow_eqD7jDFTorGUYhWX

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-12-23T04:56:56Z

burr/integrations/ray.py

+
+        ray_fn = ray.remote(fn)
+        object_ref = ray_fn.remote(*args, **kwargs)
+        future = object_ref.future()


object_ref does not have a future() method. Use ray.get(object_ref) to retrieve the result instead.

ellipsis-dev · 2024-12-23T04:56:56Z

examples/ray/application.py

+    client = openai.Client()
+    return (
+        client.chat.completions.create(
+            model="gpt-4o",


model="gpt-4o" should be model="gpt-4".

ellipsis-dev

👍 Looks good to me! Incremental review on c4350bc in 45 seconds

More details

Looked at 318 lines of code in 6 files
Skipped 1 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. burr/integrations/ray.py:12

Draft comment:
The __init__ method is currently empty. Consider initializing necessary attributes or configurations here.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The empty init appears to be intentional design. Ray initialization is handled separately via ray.init() as documented, and checked in submit(). The executor itself doesn't need instance attributes since it's just wrapping Ray remote execution. Adding initialization would likely complicate the clean design.
Maybe there are some executor-specific attributes that should be initialized? The parent Executor class might expect certain initialization.
Looking at concurrent.futures.Executor docs, it's an abstract base class that doesn't require any specific initialization. The empty init is valid and follows the interface correctly.
The comment should be deleted. The empty init is intentional and documented design, not an oversight that needs fixing.

2. burr/integrations/ray.py:25

Draft comment:
object_ref.future() is not a valid method in Ray. Use ray.get(object_ref) to retrieve the result of the remote function call.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment appears to be incorrect. While ray.get() is a valid Ray API call, it would block and wait for the result immediately, which defeats the purpose of async execution. The Executor interface requires returning a Future object. Looking at Ray's documentation, ObjectRef.future() is actually a valid method that converts a Ray ObjectRef to a concurrent.futures.Future, which is exactly what we need here.
I could be wrong about the Ray API - maybe there's a newer/better way to convert ObjectRefs to Futures. The documentation I'm thinking of could be outdated.
Even if there are other ways to convert ObjectRefs to Futures, the current code's approach appears valid and serves the required purpose of implementing the Executor interface correctly.
The comment should be deleted because it suggests replacing correct code with an approach that would break the Executor interface implementation.

3. burr/integrations/ray.py:32

Draft comment:
The wait parameter is not respected in shutdown. Consider implementing or documenting this behavior.
Reason this comment was not posted:
Confidence changes required: 50%
The shutdown method in RayExecutor does not respect the wait parameter, which is part of the concurrent.futures.Executor API. This should be documented or implemented if possible.

Workflow ID: wflow_uEfA3PdNBFxML5e8

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

+ example. We still need some unit tests, but this will do for now. Also need install target.

ellipsis-dev

❌ Changes requested. Incremental review on b66d777 in 46 seconds

More details

Looked at 464 lines of code in 7 files
Skipped 1 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. burr/integrations/ray.py:33

Draft comment:
The shutdown method should respect the wait parameter to align with the concurrent.futures.Executor interface.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is technically correct - the method should respect wait to properly implement the interface. However, the code already acknowledges this limitation in the docstring with "(yet)" implying it's a known issue to be addressed later. The class is also marked as "experimental" in its docstring. This suggests the author is aware and has intentionally deferred this.
The comment identifies a real interface compliance issue that could cause problems for users expecting standard Executor behavior.
While true, the issue is already clearly documented in the code itself, making the comment redundant rather than helpful.
Delete the comment since it's pointing out something that's already explicitly acknowledged in the docstring.

2. burr/integrations/ray.py:12

Draft comment:
Consider initializing Ray within the class if not already initialized to prevent runtime errors.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The current design explicitly requires manual Ray initialization, which is likely intentional since Ray initialization can have important configuration parameters that should be controlled by the user. Auto-initializing Ray could hide important configuration decisions and make the behavior less predictable. The error message is clear and helpful.
Maybe automatic initialization with default parameters would be more user-friendly and prevent runtime errors? Some libraries do auto-initialize their dependencies.
Ray initialization is an important architectural decision that often needs specific configuration. Auto-initializing would hide this decision and could cause problems if users need custom Ray settings.
The comment should be deleted. The current explicit initialization requirement is a valid design choice with clear error messaging.

Workflow ID: wflow_xGXGsmGjhTFqTmHD

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-12-23T18:41:36Z

burr/integrations/ray.py

+            raise RuntimeError("Ray is not initialized. Call ray.init() before running anything!")
+        ray_fn = ray.remote(fn)
+        object_ref = ray_fn.remote(*args, **kwargs)
+        future = object_ref.future()


object_ref does not have a future method. Use ray.get(object_ref) to retrieve the result instead.

ellipsis-dev

👍 Looks good to me! Incremental review on 5b6d89c in 23 seconds

More details

Looked at 177 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. telemetry/ui/src/components/routes/app/StepList.tsx:554

Draft comment:
Unnecessary empty fragment (<></>). Consider removing it for cleaner code. This is also applicable at lines 726 and 1161.
Reason this comment was not posted:
Confidence changes required: 50%
The code contains multiple instances of unnecessary empty fragments (<></>). These can be removed to improve code readability and performance slightly.

2. telemetry/ui/src/components/routes/app/StepList.tsx:726

Draft comment:
Unnecessary empty fragment (<></>). Consider removing it for cleaner code. This is also applicable at lines 554 and 1161.
Reason this comment was not posted:
Confidence changes required: 50%
The code contains multiple instances of unnecessary empty fragments (<></>). These can be removed to improve code readability and performance slightly.

3. telemetry/ui/src/components/routes/app/StepList.tsx:1161

Draft comment:
Unnecessary empty fragment (<></>). Consider removing it for cleaner code. This is also applicable at lines 554 and 726.
Reason this comment was not posted:
Confidence changes required: 50%
The code contains multiple instances of unnecessary empty fragments (<></>). These can be removed to improve code readability and performance slightly.

Workflow ID: wflow_0ID9BqNh8Svzx1ey

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 3d7c3e2 in 16 seconds

More details

Looked at 13 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. pyproject.toml:7

Draft comment:
Version bump to 0.36.0 is appropriate given the new features and integrations added.
Reason this comment was not posted:
Confidence changes required: 0%
The version bump from 0.35.1 to 0.36.0 in the pyproject.toml file is appropriate given the new features and integrations added in this PR.

Workflow ID: wflow_hVDe37CdvCuXvy2Q

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev bot reviewed Dec 18, 2024

View reviewed changes

burr/integrations/ray.py Show resolved Hide resolved

elijahbenizzy force-pushed the ray-integration branch from 9e89a40 to c71f34c Compare December 23, 2024 04:42

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

elijahbenizzy added 4 commits December 22, 2024 20:54

Updates issues in which width of columns for step list was not correct

f52f8d6

Fixes typos in persister docs

ff34eb6

Adds serialization capability for local tracker

405db4c

Really, just for demo, unlikely to be used in production but good to have.

Adds serialization function for sqlite persister

e0409fd

We need to add for the rest, this is just for the demo

elijahbenizzy force-pushed the ray-integration branch from c71f34c to f8ec214 Compare December 23, 2024 04:54

elijahbenizzy changed the title ~~WIP for ray integration/parallelism~~ Ray integration + Parallelism blog post Dec 23, 2024

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

elijahbenizzy force-pushed the ray-integration branch from f8ec214 to 6603d8a Compare December 23, 2024 04:55

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

elijahbenizzy force-pushed the ray-integration branch from 6603d8a to c4350bc Compare December 23, 2024 05:06

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

Adds Ray integration

b66d777

+ example. We still need some unit tests, but this will do for now. Also need install target.

elijahbenizzy force-pushed the ray-integration branch from c4350bc to b66d777 Compare December 23, 2024 18:40

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

elijahbenizzy requested a review from skrawcz December 23, 2024 18:52

Multiple fixes for column width

5b6d89c

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

Bumps version from 0.35.1 to 0.36.0

3d7c3e2

ellipsis-dev bot reviewed Dec 23, 2024

View reviewed changes

skrawcz approved these changes Dec 23, 2024

View reviewed changes

elijahbenizzy merged commit 402623e into main Dec 23, 2024
12 checks passed

elijahbenizzy deleted the ray-integration branch December 23, 2024 20:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ray integration + Parallelism blog post #483

Ray integration + Parallelism blog post #483

elijahbenizzy commented Dec 18, 2024 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

github-actions bot commented Dec 18, 2024 •

edited

Loading

ellipsis-dev bot left a comment

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot Dec 23, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

Ray integration + Parallelism blog post #483

Ray integration + Parallelism blog post #483

Conversation

elijahbenizzy commented Dec 18, 2024 • edited by ellipsis-dev bot Loading

Changes

How I tested this

Notes

Checklist

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

github-actions bot commented Dec 18, 2024 • edited Loading

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Dec 23, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

elijahbenizzy commented Dec 18, 2024 •

edited by ellipsis-dev bot

Loading

github-actions bot commented Dec 18, 2024 •

edited

Loading