-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ray integration + Parallelism blog post #483
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Reviewed everything up to 9e89a40 in 43 seconds
More details
- Looked at
75
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. burr/integrations/ray.py:6
- Draft comment:
RayExecutor
should implement themap
method to fully comply with theconcurrent.futures.Executor
interface. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is technically correct - Executor does have a map method in its interface. However, map() is not strictly required as it has a default implementation in the base class that uses submit(). The base implementation would work fine here since submit() is properly implemented. The comment is suggesting an optimization rather than fixing a bug.
The base class implementation of map() might not be as performant as a Ray-specific implementation. There could be Ray-specific optimizations possible.
While a custom map() implementation could be more efficient, the current code is functionally correct and the base implementation will work. The comment is more of an optional optimization suggestion.
Delete this comment. While technically accurate, implementing map() is optional since the base class provides a working implementation. The comment doesn't point out a clear issue that needs to be fixed.
2. burr/tracking/client.py:366
- Draft comment:
Ensure that the filef
is properly handled during serialization/deserialization to avoid runtime errors. Consider documenting this behavior clearly. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_JE0QkvvFdrl92wre
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
A preview of 3d7c3e2 is uploaded and can be seen here: ✨ https://burr.dagworks.io/pull/483 ✨ Changes may take a few minutes to propagate. Since this is a preview of production, content with |
9e89a40
to
c71f34c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on c71f34c in 1 minute and 2 seconds
More details
- Looked at
258
lines of code in4
files - Skipped
1
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. burr/core/persistence.py:358
- Draft comment:
Consider adding exception handling in__setstate__
to manage potential connection errors. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
SQLite connection errors are possible when reconnecting in setstate, and could cause issues during unpickling. However, the code follows standard SQLite connection patterns used elsewhere in the file (like in init) which don't have exception handling. Adding exception handling here but not elsewhere would be inconsistent. The comment also doesn't specify what to do with caught exceptions.
The comment identifies a real potential issue with database connections. Not handling exceptions could lead to silent failures during deserialization.
While connection errors are possible, this code follows the established pattern in the codebase. Adding exception handling only here would be inconsistent, and it's unclear what recovery action should be taken if connection fails.
Delete the comment. While it identifies a real concern, it would be inconsistent to add exception handling only here, and the comment doesn't provide clear guidance on how to handle exceptions.
2. docs/reference/persister.rst:40
- Draft comment:
.. autoclass:: burr.core.persistence.SQLitePersister
- Reason this comment was not posted:
Comment looked like it was already resolved.
3. examples/ray/application.py:98
- Draft comment:
def user_input(state: State, max_drafts: int, poem_types: List[str], poem_subject: str) -> State:
- Reason this comment was not posted:
Comment looked like it was already resolved.
4. examples/ray/application.py:108
- Draft comment:
Ensurefeedback
is consistently treated as a string or list across the codebase to avoid type errors. - Reason this comment was not posted:
Comment did not seem useful.
Workflow ID: wflow_55TXbeqaae7LtWvq
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
@@ -553,7 +553,7 @@ const LinkSubTable = (props: { | |||
/> | |||
</TableCell> | |||
)} | |||
<TableCell colSpan={1} /> | |||
{/* <TableCell colSpan={1} /> */} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented-out TableCell
elements if they are not needed to keep the code clean.
Really, just for demo, unlikely to be used in production but good to have.
We need to add for the rest, this is just for the demo
c71f34c
to
f8ec214
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on f8ec214 in 43 seconds
More details
- Looked at
257
lines of code in3
files - Skipped
1
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. burr/integrations/ray.py:10
- Draft comment:
Consider initializing Ray in the constructor if it's not already initialized. This will prevent runtime errors if the user forgets to initialize Ray before using the executor. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The commented out code suggests this was a deliberate design choice to NOT initialize Ray in the constructor. The current design puts the responsibility of Ray initialization on the user, which could be intentional to give users more control. The error message in submit() makes it clear what went wrong.
The comment might be pointing out a legitimate usability concern - requiring users to remember to initialize Ray could lead to runtime errors.
However, the commented-out code strongly suggests this was a conscious design decision, and the clear error message provides good feedback. Forcing initialization in the constructor could limit flexibility.
Delete the comment as it appears to be suggesting something that was explicitly decided against in the implementation.
2. examples/ray/application.py:168
- Draft comment:
Using a relative path for the SQLite database can cause issues if the working directory changes. Consider using an absolute path or ensuring the working directory is consistent. - Reason this comment was not posted:
Confidence changes required:50%
In application.py, the application function uses a relative path for the SQLite database, which might cause issues if the working directory changes. It's better to use an absolute path or ensure the working directory is consistent.
Workflow ID: wflow_p61wJymjgOCJ2F7q
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
|
||
ray_fn = ray.remote(fn) | ||
object_ref = ray_fn.remote(*args, **kwargs) | ||
future = object_ref.future() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ray's ObjectRef does not have a future()
method. Use ray.get(object_ref)
to retrieve the result instead.
# full agent | ||
def _query_llm(prompt: str) -> str: | ||
"""Simple wrapper around the OpenAI API.""" | ||
client = openai.Client() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a new OpenAI client for each call can be inefficient. Consider initializing the client once and reusing it across function calls.
f8ec214
to
6603d8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 6603d8a in 55 seconds
More details
- Looked at
273
lines of code in4
files - Skipped
1
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. burr/integrations/ray.py:8
- Draft comment:
The__init__
method is currently a placeholder. Consider initializing Ray here if needed, or remove the method if it's not required. - Reason this comment was not posted:
Confidence changes required:50%
TheRayExecutor
class inray.py
is not fully implemented. Thesubmit
method attempts to use a non-existentfuture()
method onobject_ref
. This will cause a runtime error.
2. examples/ray/application.py:17
- Draft comment:
openai.Client()
is incorrect. Useopenai.ChatCompletion.create()
instead. - Reason this comment was not posted:
Marked as duplicate.
3. examples/ray/application.py:19
- Draft comment:
client.chat.completions.create
should beopenai.ChatCompletion.create
. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment appears to be suggesting using an older style of the OpenAI API. The code is using the newer v1.0+ style of the OpenAI Python client which usesopenai.Client()
. The comment's suggestion would make the code use the deprecated v0.x style. The current implementation is actually the preferred modern approach.
Could there be a specific reason why the older API style would be preferred in this case? Maybe there are compatibility issues I'm not aware of?
No, the newer client style is the officially recommended approach by OpenAI, and there's no indication of any special requirements that would necessitate using the legacy API.
The comment should be deleted as it suggests changing correct, modern code to use a deprecated API style.
Workflow ID: wflow_eqD7jDFTorGUYhWX
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
|
||
ray_fn = ray.remote(fn) | ||
object_ref = ray_fn.remote(*args, **kwargs) | ||
future = object_ref.future() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object_ref
does not have a future()
method. Use ray.get(object_ref)
to retrieve the result instead.
client = openai.Client() | ||
return ( | ||
client.chat.completions.create( | ||
model="gpt-4o", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
model="gpt-4o"
should be model="gpt-4"
.
6603d8a
to
c4350bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on c4350bc in 45 seconds
More details
- Looked at
318
lines of code in6
files - Skipped
1
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. burr/integrations/ray.py:12
- Draft comment:
The__init__
method is currently empty. Consider initializing necessary attributes or configurations here. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The empty init appears to be intentional design. Ray initialization is handled separately via ray.init() as documented, and checked in submit(). The executor itself doesn't need instance attributes since it's just wrapping Ray remote execution. Adding initialization would likely complicate the clean design.
Maybe there are some executor-specific attributes that should be initialized? The parent Executor class might expect certain initialization.
Looking at concurrent.futures.Executor docs, it's an abstract base class that doesn't require any specific initialization. The empty init is valid and follows the interface correctly.
The comment should be deleted. The empty init is intentional and documented design, not an oversight that needs fixing.
2. burr/integrations/ray.py:25
- Draft comment:
object_ref.future()
is not a valid method in Ray. Useray.get(object_ref)
to retrieve the result of the remote function call. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment appears to be incorrect. Whileray.get()
is a valid Ray API call, it would block and wait for the result immediately, which defeats the purpose of async execution. The Executor interface requires returning a Future object. Looking at Ray's documentation,ObjectRef.future()
is actually a valid method that converts a Ray ObjectRef to a concurrent.futures.Future, which is exactly what we need here.
I could be wrong about the Ray API - maybe there's a newer/better way to convert ObjectRefs to Futures. The documentation I'm thinking of could be outdated.
Even if there are other ways to convert ObjectRefs to Futures, the current code's approach appears valid and serves the required purpose of implementing the Executor interface correctly.
The comment should be deleted because it suggests replacing correct code with an approach that would break the Executor interface implementation.
3. burr/integrations/ray.py:32
- Draft comment:
Thewait
parameter is not respected inshutdown
. Consider implementing or documenting this behavior. - Reason this comment was not posted:
Confidence changes required:50%
Theshutdown
method inRayExecutor
does not respect thewait
parameter, which is part of theconcurrent.futures.Executor
API. This should be documented or implemented if possible.
Workflow ID: wflow_uEfA3PdNBFxML5e8
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
+ example. We still need some unit tests, but this will do for now. Also need install target.
c4350bc
to
b66d777
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on b66d777 in 46 seconds
More details
- Looked at
464
lines of code in7
files - Skipped
1
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. burr/integrations/ray.py:33
- Draft comment:
Theshutdown
method should respect thewait
parameter to align with theconcurrent.futures.Executor
interface. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is technically correct - the method should respect wait to properly implement the interface. However, the code already acknowledges this limitation in the docstring with "(yet)" implying it's a known issue to be addressed later. The class is also marked as "experimental" in its docstring. This suggests the author is aware and has intentionally deferred this.
The comment identifies a real interface compliance issue that could cause problems for users expecting standard Executor behavior.
While true, the issue is already clearly documented in the code itself, making the comment redundant rather than helpful.
Delete the comment since it's pointing out something that's already explicitly acknowledged in the docstring.
2. burr/integrations/ray.py:12
- Draft comment:
Consider initializing Ray within the class if not already initialized to prevent runtime errors. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The current design explicitly requires manual Ray initialization, which is likely intentional since Ray initialization can have important configuration parameters that should be controlled by the user. Auto-initializing Ray could hide important configuration decisions and make the behavior less predictable. The error message is clear and helpful.
Maybe automatic initialization with default parameters would be more user-friendly and prevent runtime errors? Some libraries do auto-initialize their dependencies.
Ray initialization is an important architectural decision that often needs specific configuration. Auto-initializing would hide this decision and could cause problems if users need custom Ray settings.
The comment should be deleted. The current explicit initialization requirement is a valid design choice with clear error messaging.
Workflow ID: wflow_xGXGsmGjhTFqTmHD
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
raise RuntimeError("Ray is not initialized. Call ray.init() before running anything!") | ||
ray_fn = ray.remote(fn) | ||
object_ref = ray_fn.remote(*args, **kwargs) | ||
future = object_ref.future() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object_ref
does not have a future
method. Use ray.get(object_ref)
to retrieve the result instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 5b6d89c in 23 seconds
More details
- Looked at
177
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
3
drafted comments based on config settings.
1. telemetry/ui/src/components/routes/app/StepList.tsx:554
- Draft comment:
Unnecessary empty fragment (<></>
). Consider removing it for cleaner code. This is also applicable at lines 726 and 1161. - Reason this comment was not posted:
Confidence changes required:50%
The code contains multiple instances of unnecessary empty fragments (<></>
). These can be removed to improve code readability and performance slightly.
2. telemetry/ui/src/components/routes/app/StepList.tsx:726
- Draft comment:
Unnecessary empty fragment (<></>
). Consider removing it for cleaner code. This is also applicable at lines 554 and 1161. - Reason this comment was not posted:
Confidence changes required:50%
The code contains multiple instances of unnecessary empty fragments (<></>
). These can be removed to improve code readability and performance slightly.
3. telemetry/ui/src/components/routes/app/StepList.tsx:1161
- Draft comment:
Unnecessary empty fragment (<></>
). Consider removing it for cleaner code. This is also applicable at lines 554 and 726. - Reason this comment was not posted:
Confidence changes required:50%
The code contains multiple instances of unnecessary empty fragments (<></>
). These can be removed to improve code readability and performance slightly.
Workflow ID: wflow_0ID9BqNh8Svzx1ey
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 3d7c3e2 in 16 seconds
More details
- Looked at
13
lines of code in1
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. pyproject.toml:7
- Draft comment:
Version bump to 0.36.0 is appropriate given the new features and integrations added. - Reason this comment was not posted:
Confidence changes required:0%
The version bump from 0.35.1 to 0.36.0 in the pyproject.toml file is appropriate given the new features and integrations added in this PR.
Workflow ID: wflow_hVDe37CdvCuXvy2Q
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Supports blog post.
TODO:
Changes
How I tested this
Notes
Checklist
Important
Adds Ray integration for parallel execution and custom serialization methods to
LocalTrackingClient
andSQLitePersister
.RayExecutor
class inray.py
for parallel task execution using Ray.submit()
to submit tasks andshutdown()
to stop Ray.__setstate__()
and__getstate__()
methods toLocalTrackingClient
inclient.py
for custom serialization.SQLitePersister
inpersistence.py
with custom__setstate__()
and__getstate__()
methods.examples/ray/application.py
.ray.rst
and updatesindex.rst
.SQLitePersister
documentation inpersister.rst
.pyproject.toml
.load_state()
docstring inclient.py
.This description was created by for 3d7c3e2. It will automatically update as commits are pushed.