-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Data] Fix OutputBlockBuffer
to avoid repeatedly copying remainder block
#48266
base: master
Are you sure you want to change the base?
Conversation
Great find! So my understanding is that .slice here ray/python/ray/data/_internal/arrow_block.py Line 284 in 22e266a
|
There are some test failure but this seems like the right fix :) |
Signed-off-by: Alexey Kudinkin <[email protected]>
Signed-off-by: Alexey Kudinkin <[email protected]>
Signed-off-by: Alexey Kudinkin <[email protected]>
b810d01
to
314dbba
Compare
Correct. And we'd avoid copying here as well as it's absolutely unnecessary. |
d2d43e4
to
d795904
Compare
@@ -1,5 +1,8 @@ | |||
from typing import Any | |||
|
|||
import pyarrow | |||
from mdit_py_plugins.myst_blocks.index import target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
if isinstance(block_to_yield, pyarrow.Table): | ||
print(f">>> [DBG] yielding block: {block_to_yield.num_rows}, remaining block: {block_remainder.num_rows if block_remainder else -1}; " | ||
f"target num rows: {target_num_rows}, target max block size: {self._target_max_block_size}, num bytes per row: {num_bytes_per_row}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debugging
|
||
print(f">>> [DBG] test_block_slicing {ds._plan}") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, will remove once i figure out tests failing only in CI but not locally
Signed-off-by: Alexey Kudinkin <[email protected]>
d795904
to
2037198
Compare
Why are these changes needed?
Currently, inside
OutputBlockBuffer
we'reThis change addresses both of these issues, by establishing following protocol where
Addresses #48236
Related issue number
Checks
git commit -s
) in this PR.scripts/format.sh
to lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/
under thecorresponding
.rst
file.