PERF-#7230: Don't preserve bad partition for merge
#68
base: cloned_main_c48bb
Signed-off-by: Anatoly Myachev <[email protected]>
Clone of the PR modin-project/modin#7229
I have reviewed your code and did not find any issues!
Please note that I can make mistakes, and you should still encourage your team to review your code as well.
What do these changes do?
`merge` preserves the row splitting, but sometimes it's better to repartition. One extremely bad case is a sequence of heavyweight operations in which inefficient partitioning persists from the first operation to the last: for example, a chain of `merge` operations where the left operand of the very first operation has only one partition.
Results for such a chain: 5.07 sec (on main) vs 2.56 sec (in the PR)
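To illustrate why a single-partition left operand is costly across a chain, here is a toy pure-Python model, not Modin code. It assumes a "frame" is just a list of row partitions and that a merge is applied to each left partition independently against the whole right side; `toy_merge` and `repartition` are hypothetical names made up for this sketch.

```python
# Toy model only: a "frame" is a list of row partitions, and each row is
# (key, tuple_of_values). None of these names are Modin APIs.

def toy_merge(left_parts, right_rows):
    """Inner-join every left partition against the full right side.

    The left row splitting is kept as-is: one output partition per input one.
    """
    lookup = {}
    for key, value in right_rows:
        lookup.setdefault(key, []).append(value)
    return [
        [
            (key, values + (right_value,))
            for key, values in part
            for right_value in lookup.get(key, [])
        ]
        for part in left_parts
    ]


def repartition(parts, n):
    """Flatten all rows and split them into at most n roughly equal partitions."""
    rows = [row for part in parts for row in part]
    size = max(1, -(-len(rows) // n))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


left = [[(k, (k * 10,)) for k in range(8)]]  # one partition holding all 8 rows
right = [(k, k + 100) for k in range(8)]

# Chain of merges: the single partition survives every step,
# so each step would run on a single worker.
chained = left
for _ in range(3):
    chained = toy_merge(chained, right)

# Repartitioning once up front gives every later step several partitions.
rebalanced = toy_merge(repartition(left, 4), right)

print(len(chained), len(rebalanced))
```

In this model the chained result still has one partition after three merges, while the rebalanced frame has four, which is the kind of imbalance the PR aims to stop propagating.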
`flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py`
`black --check modin/ asv_bench/benchmarks scripts/doc_checker.py`
`git commit -s`
`docs/development/architecture.rst` is up-to-date
Description by Korbit AI
What change is being made?
Add logic to avoid preserving bad partitions during the `merge` operation and introduce a new test to validate this behavior.
Why are these changes being made?
The change ensures that partitions are rebalanced when the ratio of existing partitions to the ideal number of partitions falls below a threshold, improving performance and avoiding empty partitions. The new test verifies that the `merge` operation correctly triggers repartitioning when necessary.
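The ratio check described above might look roughly like the following sketch. The function name `should_repartition` and the default threshold of 0.5 are assumptions for illustration only; the description does not state the actual threshold or how the PR computes the ideal partition count.

```python
def should_repartition(current_partitions, ideal_partitions, threshold=0.5):
    """Return True when the frame has too few partitions relative to the ideal.

    Hypothetical helper: the 0.5 default is an assumed value, not the PR's.
    """
    if ideal_partitions <= 0:
        raise ValueError("ideal_partitions must be positive")
    return current_partitions / ideal_partitions < threshold


# One partition where eight would be ideal: ratio 0.125, so rebalance.
print(should_repartition(1, 8))
# Already at the ideal count: ratio 1.0, so keep the existing splitting.
print(should_repartition(8, 8))
```

Keeping the existing splitting whenever the ratio is at or above the threshold avoids paying the repartitioning cost for frames that are already reasonably balanced.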