[COST-5427] Utilize an initial cte to optimize the distribution sql. #5332
/retest
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@          Coverage Diff          @@
##            main   #5332   +/-   ##
=====================================
  Coverage   94.0%   94.0%
=====================================
  Files        375     375
  Lines      31616   31616
  Branches    4658    4658
=====================================
  Hits       29730   29730
  Misses      1205    1205
  Partials     681     681
Successful smoke tests: https://ci.ext.devshift.net/job/koku-pipeline-pr-check-main/3717/testReport/
I tried loading some large customer data for this, but I don't think I can scale it large enough to really see the impact. That said, based on the EXPLAIN output, it's certainly not worse! I say we roll with it and take the improvement!
Jira Ticket
COST-5427
Description
Introducing cte_narrow_dataset lets all subsequent calculations run on a reduced dataset instead of scanning the daily summary table multiple times (the table can be large for customers with many clusters). The narrow CTE also removes redundancy across the other CTEs: the filtered and joined data is prepared once and reused, rather than repeating the same joins and filters against the daily summary table.
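The pattern can be sketched with a toy example. This is not the query from this PR: apart from cte_narrow_dataset, every table, column, and CTE name below is hypothetical, and SQLite (via Python's sqlite3) stands in for the real database only so the snippet runs on its own. The date and cluster filter is written once in the narrow CTE, and the later CTEs read from it instead of going back to the summary table.

```python
import sqlite3

# Hypothetical, simplified stand-in for the daily summary table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_summary (
    usage_date TEXT, cluster_id TEXT, namespace TEXT,
    cpu_hours REAL, cost REAL
);
INSERT INTO daily_summary VALUES
    ('2024-09-01', 'c1', 'app',      6.0,  0.0),
    ('2024-09-01', 'c1', 'db',       2.0,  0.0),
    ('2024-09-01', 'c1', 'Platform', 0.0, 20.0),
    ('2024-09-01', 'c2', 'app',      5.0,  0.0),
    ('2024-08-31', 'c1', 'app',      9.0,  0.0);
""")

DISTRIBUTION_SQL = """
WITH cte_narrow_dataset AS (
    -- Filter the large table once; everything below reuses this result.
    SELECT namespace, cpu_hours, cost
    FROM daily_summary
    WHERE usage_date = :day AND cluster_id = :cluster
),
cte_usage_total AS (
    -- Total usage of the namespaces that receive distributed cost.
    SELECT SUM(cpu_hours) AS total_cpu
    FROM cte_narrow_dataset
    WHERE namespace != 'Platform'
),
cte_cost_to_distribute AS (
    -- Cost that will be spread across the other namespaces.
    SELECT SUM(cost) AS platform_cost
    FROM cte_narrow_dataset
    WHERE namespace = 'Platform'
)
SELECT n.namespace,
       n.cpu_hours / u.total_cpu * c.platform_cost AS distributed_cost
FROM cte_narrow_dataset AS n
CROSS JOIN cte_usage_total AS u
CROSS JOIN cte_cost_to_distribute AS c
WHERE n.namespace != 'Platform'
ORDER BY n.namespace
"""

rows = conn.execute(
    DISTRIBUTION_SQL, {"day": "2024-09-01", "cluster": "c1"}
).fetchall()
print(rows)
```

Whether the engine actually computes the narrow CTE once or inlines it into each reference depends on the database and its version, so the EXPLAIN output remains the way to confirm the benefit for a given query.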
Analyze Results:
Old Query:
New Query:
Testing
Release Notes