fix: Fix slow dist handle and slow observe #38566

Merged
sre-ci-robot merged 24 commits into milvus-io:master on Jan 15, 2025

Conversation

bigsheeper
Contributor

@bigsheeper bigsheeper commented Dec 18, 2024

  1. Provide partition- and channel-level indexing in the collection target.
  2. Make SegmentAction not wait for distribution.
  3. Remove scheduler and target manager mutex.
  4. Optimize logging to reduce CPU overhead.
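
To illustrate item 1, here is a minimal sketch of what partition- and channel-level indexing in a collection target could look like. The type and method names (`SegmentInfo`, `CollectionTarget`, `SegmentsByPartition`, `SegmentsByChannel`) are hypothetical and not taken from the Milvus code base; the idea is only that secondary maps built once at construction time replace a full scan on every lookup.

```go
package main

import "fmt"

// SegmentInfo is a hypothetical, trimmed-down segment record.
type SegmentInfo struct {
	ID          int64
	PartitionID int64
	Channel     string
}

// CollectionTarget holds the segments of one collection plus two secondary
// indexes that are built once, when the target is created.
type CollectionTarget struct {
	segments    map[int64]*SegmentInfo
	byPartition map[int64][]*SegmentInfo
	byChannel   map[string][]*SegmentInfo
}

// NewCollectionTarget builds the primary map and both indexes in one pass.
func NewCollectionTarget(segs []*SegmentInfo) *CollectionTarget {
	t := &CollectionTarget{
		segments:    make(map[int64]*SegmentInfo, len(segs)),
		byPartition: make(map[int64][]*SegmentInfo),
		byChannel:   make(map[string][]*SegmentInfo),
	}
	for _, s := range segs {
		t.segments[s.ID] = s
		t.byPartition[s.PartitionID] = append(t.byPartition[s.PartitionID], s)
		t.byChannel[s.Channel] = append(t.byChannel[s.Channel], s)
	}
	return t
}

// SegmentsByPartition is a map lookup instead of a scan over all segments.
func (t *CollectionTarget) SegmentsByPartition(partitionID int64) []*SegmentInfo {
	return t.byPartition[partitionID]
}

// SegmentsByChannel is a map lookup instead of a scan over all segments.
func (t *CollectionTarget) SegmentsByChannel(channel string) []*SegmentInfo {
	return t.byChannel[channel]
}

func main() {
	target := NewCollectionTarget([]*SegmentInfo{
		{ID: 1, PartitionID: 10, Channel: "ch-0"},
		{ID: 2, PartitionID: 10, Channel: "ch-1"},
		{ID: 3, PartitionID: 11, Channel: "ch-1"},
	})
	// Two segments live in partition 10 and two on channel "ch-1".
	fmt.Println(len(target.SegmentsByPartition(10)), len(target.SegmentsByChannel("ch-1")))
}
```

The trade-off in this sketch is extra memory and a slightly slower build in exchange for lookups that no longer depend on the total number of segments.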

issue: #37630

@sre-ci-robot sre-ci-robot added the size/L Denotes a PR that changes 100-499 lines. label Dec 18, 2024
@mergify mergify bot added dco-passed DCO check passed. kind/bug Issues or changes related a bug labels Dec 18, 2024
@bigsheeper
Contributor Author

[image: slow dist handling]

[image: slow observation]

czs007 pushed a commit that referenced this pull request Dec 18, 2024
1. Provide partition-level indexing in the collection target.
2. Make SegmentAction not wait for distribution.
3. Optimize logging to reduce CPU overhead.

issue: #37630

pr: #38566

---------

Signed-off-by: bigsheeper <[email protected]>

codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 91.73554% with 20 lines in your changes missing coverage. Please review.

Project coverage is 81.08%. Comparing base (3a6408b) to head (e76bcab).
Report is 13 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/querycoordv2/task/scheduler.go | 88.72% | 13 Missing and 2 partials ⚠️ |
| ...rnal/querycoordv2/observers/collection_observer.go | 83.33% | 4 Missing ⚠️ |
| internal/querycoordv2/meta/target.go | 98.30% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##           master   #38566      +/-   ##
==========================================
+ Coverage   81.07%   81.08%   +0.01%     
==========================================
  Files        1404     1404              
  Lines      198272   198238      -34     
==========================================
- Hits       160752   160749       -3     
+ Misses      31867    31848      -19     
+ Partials     5653     5641      -12     
| Components | Coverage Δ |
| --- | --- |
| Client | 79.50% <ø> (ø) |
| Core | 69.65% <ø> (ø) |
| Go | 83.01% <91.73%> (+0.01%) ⬆️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/querycoordv2/dist/dist_handler.go | 96.13% <100.00%> (+0.10%) ⬆️ |
| internal/querycoordv2/meta/target_manager.go | 87.93% <100.00%> (-0.64%) ⬇️ |
| internal/querycoordv2/task/action.go | 96.66% <100.00%> (+2.17%) ⬆️ |
| internal/querycoordv2/task/task.go | 94.22% <100.00%> (+0.05%) ⬆️ |
| internal/querycoordv2/utils/util.go | 85.54% <100.00%> (ø) |
| pkg/metrics/querycoord_metrics.go | 100.00% <ø> (ø) |
| internal/querycoordv2/meta/target.go | 93.06% <98.30%> (+2.01%) ⬆️ |
| ...rnal/querycoordv2/observers/collection_observer.go | 86.52% <83.33%> (-0.39%) ⬇️ |
| internal/querycoordv2/task/scheduler.go | 88.47% <88.72%> (+1.16%) ⬆️ |

... and 25 files with indirect coverage changes

jaime0815 pushed a commit that referenced this pull request Dec 19, 2024
Print observe, dist handling and schedule time.

issue: #37630

pr: #38566

Signed-off-by: bigsheeper <[email protected]>
Contributor

mergify bot commented Dec 19, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Signed-off-by: bigsheeper <[email protected]>
Signed-off-by: bigsheeper <[email protected]>
czs007 pushed a commit that referenced this pull request Dec 26, 2024
Contributor

mergify bot commented Dec 26, 2024

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 26, 2024

@bigsheeper cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented Dec 27, 2024

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@bigsheeper
Contributor Author

rerun go-sdk

@bigsheeper
Contributor Author

/run-cpu-e2e

@mergify mergify bot added the ci-passed label Jan 8, 2025
@weiliu1031
Contributor

/lgtm

Signed-off-by: bigsheeper <[email protected]>
Signed-off-by: bigsheeper <[email protected]>
@sre-ci-robot sre-ci-robot removed the lgtm label Jan 14, 2025
@mergify mergify bot removed the ci-passed label Jan 14, 2025
Contributor

mergify bot commented Jan 14, 2025

@bigsheeper E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@czs007 czs007 assigned czs007 and unassigned weiliu1031 Jan 14, 2025
@czs007
Collaborator

czs007 commented Jan 14, 2025

/approve

@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bigsheeper, czs007

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bigsheeper
Contributor Author

/run-cpu-e2e

@bigsheeper
Contributor Author

rerun ut

Contributor

mergify bot commented Jan 15, 2025

@bigsheeper go-sdk check failed, comment rerun go-sdk can trigger the job again.

@bigsheeper
Contributor Author

rerun go-sdk

@mergify mergify bot added the ci-passed label Jan 15, 2025
@czs007
Collaborator

czs007 commented Jan 15, 2025

/lgtm

@sre-ci-robot sre-ci-robot merged commit 657550c into milvus-io:master Jan 15, 2025
20 checks passed
sre-ci-robot pushed a commit that referenced this pull request Jan 16, 2025
1. Provide partition- and channel-level indexing in the collection target.
2. Make SegmentAction not wait for distribution.
3. Remove scheduler and target manager mutex.
4. Optimize logging to reduce CPU overhead.
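
The commit message does not state what replaced the removed mutexes. As one possible shape, the sketch below keeps per-collection targets in a `sync.Map` from the Go standard library, so readers and writers of different collections do not contend on a single manager-wide lock. `TargetManager` and `Target` here are hypothetical stand-ins, and stored targets are assumed to be treated as immutable once published.

```go
package main

import (
	"fmt"
	"sync"
)

// Target is a hypothetical per-collection target. It must not be mutated
// after it has been stored; updates publish a new value instead.
type Target struct {
	CollectionID int64
	SegmentIDs   []int64
}

// TargetManager is a hypothetical registry with no manager-wide mutex.
type TargetManager struct {
	targets sync.Map // collectionID (int64) -> *Target
}

// Update publishes a new target for its collection.
func (m *TargetManager) Update(t *Target) {
	m.targets.Store(t.CollectionID, t)
}

// Get returns the current target of a collection, if any.
func (m *TargetManager) Get(collectionID int64) (*Target, bool) {
	v, ok := m.targets.Load(collectionID)
	if !ok {
		return nil, false
	}
	return v.(*Target), true
}

// Remove drops the target of a collection.
func (m *TargetManager) Remove(collectionID int64) {
	m.targets.Delete(collectionID)
}

func main() {
	var m TargetManager
	var wg sync.WaitGroup
	// Concurrent updates to different collections need no shared lock.
	for id := int64(1); id <= 4; id++ {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			m.Update(&Target{CollectionID: id, SegmentIDs: []int64{id * 100}})
		}(id)
	}
	wg.Wait()
	t, ok := m.Get(3)
	fmt.Println(ok, t.SegmentIDs) // prints: true [300]
}
```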

issue: #37630

pr: #38566

---------

Signed-off-by: bigsheeper <[email protected]>
gifi-siby pushed a commit to gifi-siby/milvus that referenced this pull request Jan 16, 2025
1. Provide partition- and channel-level indexing in the collection target.
2. Make `SegmentAction` not wait for distribution.
3. Remove scheduler and target manager mutex.
4. Optimize logging to reduce CPU overhead.

issue: milvus-io#37630

---------

Signed-off-by: bigsheeper <[email protected]>
Labels: approved, ci-passed, dco-passed (DCO check passed.), kind/bug (Issues or changes related a bug), lgtm, size/XL (Denotes a PR that changes 500-999 lines.)
4 participants