scheduler: consider leader score when evict leader #8912
Signed-off-by: lhy1024 <[email protected]>
@@ -385,6 +378,22 @@ func scheduleEvictLeaderOnce(r *rand.Rand, name string, cluster sche.SchedulerCl
	return ops
}

func createOperatorWithSort(name string, cluster sche.SchedulerCluster, candidates *filter.StoreCandidates, region *core.RegionInfo) (*operator.Operator, error) {
How about renaming it to CreateTransferLeaderOperatorToLowestScoreStore and moving it to operator.go?
Signed-off-by: lhy1024 <[email protected]>
Signed-off-by: lhy1024 <[email protected]>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #8912 +/- ##
==========================================
+ Coverage 74.91% 76.30% +1.38%
==========================================
Files 416 465 +49
Lines 42103 70554 +28451
==========================================
+ Hits 31543 53835 +22292
- Misses 7810 13368 +5558
- Partials 2750 3351 +601
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: okJiang.
pkg/schedule/filter/comparer.go
Outdated
leaderSchedulePolicy := conf.GetLeaderSchedulePolicy()
return func(a, b *core.StoreInfo) int {
	// TODO: we should use the real time delta data to calculate the score.
	sa := a.LeaderScore(leaderSchedulePolicy, 0)
The leader score is not very accurate, so the leader count of the lowest-score store can go up too much. How about taking the influence of running operators into account?
done
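To illustrate the suggestion in this thread, here is a minimal sketch of a comparer that adds the pending influence of running operators to each store's leader score before comparing. The `store` type, the `influence` map, and the helpers `leaderScoreComparer` and `sortByScore` are hypothetical stand-ins for illustration only; they are not PD's actual types or the code merged in this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a hypothetical stand-in for core.StoreInfo.
type store struct {
	id          uint64
	leaderCount int64
}

// leaderScore mimics a count-based leader score. delta is the pending
// change in leader count caused by operators that are still running.
func (s store) leaderScore(delta int64) float64 {
	return float64(s.leaderCount + delta)
}

// leaderScoreComparer returns a comparer that orders stores by leader score,
// including the influence of running operators. The influence map
// (store ID -> pending leader delta) is an assumption of this sketch.
func leaderScoreComparer(influence map[uint64]int64) func(a, b store) int {
	return func(a, b store) int {
		sa := a.leaderScore(influence[a.id])
		sb := b.leaderScore(influence[b.id])
		switch {
		case sa < sb:
			return -1
		case sa > sb:
			return 1
		default:
			return 0
		}
	}
}

// sortByScore sorts stores in place, lowest score first.
func sortByScore(stores []store, cmp func(a, b store) int) {
	sort.SliceStable(stores, func(i, j int) bool {
		return cmp(stores[i], stores[j]) < 0
	})
}

func main() {
	stores := []store{
		{id: 1, leaderCount: 10},
		{id: 2, leaderCount: 8},
		{id: 3, leaderCount: 9},
	}
	// Store 2 already has three leaders on the way, so with influence
	// applied it is no longer the lowest-score target.
	influence := map[uint64]int64{2: 3}
	sortByScore(stores, leaderScoreComparer(influence))
	fmt.Println(stores[0].id) // prints 3
}
```

With a zero delta, store 2 would keep winning every comparison until its reported leader count catches up, which is the overshoot described above.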
Signed-off-by: lhy1024 <[email protected]>
return NewBuilder(desc, ci, region, SkipOriginJointStateCheck).
	SetLeader(targetStoreID).
	SetLeaders(targetStoreIDs).
Why remove it?
We do not need targetStoreIDs, which was used in evict_leader.go before.
It's an optimization. Why don't we need it?
SetLeaders checks for learners and unhealthy peers. https://github.com/tikv/pd/blob/master/pkg%2Fschedule%2Foperator%2Fbuilder.go#L301-L301
SetLeader also checks these. https://github.com/tikv/pd/blob/master/pkg%2Fschedule%2Foperator%2Fbuilder.go#L284-L284
SetLeaders sorts target stores by store ID. This PR sorts target stores by score.
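The difference can be sketched as follows: order the candidates by score rather than by store ID, then take the first one that is neither a learner nor unhealthy. This is only an illustration of the idea; the `candidate` type and `pickTarget` function are hypothetical names, not PD's builder API.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// candidate is a hypothetical stand-in for a possible target store
// holding a peer of the region.
type candidate struct {
	storeID   uint64
	score     float64
	isLearner bool
	unhealthy bool
}

// pickTarget sorts a copy of the candidates by score (lowest first) and
// returns the first one that passes the learner and health checks.
func pickTarget(cands []candidate) (uint64, error) {
	sorted := append([]candidate(nil), cands...) // do not mutate the input
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].score < sorted[j].score
	})
	for _, c := range sorted {
		if c.isLearner || c.unhealthy {
			continue // a learner or unhealthy peer cannot become leader
		}
		return c.storeID, nil
	}
	return 0, errors.New("no valid target store")
}

func main() {
	cands := []candidate{
		{storeID: 1, score: 12},
		{storeID: 2, score: 5, isLearner: true},
		{storeID: 3, score: 7},
	}
	target, err := pickTarget(cands)
	fmt.Println(target, err) // prints "3 <nil>"
}
```

Sorting by store ID would pick store 1 here, although store 3 has the lower score among the valid targets.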
targetLeaderStoreIDs was used previously, but is removed by this PR?
Yes, we previously used it in the evict leader scheduler to select targets. This PR replaces it.
We allowed multiple targets in the op step before. This PR changes that, which might be slower?
It was only used in evict leader, and after this PR no other scheduler uses it, so I removed it.
If we need it in the future, I will restore it.
It was O(n) previously and it is O(n log n) now.
Considering that the number of stores to sort is usually two (three replicas) or four (five replicas), the difference is small.
See tikv/tikv#10602
As @rleungx said, tikv/tikv#11063 lets TiKV pick the fastest store to transfer the leader to, which may leave leaders unbalanced when some store is slow.
What problem does this PR solve?
Issue Number: Close #8895
What is changed and how does it work?
Check List
Tests
Release note