Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Kubernetes interface to etcd client #16333

Merged
merged 1 commit into from
Jul 23, 2024

Conversation

serathius
Copy link
Member

@serathius serathius commented Jul 31, 2023

@serathius serathius marked this pull request as draft July 31, 2023 09:11
@gritukan
Copy link

Hello!

I'm not a developer of etcd or Kubernetes, asking just out of curiosity. From what I can see in the API you've provided, it appears that Kubernates never modifies multiple keys within a transaction. Is it true?

@logicalhan
Copy link

Hello!

I'm not a developer of etcd or Kubernetes, asking just out of curiosity. From what I can see in the API you've provided, it appears that Kubernates never modifies multiple keys within a transaction. Is it true?

It's not supposed to, it breaks the watch protocol. See this doc for more details.

@gritukan
Copy link

It's not supposed to, it breaks the watch protocol. See this doc for more details.

I see, thank you!

I'm a little bit confused about it, since I've read multiple times that etcd is a bottleneck for Kubernates scalability. If there are no multi-keys transactions why isn't is possible to just partition all keys by hash for example, run multiple instances of etcd with each instance handling requests for a given subset of keys and get scalable Kubernates?

@serathius
Copy link
Member Author

I'm a little bit confused about it, since I've read multiple times that etcd is a bottleneck for Kubernates scalability. If there are no multi-keys transactions why isn't is possible to just partition all keys by hash for example, run multiple instances of etcd with each instance handling requests for a given subset of keys and get scalable Kubernates?

It's not, K8s scalability limits are mostly hit by badly written operators that do not use proper API machinery. People are sharding etcd under Kubernetes (for example events are frequently separated due their different nature event vs state).

Note; This is an PR and not a discussion about the K8s API design. Please leave any follow up questions on this document. Any further questions will be removed to avoid cluttering the PR comments

Copy link

stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@serathius
Copy link
Member Author

Doing a second iteration based on kubernetes/kubernetes#125258

ping @ahrtr for feedback

@serathius
Copy link
Member Author

/hold

@serathius serathius force-pushed the kubernetes branch 3 times, most recently from 8d42d04 to 8fa70e1 Compare July 5, 2024 10:25
@serathius
Copy link
Member Author

/retest

@serathius
Copy link
Member Author

Also please add an unit test file (kubernetes_test.go), which can be done in a followup PR if you want.

I'm testing this PR by running K8s tests on it in kubernetes/kubernetes#125258. Would prefer to defer unit test to separate PR to ensure we are compatible with K8s first.

return resp, nil
}

func (k Client) optimisticTxn(ctx context.Context, key string, expectRevision int64, onSuccess, onFailure *pb.RequestOp) (*pb.TxnResponse, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (k Client) optimisticTxn(ctx context.Context, key string, expectRevision int64, onSuccess, onFailure *pb.RequestOp) (*pb.TxnResponse, error) {
func (k Client) optimisticTxn(ctx context.Context, key string, expectedRevision int64, onSuccess, onFailure *pb.RequestOp) (*pb.TxnResponse, error) {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a minor comment.

cc @fuweid @ivanvc and @jmhbnz to take a look as well.

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ivanvc ivanvc removed the github_actions Pull requests that update GitHub Actions code label Jul 12, 2024
@serathius
Copy link
Member Author

Would like to get review from SIG-api-machinery too.
cc @jpbetz @wojtek-t @logicalhan

client/v3/kubernetes/interface.go Outdated Show resolved Hide resolved
client/v3/kubernetes/interface.go Show resolved Hide resolved
client/v3/kubernetes/interface.go Outdated Show resolved Hide resolved
client/v3/kubernetes/client.go Show resolved Hide resolved
client/v3/kubernetes/client.go Outdated Show resolved Hide resolved
client/v3/kubernetes/interface.go Outdated Show resolved Hide resolved
//
// If opts.GetOnFailure is true, the modified key-value pair will be returned if the put operation fails due to a revision mismatch.
// If opts.LeaseID is provided, it overrides the lease associated with the key. If not provided, the existing lease is cleared.
OptimisticPut(ctx context.Context, key string, value []byte, expectedRevision int64, opts PutOptions) (PutResponse, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential alternative designs:

  • Just use Put for this, but accept expectedRevision as an optional option in Put, maybe in a precondition section.
  • Put expectedRevision in an OptimisticPutOptions here. (OptimisticPutOptions could embed PutOptions and extend it...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubernetes doesn't do Put without condition, and don't think it ever will. Don't understand why we would separate it.
Considered having expectedRevision in options, however I consider options, being optional while expectedRevision needs to be always provided because even zero has a semantic meaning.

client/v3/kubernetes/interface.go Show resolved Hide resolved
client/v3/kubernetes/interface.go Show resolved Hide resolved
client/v3/kubernetes/interface.go Outdated Show resolved Hide resolved
client/v3/kubernetes/interface.go Outdated Show resolved Hide resolved
@serathius serathius force-pushed the kubernetes branch 2 times, most recently from ad516c3 to a26d984 Compare July 13, 2024 18:41
@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 21.33333% with 118 lines in your changes missing coverage. Please review.

Project coverage is 68.71%. Comparing base (c9fdc60) to head (cd3536e).
Report is 19 commits behind head on main.

Current head cd3536e differs from pull request most recent head a26d984

Please upload reports for the commit a26d984 to get more accurate results.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
Files Coverage Δ
client/v3/client.go 84.89% <100.00%> (ø)
client/v3/kv.go 94.44% <100.00%> (ø)
client/v3/txn.go 100.00% <100.00%> (ø)
client/v3/watch.go 93.83% <100.00%> (ø)
client/v3/cluster.go 96.00% <60.00%> (ø)
client/v3/auth.go 68.49% <66.66%> (ø)
client/v3/lease.go 90.87% <40.00%> (ø)
client/v3/maintenance.go 58.55% <13.33%> (ø)
client/v3/kubernetes/client.go 0.00% <0.00%> (ø)

... and 409 files with indirect coverage changes

@@            Coverage Diff            @@
##           main   #16333       +/-   ##
=========================================
+ Coverage      0   68.71%   +68.71%     
=========================================
  Files         0      418      +418     
  Lines         0    35423    +35423     
=========================================
+ Hits          0    24341    +24341     
- Misses        0     9665     +9665     
- Partials      0     1417     +1417     

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c9fdc60...a26d984. Read the comment docs.

@serathius
Copy link
Member Author

serathius commented Jul 23, 2024

Fixed the comments, please take another look @jpbetz @deads2k @wojtek-t

@serathius
Copy link
Member Author

/retest

@serathius
Copy link
Member Author

/retest

@serathius serathius merged commit 24ff469 into etcd-io:main Jul 23, 2024
43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

10 participants