/members peer handler is not guarded with linearizable check #16687
Comments
/cc @ahrtr
Quick question: why do we have a non-gRPC endpoint for listing members? Can someone do some archaeology? To me it looks like a legacy of the v2 API.
From what I can tell, it is used in runtime reconfiguration (adding a new member); see etcd/server/etcdserver/bootstrap.go, line 289 in 0f4d7a7.
When the new member process starts, it needs to verify that the existing cluster ID matches the local
This looks like code used to bootstrap a member joining an existing cluster, for example when increasing the cluster size. From a brief look, the call to members is used to get the mapping between member names and IDs. I would need to double-check how the fetching of those IDs is later synchronized with raft, but I suspect that getting the freshest data might not be necessary as long as raft is properly initialized and replayed. As this is an internal endpoint, our only worry should be whether stale data here could impact member bootstrap. We should be very careful when adding a linearizability requirement here. It could bring more harm than benefit. First I would want to see that we are able to find and reproduce an issue with etcd member bootstrap that could be caused by stale data returned on
Agreed. Here is a historical issue I opened, #14174, with a reproduction. The conclusion is in #14174 (comment).
Bug report criteria
What happened?
Restart the etcd server right after a member reconfiguration and query the member list via HTTP /members: the handler bypasses the linearizable check. The member list in the response can be stale during bootstrap (restart), when the members are restored from the v2 store and the WAL is still replaying...
What did you expect to happen?
When a peer certificate is not used, /members should be protected by a linearizable check.
How can we reproduce it (as minimally and precisely as possible)?
cd integration && go test -v -run TestReproduceMemberNameMismatch
Anything else we need to know?
Relevant to discussion #16666 (comment)
Etcd version (please run commands below)
Etcd configuration (command line flags or environment variables)
paste your configuration here
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
Relevant log output
No response