Suggestion: reject configurations that don't match rank 0 #6391

Open
kkier opened this issue Oct 25, 2024 · 5 comments

Comments

@kkier
Contributor

kkier commented Oct 25, 2024

Reference #6389

My understanding is that there's no situation where a mismatched configuration would be desirable. If that's true, it'd be useful to outright reject connections from nodes with a mismatched configuration, preferably with an error message indicating which setting doesn't match. This would be similar to the way version mismatches cause a rejection and no further processing, just to prevent downstream issues.
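As a rough illustration of the proposal, here is a minimal Python sketch of a connect-time check that names the settings that differ. It assumes both configurations are already available as nested dictionaries; `find_mismatches` and `check_connection` are hypothetical names for this sketch, not part of Flux, and the keys in the usage note are placeholders.

```python
# Hypothetical sketch: not Flux's API. Configs are assumed to be nested dicts.
_MISSING = object()  # distinguishes "key absent" from "key set to None"


def find_mismatches(upstream, local, prefix=""):
    """Return dotted key paths whose values differ between two nested dicts."""
    mismatched = []
    for key in sorted(set(upstream) | set(local)):
        path = f"{prefix}{key}"
        a = upstream.get(key, _MISSING)
        b = local.get(key, _MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse into tables so the report names the exact setting.
            mismatched.extend(find_mismatches(a, b, prefix=f"{path}."))
        elif a != b:
            mismatched.append(path)
    return mismatched


def check_connection(upstream, local):
    """Raise ValueError naming every mismatched setting; return None on a match."""
    mismatched = find_mismatches(upstream, local)
    if mismatched:
        raise ValueError("configuration mismatch: " + ", ".join(mismatched))
```

With placeholder keys, `check_connection({"a": {"x": 1}}, {"a": {"x": 2}})` would raise an error that mentions `a.x`, which is the kind of message the suggestion asks for.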

@garlick
Member

garlick commented Oct 25, 2024

Hmm, that's definitely possible to check at connect time, although I have two concerns:

1. Most configuration is _not_ sensitive to being different. In fact, a lot of it only applies to rank 0.

2. Some config can be updated on the fly. How could we ensure an updated config matches upstream without making it really awkward to push out an update?

@wihobbs
Member

wihobbs commented Oct 25, 2024

> My understanding is that there's no situation where a mismatched configuration would be desirable.

At connection time, this is probably correct. However, for a running instance with a running connection, we use "mismatched" configurations to test prologs/epilogs (this is somewhat related to #5531). Start a job across a bunch of nodes, then mess with the imp to test out a new prolog/epilog. Just something to keep in mind.

@kkier
Contributor Author

kkier commented Oct 25, 2024

> Hmm, that's definitely possible to check at connect time, although two concerns:
>
> 1. Most configuration is _not_ sensitive to being different. In fact a lot of it only applies to rank 0
>
> 2. Some config can be updated on the fly. How could we ensure an updated config matches upstream without making it really awkward to push out an update?

Re: 1 - Are there settings that aren't sensitive to being different and that you'd actually want to differ, beyond test cases? It definitely adds some complexity if we'd have to maintain a list of what does and doesn't matter.
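If such a list did turn out to be necessary, one possible shape for it is an explicit set of dotted keys that must agree, with everything else ignored. The sketch below is only an assumption about how that could look: `MUST_MATCH`, `get_path`, and `mismatched_required_keys` are hypothetical names, and the keys are placeholders rather than real settings.

```python
# Hypothetical sketch: compare only the settings on an explicit "must match" list.
MUST_MATCH = ("section_a.setting_x", "section_b.setting_y")  # placeholder keys

_MISSING = object()  # marks a key that is absent from a config


def get_path(config, dotted_key, default=None):
    """Look up a dotted key such as 'section_a.setting_x' in a nested dict."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


def mismatched_required_keys(upstream, local, required=MUST_MATCH):
    """Return the required keys whose values differ; other keys are not checked."""
    return [
        key
        for key in required
        if get_path(upstream, key, _MISSING) != get_path(local, key, _MISSING)
    ]
```

Settings that are not on the list can then differ freely, at the cost of having to keep the list itself up to date.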

> > My understanding is that there's no situation where a mismatched configuration would be desirable.
>
> At connection time, this is probably correct. However, for a running instance with a running connection, we use "mismatched" configurations to test prologs/epilogs (this is somewhat related to #5531). Start a job across a bunch of nodes, then mess with the imp to test out a new prolog/epilog. Just something to keep in mind.

Oh, yes, I'm just thinking in terms of connection time. Constantly checking the config seems very out of scope.

@wihobbs
Member

wihobbs commented Oct 25, 2024

Sorry I misunderstood! Just wanted to throw that sort of niche edge case in there.

@kkier
Contributor Author

kkier commented Oct 25, 2024

> Sorry I misunderstood! Just wanted to throw that sort of niche edge case in there.

It's a good point: taken as written, you can totally read it as "if the config changes, drop the node", which is also a much more complicated issue.
