Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[k8s] Allow per-context identity configuration + in-cluster auth fixes #4136
base: master
Are you sure you want to change the base?
[k8s] Allow per-context identity configuration + in-cluster auth fixes #4136
Changes from 2 commits
658fc8e
52880a2
12278e8
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid the stacktrace with
ux_utils
?Will this work with failover, e.g., if an invalid
remote_identity
is specified, will it cause SkyPilot failing to failover to other clouds?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we enforce any context used to exist in the dict, should we add this to precheck for config, e.g. in the
schema.py
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah good catch, added
ux_utils
to hide stack trace.Currently this doesn't work with failover, but maybe that's desirable? This failure is not a failure of the cloud, but rather configuration failure on the user's end, so maybe good to surface this instead of silently failing over? lmk if you think otherwise, I can figure out a way to move this further downstream in the code so it becomes a part of failover.
That might be tricky since
schema.py
only supports simple json schema validation. We can consider adding toskypilot_config.py::_try_load_config()
; noting that might require importing k8s module dynamically if this field is specified.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not super sure of using this env var for passing around the context just for looking up the context in the remote_identity dict (correct me if I am wrong).
If this is mainly for looking up a remote identity for jobs/serve controller, it might be better to just replace the key in
remote_identity
toin-cluster
instead, like the one we did here:skypilot/sky/utils/controller_utils.py
Lines 600 to 626 in f2991b1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than looking up
remote_identity
, this envvar is also required to maintain parity with the region<->context mapping that we have created for SkyPilot.E.g., if a user defines a YAML like:
This YAML does not work on
sky jobs launch
on current master:With this PR, we can have region map to the right context consistently, even if it may be running in-cluster.
This mapping is also needed in #4188 to allow our api server to use the context name instead of
in-cluster
when running in a Kubernetes cluster.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this be overwrite by a user specifying
env
in their pod_config inconfig.yaml
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the dictionaries will be merged. E.g., with config.yaml like:
Jobs controller pod has: