Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: move notebook service code #301

Merged
merged 179 commits into from
Jul 16, 2024
Merged

feat: move notebook service code #301

merged 179 commits into from
Jul 16, 2024

Conversation

olevski
Copy link
Member

@olevski olevski commented Jul 11, 2024

This PR aims to move (but not enable) the notebooks API from the notebook repo into here.

Enabling the handlers will be done in subsequent PRs.

Please look at: https://drive.google.com/file/d/1w40wZfFeG1H0MBjVp8wdsuw1z0gvWP-n/view?usp=sharing for a good diff of the move.

/deploy

rokroskar and others added 30 commits June 3, 2019 17:15
* refactor: minor formatting and unnecessary code removal

* feat: include notebook username in pod labels

* refactor: minor

* feat: use k8s API to get user Jupyter servers

* refactor: remove ununsed code

* fix: minor fix in README.rst

* feat: return server data as needed by UI

Co-authored-by: rokroskar <[email protected]>

* build: make tests pass

* fix: apply review comments
…ves (#190)

* fix: use shortened commit-sha to check for an autosave match

* fix: use commit-sha of remote branch to create autosave branch name

* fix: avoid a None dereferencing

* fix: use both local and remote commit-sha for autosave
fix: update telepresence script and add vs code debugger
* feat: handle k8s api errors when querying for logs (#216)
feat: return json data from server_logs and expose lines limit

fix #224
- Enable the use of tmpauthenticator for Jupyterhub authentication.
  This enables notebook sessions for logged-out users. Closes #171

- Enable logged in users without write access to a project to start
  notebook servers from the project. Closes #275

- Add a helper script which creates a helm values file for a secondary
  notebooks service deployment from the main Renku values file (some
  amount of guessing involved).
Add cronjob that periodically deletes users image pull secrets along with unit tests
* fix: cannot create user env due to invalid k8s secret labels
* fix: properly determine user server pod status with multiple containers

Co-authored-by: Andreas Bleuler <[email protected]>
* chore: simplify tests

* chore: make JH start once per session

* feat: add schemas and swagger

* feat: add swagger descriptions and security schema

* chore: add docstrings to schema classes

* fix: return server details when pending and spawning to prevent UI content flicker
* fix: add missing fields in renku-notebooks schemas

* chore: update tests

* fix: simplify definition of UserPodResources schema
@olevski olevski requested a review from a team as a code owner July 11, 2024 12:26
@Panaetius
Copy link
Member

Thank you for doing all this tedious work.
Review:
components/renku_data_services/notebooks/api/amalthea_patches/cloudstorage.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/general.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_proxy.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_sidecar.pye L10: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L18: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L217: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/inject_certificates.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L16: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L110: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L182: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L238: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/auth.py: not clear from the diff if this file is empty or deleted, should be deleted
components/renku_data_services/notebooks/api/classes/data_service.py L3: we should not use logging directly but import from sanic.log logging
components/renku_data_services/notebooks/api/classes/data_service.py: I guess you'll do that in a followup PR but all the http calls in this file are superfluous and can be replaced with direct calls
components/renku_data_services/notebooks/api/classes/data_service.py L118: Question: It's a bit odd to me that CRCVAlidator has defaults for properties of ServerOptions, shouldn't the defaults be defirned in the latter?
components/renku_data_services/notebooks/api/classes/server.py L3: use sanic.log logging
components/renku_data_services/notebooks/api/classes/server.py L188: remove quotes in middle of string
components/renku_data_services/notebooks/api/classes/user.py L5: use sanic.log
components/renku_data_services/notebooks/api/classes/user.py L38: we should remove this property is we can use isinstance(...,AnonymousUser) instead
components/renku_data_services/notebooks/api/notebooks.py L4: use sanic.log
components/renku_data_services/notebooks/api/notebooks.py L131: shouldn't we model_validate the body?
components/renku_data_services/notebooks/api/notebooks.py L165:rm docstring: don't both methods launch using the old operator, but one launches 1.0 sessions and the other 2.0 sessions?
components/renku_data_services/notebooks/api/notebooks.py L677: the error message could be more descriptive, if I saw this in a log, I wouldn't know what the cause was.
components/renku_data_services/notebooks/api/notebooks.py L683 and 685: isn't this what empty is for?
components/renku_data_services/notebooks/api/schemas/errors.py L91: remove commented code?
components/renku_data_services/notebooks/api/schemas/servers_get.py L145: exit_code can never be int
components/renku_data_services/notebooks/config/init.py: TODO for later, move out of init.py
components/renku_data_services/notebooks/util/retries.py L 19 and others: Is Concatenate needed with a single param?
components/renku_data_services/notebooks/util/authn.py: We need to refactor this code. we create a RegisteredUser no matter what, check if authenticated is True (which it is if all required headers are present, but we don't actually check the content of the headers in the RegisteredUser init) and if not create an AnonymousUser. This seems pretty brittle and is a weird way for this code to be written. and if I read it correctly, I could put whatever I want in those headers and it would consider it a registered user?

does this pass ruff checks? I saw several places that still use Optional[] instead of x | None. It should fail pyupdate checks I think.
for some files like auth.py, health.py it's not clear from the html diff if they're empty or removed. I assume removed? if not, they should be

@olevski
Copy link
Member Author

olevski commented Jul 15, 2024

@Panaetius see my responses:

components/renku_data_services/notebooks/api/amalthea_patches/cloudstorage.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/general.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_proxy.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_sidecar.pye L10: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L18: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L217: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/inject_certificates.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L16: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L110: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L182: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L238: UserServer does not need to be quoted

Yes it has to be and I have to add the stuff I removed with if TYPE_CHECKING: import ... because without these then we get circular import errors. I will add comments about this.

components/renku_data_services/notebooks/api/classes/data_service.py: I guess you'll do that in a followup PR but all the http calls in this file are superfluous and can be replaced with direct calls

Yes I dont just want to get this merged and do these large changes.

components/renku_data_services/notebooks/api/amalthea_patches/cloudstorage.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/general.py L9: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_proxy.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/git_sidecar.pye L10: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L18: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/init_containers.py L217: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/inject_certificates.py L11: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L16: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L110: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L182: UserServer does not need to be quoted
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L238: UserServer does not need to be quoted

Yes it has to be and I have to add the stuff I removed with if TYPE_CHECKING: import ... because without these then we get circular import errors.

components/renku_data_services/notebooks/api/auth.py: not clear from the diff if this file is empty or deleted, should be deleted

It is deleted

components/renku_data_services/notebooks/api/classes/data_service.py L3: we should not use logging directly but import from sanic.log logging

Done

components/renku_data_services/notebooks/api/classes/data_service.py: I guess you'll do that in a followup PR but all the http calls in this file are superfluous and can be replaced with direct calls

Yes I want to do this a followup PR.

components/renku_data_services/notebooks/api/classes/data_service.py L118: Question: It's a bit odd to me that CRCVAlidator has defaults for properties of ServerOptions, shouldn't the defaults be defirned in the latter?

Done

components/renku_data_services/notebooks/api/classes/server.py L3: use sanic.log logging

Done

components/renku_data_services/notebooks/api/classes/server.py L188: remove quotes in middle of string

Done

components/renku_data_services/notebooks/api/classes/user.py L5: use sanic.log

Done

components/renku_data_services/notebooks/api/classes/user.py L38: we should remove this property is we can use isinstance(...,AnonymousUser) instead

Done

components/renku_data_services/notebooks/api/notebooks.py L4: use sanic.log

Done

components/renku_data_services/notebooks/api/notebooks.py L131: shouldn't we model_validate the body?

No need, we already use @validate(json=apispec.LaunchNotebookRequest) on handler which will do that for the request body.

components/renku_data_services/notebooks/api/notebooks.py L165:rm docstring: don't both methods launch using the old operator, but one launches 1.0 sessions and the other 2.0 sessions?

Yes, changed it.

components/renku_data_services/notebooks/api/notebooks.py L677: the error message could be more descriptive, if I saw this in a log, I wouldn't know what the cause was.

Agreed I will leave this as a TODO in a followup PR beacuse that section has the weird change to the code where I indented stuff for a better diff and I want to undo that before merging. I can also probably do this after I undo the changes for indentation for a better diff.

components/renku_data_services/notebooks/api/notebooks.py L683 and 685: isn't this what empty is for?

Is that a function from sanic? Sure I can use that. Same as above I will do it in a follow up.

components/renku_data_services/notebooks/api/schemas/errors.py L91: remove commented code?

Yes. done.

components/renku_data_services/notebooks/api/schemas/servers_get.py L145: exit_code can never be int

That was mean to be exit_code = str(exit_code_raw) if not isinstance(exit_code_raw, int) else exit_code_raw . And in this case exit_code can be int or string. I added that ternary if statement.

components/renku_data_services/notebooks/config/init.py: TODO for later, move out of init.py

Agreed

components/renku_data_services/notebooks/util/retries.py L 19 and others: Is Concatenate needed with a single param?

Mypy was not happy when I removed Concatenate even though it is just a single argument. But it probably wants that because it is not a regular argument but a ParamSpec.

components/renku_data_services/notebooks/util/authn.py: We need to refactor this code. we create a RegisteredUser no matter what, check if authenticated is True (which it is if all required headers are present, but we don't actually check the content of the headers in the RegisteredUser init) and if not create an AnonymousUser. This seems pretty brittle and is a weird way for this code to be written. and if I read it correctly, I could put whatever I want in those headers and it would consider it a registered user?

So the tokens are checked but the signatures are not verified. And we should be verifying the signatures ,just to be safe. Currently from outside of the cluster there is no way to get to the notebook service. And the gateway will always inject verified things in the headers. Either we should clean this up but in the end I think we simply will not use this at all and just fall back to the existing authenticator used everywhere in the data services. So this code will be fully purged. Either way this is a TODO.

does this pass ruff checks? I saw several places that still use Optional[] instead of x | None. It should fail pyupdate checks I think.

It should. I will remove the indentation changes and run the linting.

for some files like auth.py, health.py it's not clear from the html diff if they're empty or removed. I assume removed? if not, they should be

renku_notebooks/api/auth.py and renku_notebooks/api/health.py are both gone - just double checked.

@olevski
Copy link
Member Author

olevski commented Jul 15, 2024

Here is an issue with all followup work: #303

@olevski olevski force-pushed the move-notebook-service branch from 4c1247f to c60d183 Compare July 15, 2024 14:27
@olevski olevski force-pushed the move-notebook-service branch from 9ed7b22 to 129d8b3 Compare July 15, 2024 14:58
@olevski olevski temporarily deployed to renku-ci-ds-301 July 15, 2024 14:59 — with GitHub Actions Inactive
@RenkuBot
Copy link
Contributor

You can access the deployment of this PR at https://renku-ci-ds-301.dev.renku.ch

@olevski olevski temporarily deployed to renku-ci-ds-301 July 15, 2024 15:51 — with GitHub Actions Inactive
@coveralls
Copy link

Pull Request Test Coverage Report for Build 9942973186

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 5 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.03%) to 90.359%

Files with Coverage Reduction New Missed Lines %
components/renku_data_services/base_api/misc.py 1 97.62%
components/renku_data_services/crc/db.py 2 87.98%
components/renku_data_services/storage/blueprints.py 2 94.65%
Totals Coverage Status
Change from base Build 9937114559: 0.03%
Covered Lines: 8866
Relevant Lines: 9812

💛 - Coveralls

@Panaetius
Copy link
Member

@Panaetius see my responses:

components/renku_data_services/notebooks/api/amalthea_patches/cloudstorage.py L9: UserServer does not need to be quoted
[...]
components/renku_data_services/notebooks/api/amalthea_patches/jupyter_server.py L238: UserServer does not need to be quoted

Yes it has to be and I have to add the stuff I removed with if TYPE_CHECKING: import ... because without these then we get circular import errors. I will add comments about this.

I didn't know that quoting usage but not the import prevents circular import. In any case, we should tidy this up so we don't have circular imports, like with injecting dependencies instead of importing directly in UserServer. Anyways, todo for later

components/renku_data_services/notebooks/api/notebooks.py L683 and 685: isn't this what empty is for?

Is that a function from sanic? Sure I can use that. Same as above I will do it in a follow up.

Yes it's a sanic function

from sanic import empty
[...]
    return empty(204)

for example

components/renku_data_services/notebooks/util/retries.py L 19 and others: Is Concatenate needed with a single param?

Mypy was not happy when I removed Concatenate even though it is just a single argument. But it probably wants that because it is not a regular argument but a ParamSpec.

#justmypythings I guess 🤷

@olevski olevski merged commit 6f3b3c0 into main Jul 16, 2024
15 checks passed
@olevski olevski deleted the move-notebook-service branch July 16, 2024 07:37
@RenkuBot
Copy link
Contributor

Tearing down the temporary RenkuLab deplyoment for this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants