Sort envs returned by REST API by current build's scheduled_on time #881

Draft: peytondmurray wants to merge 3 commits into main from 859-sorted-pagination

peytondmurray
Contributor

Fixes #859.

Description

This PR sorts environments returned when a GET is sent to /environment/ by the time the environments' current builds were submitted. In systems where users are creating environments while environments are being queried, this ensures that all results are returned; other sorting methods (e.g. by name) can yield incomplete results.

Pull request checklist

  • Did you test this change locally?
  • Did you update the documentation (if required)?
  • Did you add/update relevant tests for this change (if required)?


netlify bot commented Sep 17, 2024

Deploy Preview for conda-store canceled.

Name Link
🔨 Latest commit 9122e61
🔍 Latest deploy log https://app.netlify.com/sites/conda-store/deploys/678abf0fcb6ba70008c94e05

@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch 2 times, most recently from 7a9b540 to 0d57b2b Compare September 26, 2024 21:51
@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch from 871e0e3 to 0442238 Compare October 3, 2024 21:10
@peytondmurray peytondmurray changed the title [WIP] Sort envs returned by REST API by current build's scheduled_on time Sort envs returned by REST API by current build's scheduled_on time Oct 3, 2024
@peytondmurray peytondmurray marked this pull request as ready for review October 17, 2024 03:11
@peytondmurray peytondmurray requested a review from soapy1 October 17, 2024 03:11
@peytondmurray peytondmurray marked this pull request as draft October 17, 2024 18:36
@peytondmurray
Contributor Author

After a discussion, it seems like no matter how we solve this there's going to be a breaking change to the API. Cursor-based pagination will allow for any sorting order, so I'll look into implementing that, probably through fastapi-pagination.

@peytondmurray
Contributor Author

peytondmurray commented Oct 18, 2024

Adding pagination via fastapi-pagination seems simple on the surface, but the project doesn't publish an API reference in its docs, so without reading the source I'm not sure how it affects which query parameters our API accepts. While trying to implement this, I started working on adding hot reloading. Because we use traitlets to load the server configuration (and instantiate the FastAPI app), and uvicorn has specific requirements about how the FastAPI app is instantiated when hot reloading, it's trickier to get working than I thought. See #901 for details.

@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch 2 times, most recently from 9e55051 to 847eed4 Compare November 1, 2024 21:28
@peytondmurray
Contributor Author

Hot reloading's done, but the fastapi_pagination docs aren't enough to get this working easily, and in the time it would take to actually read through the source for the project I think we can just implement a simple cursor-based paginator instead :/
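
For illustration, a simple cursor for this kind of pagination might look like the sketch below. It assumes the cursor only needs to record the id and sort-key values of the last row returned, and that clients receive it as an opaque base64-encoded JSON string. The field names (`last_id`, `last_value`) and the `dump` method are hypothetical and not taken from this PR; only the `Cursor.load` name appears in the diff.

```python
import base64
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cursor:
    """Hypothetical cursor: id and sort-key values of the last row returned."""

    last_id: int = 0
    last_value: dict = field(default_factory=dict)

    def dump(self) -> str:
        """Serialize the cursor to a URL-safe, opaque string."""
        payload = json.dumps({"last_id": self.last_id, "last_value": self.last_value})
        return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")

    @classmethod
    def load(cls, data: Optional[str] = None) -> "Cursor":
        """Deserialize a cursor; a missing cursor means 'start from the beginning'."""
        if data is None:
            return cls()
        try:
            decoded = json.loads(base64.urlsafe_b64decode(data.encode("ascii")))
            return cls(last_id=decoded["last_id"], last_value=decoded["last_value"])
        except (ValueError, KeyError, TypeError) as exc:
            # Malformed base64, malformed JSON, or missing fields all end up here.
            raise ValueError(f"Invalid cursor: {data!r}") from exc
```

Keeping the cursor opaque means the server can change what it stores in it later without changing the query parameters clients send.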

@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch 2 times, most recently from 7b5a0bc to dd5a139 Compare November 6, 2024 21:59
@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch 4 times, most recently from b673196 to 54f7907 Compare January 3, 2025 23:53
@peytondmurray peytondmurray force-pushed the 859-sorted-pagination branch 2 times, most recently from 664d122 to 63db9d0 Compare January 16, 2025 21:56
@peytondmurray
Contributor Author

Not sure this is actually ready to be merged: it is not currently compatible with the frontend, and we need a discussion about REST API versioning before this approach can be merged. Otherwise, tests are passing and I think the implementation is mostly complete :) @soapy1

Contributor

@soapy1 soapy1 left a comment

Looking good! Thank you for adding such beautifully formatted and thorough docs to a lot of these functions 🤩

Just a few things to sort out.

I took notes on what I tested manually in this gist https://gist.github.com/soapy1/beae935f78725cfd1dd0897c1ce016d4

"name": orm.Environment.name,
},
default_sort_by=["namespace", "name"],
paginated, next_cursor, count = paginate(

Contributor

It looks like responses for this endpoint now include current_build. Previously, this endpoint didn't include that field in the output.

I think this is an OK change, but I'm calling it out in case it was not intentional.

Contributor Author

The current_build field is an optional field in the Environment model, so originally I thought there's no harm in allowing it to be returned here unless there's some specific reason to strip it out. Do you think this was intended to make the response smaller?

Anyway, the current_build_id is kept here, and FastAPI has a way of excluding fields so I've just made use of that to capture the old behavior.

Contributor

> Do you think this was intended to make the response smaller?

could be?

columns = []
if order_by:
    for order_name in order_by:
        idx = self.order_names.index(order_name)

Contributor

It would be good to have some error handling for when users provide bad sort_by arguments (or to validate the arguments earlier in the call).

Currently, if you provide a bad sort_by argument it causes an internal server error:

$ curl http://localhost:8080/conda-store/api/v1/environment/\?limit\=100\&sort_by\=oops
Internal Server Error

Contributor Author

Ahh, thanks for this - I've added error handling and a test for bad query parameters!
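
As a rough sketch of the kind of validation discussed here (not the code added in this PR), the requested sort keys can be checked against the allowed names before any column lookup happens. The function and exception names below are hypothetical; the allowed names `namespace` and `name` are taken from the diff context above. In a FastAPI endpoint the raised error would presumably be translated into a 4xx response, for example by raising `HTTPException` with `status_code=400`.

```python
from typing import Iterable, List, Optional


class InvalidSortByError(ValueError):
    """Hypothetical error raised when a sort_by value is not an allowed name."""


# Sort keys accepted by the endpoint in this sketch.
ALLOWED_SORT_BY = {"namespace", "name"}
DEFAULT_SORT_BY = ["namespace", "name"]


def validate_sort_by(
    sort_by: Optional[Iterable[str]],
    allowed: Iterable[str] = ALLOWED_SORT_BY,
    default: Iterable[str] = DEFAULT_SORT_BY,
) -> List[str]:
    """Return the sort keys to use, or raise if any requested key is unknown."""
    requested = list(sort_by) if sort_by else []
    if not requested:
        # No sort_by provided: fall back to the default ordering.
        return list(default)

    allowed_set = set(allowed)
    unknown = [name for name in requested if name not in allowed_set]
    if unknown:
        raise InvalidSortByError(
            f"Invalid sort_by value(s): {unknown}; expected one of {sorted(allowed_set)}"
        )
    return requested
```

Validating up front avoids relying on `list.index` raising a bare `ValueError` deep inside the paginator, which is what surfaces as an internal server error.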

from conda_store_server.conda_store import CondaStore
from conda_store_server.exception import CondaStoreError
from conda_store_server.server import schema as auth_schema
from conda_store_server.server.auth import Authentication
from conda_store_server.server.schema import AuthenticationToken, Permissions


def get_cursor(cursor: Optional[str] = None) -> Cursor:

Contributor

What's the motivation for separating the cursor from the rest of the pagination args?

Contributor Author

There's really no reason other than that cursor-based pagination takes a lot more code than limit/offset, so I've broken it off into a separate module to keep views/api.py mostly endpoints. If it makes more sense to keep this as part of views/api.py, I'm happy to do so.

return Cursor.load(cursor)


def get_cursor_paginated_args(

Contributor

It looks like most of these Depends-type functions live in the conda_store_server/_internal/server/dependencies module. I think it would be good to consolidate these into that module.

Contributor Author

Totally agree, thank you!

@soapy1
Contributor

soapy1 commented Jan 17, 2025

One more question (related to how we want to approach versioning for this change): how do we want to approach pagination for the rest of the API? Do we want all the other endpoints to adopt this type of pagination as well?

@peytondmurray
Contributor Author

I vote yes - as far as I can tell, the advantage of limit/offset is that it's easy to implement, but the cost is exactly what is described in #859. Cursor-based pagination addresses that.
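
To make the trade-off concrete, here is a small self-contained simulation (plain Python lists rather than the real database queries, so only a sketch). It assumes rows are sorted by a unique name and that another user deletes a row between two page requests. All function names are hypothetical. A real implementation would also need a tie-breaker such as the row id when sort keys are not unique.

```python
from typing import Callable, List, Optional


def fetch_all_offset(rows: List[str], limit: int, mutate: Callable[[], None]) -> List[str]:
    """Fetch every page with limit/offset; `mutate` runs between page requests."""
    seen: List[str] = []
    offset = 0
    while True:
        page = sorted(rows)[offset : offset + limit]
        if not page:
            return seen
        seen.extend(page)
        offset += limit
        mutate()  # Simulate a concurrent change to the table.


def fetch_all_cursor(rows: List[str], limit: int, mutate: Callable[[], None]) -> List[str]:
    """Fetch every page by asking for rows strictly after the last one seen."""
    seen: List[str] = []
    last: Optional[str] = None
    while True:
        page = [r for r in sorted(rows) if last is None or r > last][:limit]
        if not page:
            return seen
        seen.extend(page)
        last = page[-1]
        mutate()  # Simulate a concurrent change to the table.


def delete_once(rows: List[str], name: str) -> Callable[[], None]:
    """Build a callback that removes `name` from `rows` the first time it runs."""

    def mutate() -> None:
        if name in rows:
            rows.remove(name)

    return mutate
```

With rows `["a", "b", "c", "d"]`, a page size of 2, and `"a"` deleted after the first page, the offset version returns `["a", "b", "d"]` and never sees `"c"`, while the cursor version returns `["a", "b", "c", "d"]`.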

Labels
area: api 🌐, area: user experience 👩🏻‍💻, needs: review 👀, type: bug 🐛
Projects
Status: In Progress 🏗
Development

Successfully merging this pull request may close these issues.

[ENH] - Ensure completeness when fetching all pages using REST API
2 participants