Unclear on how to proceed with future migrations #1040

Open
stratosgear opened this issue Apr 17, 2024 · 13 comments
Labels
  • Issue appropriate for: Occasional contributors 😉 (This issue will be best tackled by people with a minimum of experience on the project)
  • Issue contains: Exploration & Design decisions 🧩 (We don't know how this will be implemented yet)
  • Issue contains: Some SQL 🐘 (This feature requires changing the SQL model)
  • Issue status: Blocked ⛔️ (Issue cannot be processed for now)

Comments

@stratosgear

I tried to understand how the project handles schema migrations but after reading all the documentation pages regarding migrations and browsing through related existing open/closed issues, I have not found a concrete explanation of how it works.... :(

In my use case I have introduced procrastinate in an existing code base (non Django based).

I have executed: procrastinate -a my.src.app schema --apply that correctly applied the procrastinate structures required. Procrastinate seems to be working fine.

My concern now is how I can stay current with any migrations that may come along in the future.

I was hoping I could keep executing procrastinate -a my.src.app schema --apply every time I update my Python dependencies and at project startup, and have it automatically pick up any future migrations, but I am not sure this will actually work.

Am I right in thinking that I somehow have to adopt any new procrastinate migration scripts as my own and find a way to apply them myself with my existing migration methodology (basically using Alembic)?

That would be really fragile and would require a lot of coordinated work to implement, test and maintain. It would increase the friction of adopting procrastinate too much! :(

Am I missing an obvious solution?

@ewjoachim
Member

I think you're right, except if you use Django.

When we worked on the migration system, we wanted to keep it minimal to avoid tying it to a specific system: there were multiple existing systems, and choosing one would likely have made things very difficult for people using another. Since migration systems usually come with their own way of tracking migrations, it would be complicated to add our own custom way of tracking what has been run.

The one main thing we commit to is that each release lists its migrations. Each migration script is written to be runnable as-is (if you need to modify it, it's probably a bug), and most migration systems accept migrations where you give the SQL code to run directly, so they should be compatible.
Also, we've made a dedicated Django integration that exposes procrastinate migrations as Django migrations.

But then you're right, nothing has been done yet to ease that part of the lib.

We could do the same with Alembic as we do with Django, I guess; it would probably cover most of what people use. I've never played with Alembic, so I'd love it if someone wanted to have a look.
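
To illustrate the "give the SQL code to run directly" route mentioned above, here is a minimal sketch of an Alembic revision file that runs one SQL migration script unchanged. It is not an official integration. SQL_DIR, MIGRATION_FILE, the revision identifiers and the read_migration helper are all hypothetical names chosen for this example, and it assumes the script can be passed to Alembic's op.execute in one go, which has not been verified against real Procrastinate migrations.

```python
from pathlib import Path

# Hypothetical values: point these at wherever your project keeps a copy of
# the SQL migration listed in the release notes you are upgrading to.
SQL_DIR = Path("migrations/procrastinate")
MIGRATION_FILE = "example_migration.sql"

# Alembic revision identifiers (placeholders for this sketch).
revision = "0001_procrastinate"
down_revision = None


def read_migration(sql_dir: Path, filename: str) -> str:
    """Return the SQL script unchanged, so that it is run as-is."""
    return (sql_dir / filename).read_text(encoding="utf-8")


def upgrade() -> None:
    # Imported lazily: op needs an active Alembic migration context to work.
    from alembic import op

    # Assumption: the whole script can be sent as a single statement string.
    op.execute(read_migration(SQL_DIR, MIGRATION_FILE))


def downgrade() -> None:
    # Assumption: no reverse script is available, so refuse to downgrade.
    raise NotImplementedError("no downgrade script is assumed to exist")
```

With this approach, Alembic's own version table tracks whether the script has run, so no extra bookkeeping is needed on the Procrastinate side. The cost is that one such revision has to be written by hand for every new migration.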

@stratosgear
Author

Although I have not done the full analysis, why can't Procrastinate handle its own migrations?

For example, Alembic keeps a table where it records the last migration script that was applied.

This process, or something similar, could be maintained internally by Procrastinate, rather than making the user responsible for an external dependency. Procrastinate already provides a schema manager, in the form of the original CLI command that applies the schema, so why not extend it a bit so that whenever it runs, it checks for any missing migrations and applies them, or otherwise exits gracefully, mentioning that everything is up to date?

I mean, maybe I'm oversimplifying, but Procrastinate already deals with much heavier concepts; auto-handling the migrations should be peanuts! :)

@ewjoachim
Member

We could, but... I don't like doing in one lib something that [I feel] might get quite complicated, and that is complex enough that I'd expect dedicated libs to exist to do it right.

I'm not saying we can't do it here, but if we did, we'd need:

  • A storage mechanism. It could be just the comments on the main table, or a dedicated table.
  • The code to read and write the version number
  • CLI args to:
    • Migrate to the latest version
    • Migrate to a specific version (with a check that we're not going backwards?)
    • Force-write a migration number, in case you're going to apply one migration yourself (e.g. if you want to modify it)
  • Associated tests & docs
  • Optionally a way to disable those migrations with Django, otherwise it's a footgun (if you type ./manage.py procrastinate migrate instead of ./manage.py migrate, you could get in trouble)

It's perfectly doable. But it's not trivial. I'm not sure most people want multiple migration systems to cooperate (potentially on the same database) and I'm pretty sure sys admins are not going to be happy when they need to run 2 different migration commands upon deployment. When possible, I really think that if you already have a migration system for your app, you'd rather have Procrastinate use that. At least, I'd want that.
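
For a sense of scale, the core of the list above (a dedicated table, reading and writing the version, migrating to the latest or to a specific version without going backwards, and force-writing a version) can be sketched in a few dozen lines. This is only an illustration under loud assumptions: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the MIGRATIONS list, the schema_version table and all function names are hypothetical. It ignores locking, failures halfway through a script, the CLI, tests, docs and the Django footgun, which is where much of the real work would be.

```python
import sqlite3

# Hypothetical ordered migrations as (version, SQL) pairs.
# Real ones would be PostgreSQL scripts shipped with the library.
MIGRATIONS = [
    (1, "CREATE TABLE jobs (id INTEGER PRIMARY KEY)"),
    (2, "ALTER TABLE jobs ADD COLUMN status TEXT"),
    (3, "CREATE INDEX jobs_status_idx ON jobs (status)"),
]


def current_version(conn):
    """Read the stored version, creating the tracking table on first use."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    return row[0] if row else 0


def write_version(conn, version):
    """Force-write the version, e.g. after applying a migration by hand."""
    current_version(conn)  # make sure the tracking table exists
    conn.execute("DELETE FROM schema_version")
    conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
    conn.commit()


def migrate(conn, target=None):
    """Apply pending migrations up to target (default: the latest one).

    Returns the list of versions that were applied.
    """
    current = current_version(conn)
    latest = MIGRATIONS[-1][0]
    target = latest if target is None else target
    if target < current:
        raise ValueError(f"refusing to go backwards: {current} -> {target}")
    applied = []
    for version, sql in MIGRATIONS:
        if current < version <= target:
            conn.execute(sql)
            write_version(conn, version)
            applied.append(version)
    return applied
```

Calling migrate(conn) on an up-to-date database applies nothing and returns an empty list, which is the "gracefully exit" behaviour asked for earlier in the thread.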

@stratosgear
Author

Well, I am sorry to say that by deciding not to deal with any of this, you are pushing the burden onto someone else, who is not familiar with your codebase, to take on additional responsibilities in order to maintain it. We do not feel it is appropriate to deal separately with each third-party utility/extension/plugin that considers it too much work to maintain its own execution environment. And to be totally clear, you are by no means obligated to do so. It's just that Procrastinate does not fit our needs, in which case, no hard feelings! :)

I think the issue can be closed, since it has confirmed my initial concern!

Thanks!

@ewjoachim
Member

ewjoachim commented Apr 19, 2024

Sorry :) Maybe I'll reconsider at some point. I understand your point, but this is a one-person volunteer lib until more people step in, and not my only open-source commitment, so I need to be realistic about what I can/want to work on.

I think it's worth keeping it open if other people want to chime in. Your point is valid, and even if you choose another lib, it's always worth listening to feedback.

(If someone is interested in contributing, please discuss it first)

@ewjoachim ewjoachim reopened this Apr 19, 2024
@ewjoachim ewjoachim added Issue contains: Some SQL 🐘 This features require changing the SQL model Issue contains: Exploration & Design decisions 🧩 We don't know how this will be implemented yet Issue appropriate for: Occasional contributors 😉 This issue will be best tackled by people with a minimum of experience on the project Issue status: Blocked ⛔️ Issue cannot be processed for now labels Apr 19, 2024
@medihack medihack mentioned this issue Jun 6, 2024
@medihack
Member

medihack commented Jun 28, 2024

I wonder what the possible ways to improve the situation here would be. Maybe an additional table where every applied migration is recorded. Then, as a first step, a developer using Procrastinate could at least check (with some command) which migrations were applied. The schema.sql file would then be unnecessary, as the migrations would have to be applied initially (in the correct order) by some script. Then, as a next step, a script that automatically applies later migrations when updating Procrastinate. But somehow, this should only affect non-Django users.

EDIT:
@ewjoachim I just read you had the same ideas.

Another option I can think of (I mentioned it somewhere else) is to always use custom migration management and only apply the Django-specific model migrations in Django. Then we could hook into the Django migration system using a signal (pre_migrate or post_migrate) and execute our own migration system. Not sure how backward-compatible this would be.

@ewjoachim
Member

ewjoachim commented Jun 28, 2024

I wonder how much of the community uses neither Django nor Alembic. Would it be acceptable to provide Alembic migrations alongside the Django ones, and would that be enough for the vast majority of users?

Otherwise: maybe we could integrate a standalone migration system within procrastinate as an optional dependency, such as Alembic (which is tied to SQLAlchemy but could be used independently), yoyo, or any other standalone migration system.

(To be super extra duper clear: I used to be the maintainer of Septentrion (yet another migration tool) and what I've learned is that it's a complex enough thing to deserve its own lib, and not something we want to do in our own codebase.)

I'm perfectly ok revisiting the decision of letting users deal with it, but I think I really don't want to maintain our own solution.

  • Choose a good migration manager
  • Integrate it (with a new contrib folder) with the CLI procrastinate schema migrate
  • Case closed

@ewjoachim
Member

In our own tests, we use migra. As you can see, I have to do all sorts of shenanigans when importing it because it seems unmaintained, and it is also based on schemainspect, which seems equally unmaintained. This would be an opportunity to remove the dep.

@medihack
Member

medihack commented Jun 28, 2024

Yes, this makes sense. And I can hardly estimate how many non-Django users are using Procrastinate. As I am a Django user myself, the priority of this issue is not very high for me, but maybe it's still good to work out a plan so that somebody else can easily hop in to improve the situation (otherwise it looks more like a "better not touch" issue 😉).

@slifty

slifty commented Jul 9, 2024

Non-Django user here (though I am using Alembic)!

I'd very much value a way to avoid having to build my own migration management tool in order to use procrastinate safely in a CI / CD based production environment. That said: Alembic support would absolutely suit my needs. It seems to me that this would provide a solution for non-django users.

If there is maintainer comfort with this direction, I'd be glad to take a stab at implementing support for this, but would value any opinions on approach!

@ewjoachim
Member

Nice :)

I'm going to push my luck: would you be interested in developing it? Of course, we'll do our best to support you!

@slifty

slifty commented Jul 9, 2024

Yes! I expect the best approach would be a draft PR that lays out an initial implementation that you can give feedback to.

Stay tuned...

@ewjoachim
Member

You'll probably be interested in looking at how the Django migrations are done.

3 steps:

  • A custom SQL migration class to ease using the official procrastinate migrations
  • The migrations themselves, written manually
  • A test to check we didn't forget a migration
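
The third step could look roughly like the sketch below. It is not the project's actual test, only an illustration: the missing_wrappers helper and the directory layout are hypothetical, and it assumes each hand-written wrapper module mentions the filename of the SQL script it wraps somewhere in its source.

```python
import re
from pathlib import Path


def missing_wrappers(sql_dir: Path, wrapper_dir: Path) -> set[str]:
    """Return the SQL migration filenames that no wrapper module mentions."""
    sql_files = {path.name for path in sql_dir.glob("*.sql")}
    referenced = set()
    for module in wrapper_dir.glob("*.py"):
        source = module.read_text(encoding="utf-8")
        # Collect anything in the module that looks like "<name>.sql".
        referenced.update(re.findall(r"[\w.\-]+\.sql", source))
    return sql_files - referenced
```

A test would then assert that missing_wrappers(...) returns an empty set for the real directories, so that adding a new SQL script without a matching Alembic (or Django) wrapper fails CI.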


4 participants