Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce basic migration logic #428

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Introduce basic migration logic #428

wants to merge 3 commits into from

Conversation

limemloh
Copy link
Contributor

@limemloh limemloh commented Jan 24, 2025

Purpose

Introduce migration logic into the ccdscan-indexer binary, allowing us to run migrations from rust.
ccdscan-api now ensures the database schema version is compatible and ccdscan-indexer ensures it is the latest before running.

Changes

  • Change the DATABASE_URL environment variable for both binaries to allow for disabling compile-time checked queries while running the service and to follow the naming convention.
  • Update the .env.template with more settings and comments
  • Update readme with new instructions on how to run and introduce migrations.
  • Removed default value for node endpoint CLI arg

@limemloh limemloh requested review from lassemand and DOBEN January 24, 2025 11:30
@@ -0,0 +1,782 @@
CREATE EXTENSION IF NOT EXISTS pg_trgm;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have this file? It seems to be pretty much the same as what is already there.

I assume it at the very least means we have to remove the old ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the old one needs to be removed and I will do that right before merging.
Merging in changes to the old file from main, is just simpler when I keep this around.

@@ -60,6 +64,9 @@ async fn main() -> anyhow::Result<()> {
.connect(&cli.database_url)
.await
.context("Failed constructing database connection pool")?;
// Ensure the database schema is compatible with supported schema version.
migrations::ensure_compatible_schema_version(&pool, SUPPORTED_SCHEMA_VERSION).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just do the same migration here instead.

During it this way we have an annoying situation where we are dependent on the api is initialising after the migration has occurred on the indexer which is a pretty annoying feature to implement deployment wise.

At least it cannot be done as is, since it happens application side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't think this should be doing migrations.

Most database migrations require changes to the indexer, so if the API starts doing migrations, it might crash it.
New columns need to be populated by the indexer, before the API should start using them anyway.
Removing columns would crash the indexer.

We also don't want several migrations to be running at once, so since we only run one indexer instance, this is probably a better fit.

@@ -0,0 +1,264 @@
use anyhow::Context;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having schema versions, why dont we just ensure that the sql is the same as what is defined within the .sql files?

Copy link
Contributor Author

@limemloh limemloh Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I understand what you are suggesting.
If you mean ask the database for the schema and then compare with the sql files, it seems like a very complex solution when we start altering tables and remove stuff.
Also difficult to figure out when these are not backwards compatible

@@ -3,7 +3,6 @@ use async_graphql::{scalar, InputValueError, InputValueResult, Scalar, ScalarTyp

pub type Amount = UnsignedLong;
pub type TokenId = String;
pub type RewardPeriodLength = i64;
Copy link
Member

@DOBEN DOBEN Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a rebase on top of main, this line and the rust-submodule-link update should disappear.

PGPASSWORD=${DB_PASSWORD}

# gRPC interface of the node.
CCDSCAN_INDEXER_GRPC_ENDPOINTS=http://localhost:20000
Copy link
Member

@DOBEN DOBEN Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use as default a public URL node. That would allow external people to easier start the service (even mention the URLs to nodes on all three networks) successfully since people don't need to figure out how to set up a node (or how to change this part to a public node).

CCDSCAN_INDEXER_DATABASE_URL=postgres://postgres:$PGPASSWORD@localhost/ccdscan

# Database connection used by the ccdscan-api.
CCDSCAN_API_DATABASE_URL=${CCDSCAN_INDEXER_DATABASE_URL}
Copy link
Member

@DOBEN DOBEN Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running make setup && make (as described in the READMe) overwrites the CCDSCAN_API_DATABASE_URL/PGPASSWORD string now to an empty string when run for the first time. The make files need to be tested with these changes.

I am a bit in favor of keeping it simple and telling people to rename the .env_template file to .env without providing make files. Our make files will get out of date or break earlier or later (if we keep them, we should have some CI pipelines running the make file long-term).


## Database schema setup and migrations

To set up the database schema either from an empty database or migration from an older release of `ccdscan-indexer` run:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the command to start an empty database here.

@@ -39,7 +59,7 @@ cargo run --bin ccdscan-indexer -- --node https://grpc.testnet.concordium.com:20
Note: Since the indexer puts a lot of load on the node, use your own local node whenever possible.
If using the public nodes, run the indexer as short as possible.

<!-- TODO When service become stable: add documentation of arguments and environment variables. -->
Both binaries read variables from a `.env` file if present in the directory, use `.env.template` in this project as the starting point.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the following order to get someone who reads the READme for the first time up-to-speed as fast as possible:

  • First paragraph includes all dependencies (## Dependencies and ## Setup for development paragraph).

  • Second paragraph about setups:

    • mention the envs file (so people know they need to set up envs before running/migrating anything) - e.g. sentence line 62
    • mention the database: e.g. how to get an empty database set-up (cli command).
  • Third paragraph: Migration/Running the services (rest of ReadMe).

```

where `<description>` is replaced by a short description of the nature of the migration.
NOTE: Having compile-time checked queries will cause issues, since the queries are invalid until the database have been properly migrated. This is done by _not_ having the `DATABASE_URL` environment variable set until after running the migrations or using:
Copy link
Member

@DOBEN DOBEN Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Note could be added to the ccdscan-api service further up as well with the your given cli command as a work around since it can happen in the ccdscan-api service as well.

#[display("0000:Empty database with no tables yet.")]
Empty = 0,
#[display(
"0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"0001:Initial schema version with tables for blocks, transaction, accounts, contracts, \
"0001:Initial schema version with tables for blocks, transactions, accounts, contracts, \

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants