[bridge-indexer] revamp task #19245

Merged
merged 2 commits into from
Sep 8, 2024

@longbowlu longbowlu commented Sep 6, 2024

Description

This PR reworks Tasks:

  1. Replace the Tasks trait with a Tasks struct.
  2. Add an is_live_task field to Task.
  3. Pass Task to several functions instead of passing its fields as separate parameters.
  4. In the ingestion framework, use a custom batch read size for backfill tasks (this significantly improves the data download speed).
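
As a rough illustration of items 1 to 3, here is a minimal sketch of what a `Task` with an `is_live_task` flag and a plain `Tasks` struct could look like. Only the names `Task`, `Tasks`, and `is_live_task` come from this PR; the other fields, the methods, and `describe_range` are assumptions made for the example and may not match the actual code.

```rust
/// Hypothetical shape of one indexing task. Only `is_live_task` is named in
/// the PR description; the other fields are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Task {
    pub task_name: String,
    pub start_checkpoint: u64,
    pub target_checkpoint: u64,
    pub is_live_task: bool,
}

/// A plain struct wrapping the task list, standing in for the former trait.
/// The methods below are assumptions, not the PR's actual API.
#[derive(Debug, Default)]
pub struct Tasks {
    tasks: Vec<Task>,
}

impl Tasks {
    pub fn new(tasks: Vec<Task>) -> Self {
        Self { tasks }
    }

    /// Tasks that follow the chain tip.
    pub fn live_tasks(&self) -> Vec<&Task> {
        self.tasks.iter().filter(|t| t.is_live_task).collect()
    }

    /// Tasks that fill in a bounded historical range.
    pub fn backfill_tasks(&self) -> Vec<&Task> {
        self.tasks.iter().filter(|t| !t.is_live_task).collect()
    }
}

/// Hypothetical helper that takes the whole `Task` rather than separate
/// start/target checkpoint parameters (item 3).
pub fn describe_range(task: &Task) -> String {
    format!(
        "{}: {}..={} (live: {})",
        task.task_name, task.start_checkpoint, task.target_checkpoint, task.is_live_task
    )
}
```

Passing the whole `Task` means a callee can branch on `is_live_task` without another signature change.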

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:


@longbowlu longbowlu changed the title revamp task [bridge-indexer] revamp task Sep 6, 2024
@longbowlu longbowlu marked this pull request as ready for review September 6, 2024 05:37

@dariorussi dariorussi left a comment

great, thanks!
let's get going!

@@ -63,14 +64,13 @@ impl EthSubscriptionDatasource {
 impl Datasource<RawEthData> for EthSubscriptionDatasource {
     async fn start_data_retrieval(
         &self,
-        starting_checkpoint: u64,
-        target_checkpoint: u64,
+        task: Task,

I like that

@@ -23,6 +23,9 @@ use tokio::task::JoinHandle;

use crate::metrics::BridgeIndexerMetrics;

const BACKFILL_TASK_INGESTION_READER_BATCH_SIZE: usize = 300;

I assume this is the value that defines how much is read in one shot?
Can you add a comment explaining what this means?
Also, is this something that should be defined in the config file, besides an env variable?


This is the fetching concurrency in the ingestion framework: https://github.com/MystenLabs/sui/blob/main/crates/sui-data-ingestion-core/src/reader.rs#L207
It guarantees that at most N checkpoints are fetched at the same time. A recent change lowered the default value to 10, which significantly bottlenecked backfill speed. Backfill tasks should use a larger number; for live sync, 10 is fine.

Will add this in a comment.
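
A minimal sketch of the selection described above, assuming the choice depends only on `is_live_task`. The constant `BACKFILL_TASK_INGESTION_READER_BATCH_SIZE` and its value 300 come from this PR's diff, and 10 is the default mentioned in the comment; `LIVE_TASK_READER_BATCH_SIZE`, the stripped-down `Task`, and `reader_batch_size` are hypothetical names used only for illustration.

```rust
/// Upper bound on checkpoints fetched concurrently for backfill tasks
/// (name and value taken from the PR diff).
const BACKFILL_TASK_INGESTION_READER_BATCH_SIZE: usize = 300;

/// Hypothetical constant name; 10 is the framework default cited in the thread.
const LIVE_TASK_READER_BATCH_SIZE: usize = 10;

/// Stripped-down stand-in for the real `Task`; only the flag matters here.
pub struct Task {
    pub is_live_task: bool,
}

/// Pick the reader batch size for a task: backfill tasks get the larger
/// value, live tasks keep the default.
pub fn reader_batch_size(task: &Task) -> usize {
    if task.is_live_task {
        LIVE_TASK_READER_BATCH_SIZE
    } else {
        BACKFILL_TASK_INGESTION_READER_BATCH_SIZE
    }
}
```

How the chosen value is actually handed to the ingestion reader is not shown in this thread, so that wiring is left out of the sketch.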

@dariorussi

@longbowlu one final question, is this going to affect the deepbook indexer or is this all private implementation to the indexer "framework"?

@longbowlu longbowlu force-pushed the integrate-progress-saving-policy branch from 411ec7b to 11f47cc Compare September 7, 2024 22:29
Base automatically changed from integrate-progress-saving-policy to main September 7, 2024 23:13

longbowlu commented Sep 8, 2024

> @longbowlu one final question, is this going to affect the deepbook indexer or is this all private implementation to the indexer "framework"?

It will affect deepbook, because deepbook uses this framework, right? If you are asking whether the DB indexer needs to change anything, I think the only thing to add is
async fn register_live_task for IndexerProgressStore
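
To give a feel for what such an addition might involve, here is a deliberately simplified, synchronous sketch. The real method is described above as an `async fn` on `IndexerProgressStore`, and its actual signature is not shown in this thread, so the trait `ProgressStoreSketch`, the parameters, the error type, and `InMemoryStore` are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical, synchronous stand-in for the progress-store trait.
/// The real `register_live_task` is async and may take different arguments.
pub trait ProgressStoreSketch {
    fn register_live_task(
        &mut self,
        task_name: String,
        start_checkpoint: u64,
    ) -> Result<(), String>;
}

/// Toy in-memory store, used only to show the shape of an implementation.
#[derive(Debug, Default)]
pub struct InMemoryStore {
    pub live_tasks: HashMap<String, u64>,
}

impl ProgressStoreSketch for InMemoryStore {
    fn register_live_task(
        &mut self,
        task_name: String,
        start_checkpoint: u64,
    ) -> Result<(), String> {
        // Assumption: registering the same live task twice is an error.
        if self.live_tasks.contains_key(&task_name) {
            return Err(format!("live task {task_name} already registered"));
        }
        self.live_tasks.insert(task_name, start_checkpoint);
        Ok(())
    }
}
```

A database-backed store would presumably persist the same record instead of keeping it in a map.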

@longbowlu longbowlu enabled auto-merge (squash) September 8, 2024 00:26
@longbowlu longbowlu merged commit dd951f8 into main Sep 8, 2024
49 checks passed
@longbowlu longbowlu deleted the revamp-task branch September 8, 2024 00:39
suiwombat pushed a commit that referenced this pull request Sep 16, 2024