The importer component is responsible for downloading stream files uploaded by consensus nodes, verifying them, and ingesting the normalized data into the database.
The mirror node strives to be both backwards and forwards compatible with consensus nodes. Most of the time, older versions of the mirror node should work against the latest version of consensus nodes. However, while it won't fail if new fields or transactions are added to HAPI, the mirror node may fail to capture that extra information until it is updated. The exception to this rule is when the stream format itself changes like when Record File v6 was introduced or the balance file was deprecated. The below table provides a compatibility matrix of the most recent changes:
HAPI Version | Mirror Node Version | Breaking Change | HIP |
---|---|---|---|
v0.37.0 | v0.75.0+ | False | HIP-583: Expand alias support |
v0.40.0 | v0.84.0+ | False | HIP-729: Contract account nonce externalization |
v0.42.0 | v0.88.0+ | True | HIP-794: Sunsetting balance file |
v0.43.0 | v0.89.0+ | False | HIP-786: Enriched staking metadata exports |
v0.47.0+ | v0.98.0+ | False | HIP-844: Handling and externalisation improvements for account nonce updates |
v0.48.0+ | v0.101.0+ | False | HIP-646/657/765: Mutable token metadata |
The importer tracks up-to-date entity balance by applying balance changes from crypto transfers. This relies on the InitializeEntityBalanceMigration to set the correct initial entity balance from the latest account balance snapshot relative to the last record stream file the importer has ingested and balance changes from crypto transfers not accounted in the snapshot. If the importer is started with a database which doesn't meet the prerequisite (e.g., an empty database) or the entity balance is inaccurate due to bugs, follow the steps below to re-run the migration to fix it.
-
Stop importer
-
Get the latest checksum of
InitializeEntityBalanceMigration
$ psql -h db -U mirror_node -c "select checksum from flyway_schema_history \ where script like '%InitializeEntityBalanceMigration' order by installed_rank desc limit 1"
-
Set a different checksum (e.g., 2) for the migration and start importer
hedera: mirror: importer: migration: initializeEntityBalanceMigration: checksum: 2
The following resource allocation and configuration is recommended to speed up historical data ingestion. The importer should be able to ingest one month's worth of mainnet data in less than 1.5 days.
Run the importer with 4 vCPUs and 10 GB of heap. Configure the application.yml:
hedera:
mirror:
importer:
downloader:
batchSize: 600
record:
frequency: 1ms
parser:
record:
entity:
redis:
enabled: false
frequency: 10ms
queueCapacity: 40
Note once the importer has caught up all data, it's recommend to change the configuration back to the default.
Run a PostgreSQL 16 instance with at least 4 vCPUs and 16 GB memory. Set the following parameters (note the unit is kilobytes):
max_wal_size = 8388608
work_mem = 262144
The RecordFileParserPerformanceTest
can be used to declaratively generate a RecordFile
with different performance
characteristics and test how fast the importer can ingest them. To configure the performance test, populate the remote
database information and the test scenarios in an application.yml
. Use the standard hedera.mirror.importer.db
properties to target the remote database. The below config is generating a mix of crypto transfer and contract calls
transactions at a combined 300 transactions per second (TPS) sustained for 60 seconds:
hedera.mirror.importer.parser.record:
performance:
duration: 60s
transactions:
- entities: 10
tps: 100
type: CRYPTOTRANSFER
- entities: 5
tps: 200
type: CONTRACTCALL
To run performance tests, use Gradle to run the performanceTest
task. To run all performance tests omit the tests
parameter.
./gradlew :importer:performanceTest --tests 'RecordFileParserPerformanceTest' --info
The reconciliation job verifies that the data within the stream files are in sync with each other and with the mirror node database. This process runs once a day at midnight and produces logs, metrics, and alerts if reconciliation fails.
For each balance file, the job verifies it sums to 50 billion hbars. For every pair of balance files, it verifies the aggregated hbar transfers in that period match what's expected in the next balance file. It also verifies the aggregated token transfers match the token balance and that the NFT transfers match the expected NFT count in the balance file.
For local testing the importer can be run using the following command:
./gradlew :importer:bootRun --args='--spring.profiles.active=v2'
The citus image is built and then pushed to our Google Cloud Registry. A multi-platform alpine linux image is built in order to be used for local testing (arm64 on M-series Macs). This image will need to be maintained until the upstream builder provides a multi-platform build supporting arm64.
In production and ci environments, citus is enabled through stackgres SGShardedCluster and doesn't use the custom image.
DockerFile
to build a custom image to be used in v2 testing is located
in Citus's upstream repository. To build this image:
You authenticate with the registry via gcloud:
gcloud auth configure-docker gcr.io
You should see output similar to:
Adding credentials for: gcr.io
The instructions here pertain to building the multi-platform alpine image on an M-series Mac using Docker Desktop. Please refer to Multi-platform images for an overview on how to use Docker Desktop to do this.
If you've not done so already since installing Docker Desktop, create a new builder:
$ docker buildx create --name mybuilder --driver docker-container --bootstrap 1 ↵
mybuilder
[+] Building 12.6s (1/1) FINISHED
=> [internal] booting buildkit 12.6s
=> => pulling image moby/buildkit:buildx-stable-1 11.9s
=> => creating container buildx_buildkit_mybuilder0
Switch to the new builder:
docker buildx use mybuilder
Checkout the upstream project containing the docker file
git clone [email protected]:citusdata/docker.git
cd docker
Build the image for both amd64 (Github CI) and arm64 (local testing). Don't forget the '.' at the end of the line. This can take some time depending on your internet speed. You will see activity around both arm64 and amd64.
$ docker buildx build --platform linux/arm64,linux/amd64 -t gcr.io/mirrornode/citus:12.1.1 --file alpine/Dockerfile .
[+] Building 351.1s (28/28) FINISHED
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 3.58kB
...
=> [linux/amd64 1/11] FROM docker.io/library/postgres:15.1-alpine@sha256:f19eede5a214c0933dce30c2e734b787b4c09193e874cce3b26c5d54b8b77ec7 18.2s
...
=> [linux/arm64 1/11] FROM docker.io/library/postgres:15.1-alpine@sha256:f19eede5a214c0933dce30c2e734b787b4c09193e874cce3b26c5d54b8b77ec7 28.0s
WARNING: No output specified with docker-container driver. Build result will only remain in the build cache. To push result image into registry use --push or to load image into docker use --load
Again, this does not push the image. If you prefer, you can add --push
to the command above and push the image at this
time
rather than as a separate step below.
If you did not utilize --push
with docker buildx
when building the alpine image, push it now to Docker Hub.
docker push gcr.io/mirrornode/citus:12.1.1