Regression and benchmark testing between jerigon and zkevm #751

Open · wants to merge 2 commits into develop
102 changes: 102 additions & 0 deletions .github/workflows/cron_jerigon_zero_testing.yml
@@ -0,0 +1,102 @@
name: Jerigon Zero Testing

on:
  # Uncomment when ready to run on a schedule
Contributor Author:
TODO

  # schedule:
  #   # Run every Sunday at 12:00 AM (UTC)
  #   - cron: "0 0 * * 0"
  push:
    branches: [develop]
  pull_request:
    branches:
      - "**"
Comment on lines +8 to +12
Contributor:
This will be removed once the PR is debugged, right?

Contributor Author:
Yes, this is just for testing.

  workflow_dispatch:
    branches:
      - "**"

env:
  CARGO_TERM_COLOR: always
  REGISTRY: ghcr.io

jobs:
  jerigon_zero_testing:
    name: Jerigon Zero Testing - Integration and Benchmarking
    runs-on: zero-ci
Contributor:
zero-ci is shared with other tasks. Ben created zero-reg, which is stronger and intended only for benchmarks. We should just make sure that we schedule the checks at different times.

    steps:
      - name: Checkout zk_evm code
        uses: actions/checkout@v4

      - name: Setup Rust Toolchain
        uses: actions-rust-lang/setup-rust-toolchain@v1

      - name: Set up Rust Cache
        uses: Swatinem/rust-cache@v2
        with:
          cache-on-failure: true

      - name: Build the project
        run: |
          RUSTFLAGS='-C target-cpu=native -Zlinker-features=-lld -Copt-level=3' cargo build --release
          sudo sysctl kernel.perf_event_paranoid=0

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v2
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Run Erigon Network
        run: |
          cd ..
          tar xf "$(pwd)/zk_evm/test_data/erigon-data.tar.gz" || {
Contributor Author (@temaniarpit27, Oct 27, 2024):
@Nashtare @praetoriansentry What would be the best way to store this file? It is currently added to the repo.

Contributor:
It is not big; I think it can stay in the repo.

Contributor Author:
We are using a smaller dataset (672 blocks); if we stick with that, it is probably fine, but if we choose to go with a larger set of data we may need to rethink this.

echo "Failed to extract erigon-data.tar.gz"; exit 1;
}
docker pull ghcr.io/0xpolygonzero/erigon:feat-zero
docker run -d --name erigon \
-p 8545:8545 \
-v $(pwd):/data \
ghcr.io/0xpolygonzero/erigon:feat-zero \
--datadir=/data/erigon/execution-data \
--http.api=eth,erigon,engine,web3,net,debug,trace,txpool,admin \
--http.vhosts=* --ws --http --http.addr=0.0.0.0 \
--http.corsdomain=* --http.port=8545 \
--metrics --metrics.addr=0.0.0.0 --metrics.port=9001 \
--db.size.limit=3000MB || {
echo "Failed to start Erigon"; exit 1;
}

- name: Regression Test with Zero Tracer in Real Mode
run: |
export ETH_RPC_URL="http://localhost:8545"
rm -rf proofs/* circuits/* ./proofs.json test.out verify.out leader.out
random_numbers=($(shuf -i 1-500 -n 5))
Contributor Author (@temaniarpit27, Oct 27, 2024):
@Nashtare @praetoriansentry We are currently using 5 random blocks for integration testing; any preference for the block count?

Contributor (@atanmarko, Oct 29, 2024):
I don't think it is a good idea to take 5 random blocks. It should probably be all 500 if we run it once a week, or maybe 10 or 100 if 500 is too much.

Contributor Author:
Yeah, it's just for testing; we will choose whatever we think is fine. I was thinking somewhere around 100.
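
A minimal sketch of the larger run discussed above, assuming the same prove_rpc.sh interface and the 1-500 block range already used in this step; the evenly spaced selection is only illustrative:

          # Prove ~100 evenly spaced blocks (1, 6, 11, ..., 496) instead of 5 random ones.
          block_count=100
          step=$((500 / block_count))
          for number in $(seq 1 "$step" 500); do
            hex_number=$(printf "0x%x" "$number")
            OUTPUT_TO_TERMINAL=true RUN_VERIFICATION=true ./scripts/prove_rpc.sh "$hex_number" "$hex_number" "$ETH_RPC_URL" jerigon true 3000 100
          done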

          for number in "${random_numbers[@]}"; do
            hex_number="0x$(echo "obase=16; $number" | bc)"
            OUTPUT_TO_TERMINAL=true RUN_VERIFICATION=true ./scripts/prove_rpc.sh $hex_number $hex_number $ETH_RPC_URL jerigon true 3000 100
          done

      - name: Download Previous Results
        uses: dawidd6/action-download-artifact@v6
        with:
          workflow: cron_jerigon_zero_testing.yml
          workflow_conclusion: success
          name: jerigon_zero_benchmark
          path: ./
          if_no_artifact_found: ignore

      - name: Run the Benchmark Script
        run: |
          export ETH_RPC_URL="http://localhost:8545"
          ./scripts/jerigon_zero_benchmark.sh

      - name: Upload New Results
        uses: actions/upload-artifact@v4
        with:
          name: jerigon_zero_benchmark
          path: ./jerigon_zero_output.log
          retention-days: 90
          overwrite: true
89 changes: 89 additions & 0 deletions scripts/jerigon_zero_benchmark.sh
@@ -0,0 +1,89 @@
#!/bin/bash
# ------------------------------------------------------------------------------
set -exo pipefail

# Args:
# 1 --> Output file (Not used in the current script)

# Get the number of processors for parallelism
if [[ "$OSTYPE" == "darwin"* ]]; then
num_procs=$(sysctl -n hw.physicalcpu)
else
num_procs=$(nproc)
fi

# Force the working directory to always be the repository root.
REPO_ROOT=$(git rev-parse --show-toplevel)
PROOF_OUTPUT_DIR="${REPO_ROOT}/proofs"
BLOCK_BATCH_SIZE="${BLOCK_BATCH_SIZE:-8}"

# Logging setup
OUTPUT_LOG="jerigon_zero_output.log"
BLOCK_OUTPUT_LOG="jerigon_zero_block_output.log"
PROOFS_FILE_LIST="${PROOF_OUTPUT_DIR}/proof_files.json"

# Ensure necessary directories exist
mkdir -p "$PROOF_OUTPUT_DIR"

# Set environment variables for parallelism and logging
export RAYON_NUM_THREADS=$num_procs
export TOKIO_WORKER_THREADS=$num_procs
export RUST_MIN_STACK=33554432
export RUST_BACKTRACE=full
export RUST_LOG=info

# Log the current date and time
echo "$(date +"%Y-%m-%d %H:%M:%S")" &>> "$OUTPUT_LOG"

# Define the blocks to process
blocks=(100 200 300 400 500)
Contributor Author:
@praetoriansentry @Nashtare We are currently using arbitrary placeholder blocks; any idea which blocks to use for benchmark testing?

Contributor:
Maybe we could read the list from a config file (CSV, to keep it simple?).

Contributor Author (@temaniarpit27, Oct 29, 2024):
Yes, this is just a placeholder. We will choose the block numbers first, and if that turns out to be too many, we will move them to a config file.
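
A minimal sketch of the config-file idea from this thread, assuming a hypothetical scripts/benchmark_blocks.csv containing one line of comma-separated block numbers (e.g. 100,200,300,400,500):

# Read the benchmark block list from a CSV file instead of hard-coding it here.
BLOCKS_CSV="${BLOCKS_CSV:-scripts/benchmark_blocks.csv}"
IFS=',' read -r -a blocks < "$BLOCKS_CSV"
echo "Benchmarking blocks: ${blocks[*]}" &>> "$OUTPUT_LOG"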


# Function to process each block
process_block() {
    local block=$1

    echo "Processing block: $block" &>> "$OUTPUT_LOG"
Contributor:
I would keep the results file minimal: maybe one line with the date/time of the measurement and the timings for the blocks. That way it would be easy to open the file and see how the performance has changed over weeks and months.

Contributor Author:
#751 (comment)

This is the expected output. Let me know if we need to reduce it further; I think it gives clear information while staying concise.

Contributor:
The line "Processing block: 1" is redundant; it does not add any new information.
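
A minimal sketch of the compact per-block line suggested above, assuming the duration_sec and PERF_* values computed later in this function; the exact field names are only illustrative:

    # One compact line per block instead of a separate "Processing block" line.
    echo "$(date +"%Y-%m-%d %H:%M:%S") block=$block duration_sec=$duration_sec perf_elapsed=$PERF_TIME perf_user=$PERF_USER_TIME perf_sys=$PERF_SYS_TIME" >> "$OUTPUT_LOG"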


    # Fetch block data
    if ! ./target/release/rpc --rpc-url "$ETH_RPC_URL" fetch --start-block "$block" --end-block "$block" > "output_${block}.json"; then
Contributor:
output_${block}.json is a bit misleading. Maybe witness_${block}.json?

echo "Failed to fetch block data for block: $block" &>> "$OUTPUT_LOG"
exit 1
fi

local start_time=$(date +%s%N)

# Run performance stats
if ! perf stat -e cycles ./target/release/leader --runtime in-memory --load-strategy monolithic --block-batch-size "$BLOCK_BATCH_SIZE" --proof-output-dir "$PROOF_OUTPUT_DIR" stdio < "output_${block}.json" &> "$BLOCK_OUTPUT_LOG"; then
echo "Performance command failed for block: $block" &>> "$OUTPUT_LOG"
cat "$BLOCK_OUTPUT_LOG" &>> "$OUTPUT_LOG"
Contributor:
I would use one output file for appending performance measurements and a separate one for error log details. Otherwise, any error that happens will dump a big log into the previous performance results and make them hard to read.
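
A minimal sketch of that split for this failure branch, assuming a hypothetical jerigon_zero_error.log alongside the existing logs:

        # One-line failure notice in the results file; the full leader output goes to a separate error log.
        ERROR_LOG="jerigon_zero_error.log"
        echo "Performance command failed for block: $block (details in $ERROR_LOG)" >> "$OUTPUT_LOG"
        cat "$BLOCK_OUTPUT_LOG" >> "$ERROR_LOG"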

        exit 1
    fi

    local end_time=$(date +%s%N)

    set +o pipefail
    if ! grep "Successfully wrote to disk proof file " "$BLOCK_OUTPUT_LOG" | awk '{print $NF}' | tee "$PROOFS_FILE_LIST"; then
        echo "Proof list not generated for block: $block. Check the log for details." &>> "$OUTPUT_LOG"
        cat "$BLOCK_OUTPUT_LOG" &>> "$OUTPUT_LOG"
Contributor (@atanmarko, Oct 29, 2024):
Is $BLOCK_OUTPUT_LOG the whole leader terminal output? We should keep only minimal details and time measurements, since we want to track changes over weeks, not append the whole debug/info leader log to the results file. Put a one-liner with the block that errored in the results file, and the whole error log in a separate file.

Contributor Author:
This is printed only in case of failure, i.e. when proving a block fails; otherwise we won't know why the block failed. If it succeeds, we print only the time measurement details. Maybe we can move the failed blocks to a new file or something like that.

Contributor:
You can print "Proof list not generated for block: $block. Check the log for details." so that we know the block failed, and then check the error details in the other file.
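
A minimal sketch of this suggestion, reusing the hypothetical error log from the sketch above:

        # Results file gets only the one-line notice; full details go to the separate error log.
        echo "Proof list not generated for block: $block. Check $ERROR_LOG for details." >> "$OUTPUT_LOG"
        cat "$BLOCK_OUTPUT_LOG" >> "$ERROR_LOG"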

        exit 1
    fi

    local duration_sec=$(echo "scale=3; ($end_time - $start_time) / 1000000000" | bc -l)

    # Extract performance timings
    local PERF_TIME=$(grep "seconds time elapsed" "$BLOCK_OUTPUT_LOG" | tail -1 | awk '{ print ($1)}')
    local PERF_USER_TIME=$(grep "seconds user" "$BLOCK_OUTPUT_LOG" | tail -1 | awk '{ print ($1)}')
    local PERF_SYS_TIME=$(grep "seconds sys" "$BLOCK_OUTPUT_LOG" | tail -1 | awk '{ print ($1)}')

    echo "Success for block: $block!"
    echo "Proving duration for block $block: $duration_sec seconds, performance time: $PERF_TIME, performance user time: $PERF_USER_TIME, performance system time: $PERF_SYS_TIME" &>> "$OUTPUT_LOG"
}

# Process each block
for block in "${blocks[@]}"; do
    process_block "$block"
done

# Finalize logging
echo "Processing completed at: $(date +"%Y-%m-%d %H:%M:%S")" &>> "$OUTPUT_LOG"
echo "" &>> "$OUTPUT_LOG"
Binary file added test_data/erigon-data.tar.gz
Binary file not shown.