Notes about the architecture and usage of the relay.
See also:
This document covers more details about running a relay at scale: https://flashbots.notion.site/Draft-Running-a-relay-4040ccd5186c425d9a860cbb29bbfe09
The relay consists of three main components:
- Housekeeper: updates known validators and proposer duties, and syncs DB->Redis on startup. Needs to run as a single instance; will be replaced by a cronjob in the future.
- Website: handles the root website requests (information is pulled from Redis and database).
- API: serves the proposer, block builder, and data APIs.
The API can run as a single instance, but for production the individual APIs can (and should) be deployed and scaled independently! These are the recommended deployments:
- Proposer API (registerValidator, getHeader, getPayload)
- Builder API (getValidatorDuties, submitNewBlock)
- Data API (read-only access to DB read replica)
- Internal API (setting builder status)
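One way to picture the split deployments above is a single binary that registers only the API groups enabled for a given instance. The following is a minimal sketch under that assumption: `apiConfig`, `newMux`, `serves`, and every route path are hypothetical names chosen for illustration, not the relay's actual flags or routes.

```go
package main

import (
	"fmt"
	"net/http"
)

// apiConfig says which API groups this process serves (hypothetical type).
type apiConfig struct {
	Proposer, Builder, Data, Internal bool
}

// stub stands in for a real handler and just replies 200 OK.
func stub(w http.ResponseWriter, _ *http.Request) { w.WriteHeader(http.StatusOK) }

// newMux registers routes only for the enabled API groups.
// All paths are illustrative placeholders, not the relay's real routes.
func newMux(cfg apiConfig) *http.ServeMux {
	mux := http.NewServeMux()
	if cfg.Proposer {
		mux.HandleFunc("/proposer/registerValidator", stub)
		mux.HandleFunc("/proposer/getHeader", stub)
		mux.HandleFunc("/proposer/getPayload", stub)
	}
	if cfg.Builder {
		mux.HandleFunc("/builder/getValidatorDuties", stub)
		mux.HandleFunc("/builder/submitNewBlock", stub)
	}
	if cfg.Data {
		// Read-only queries, intended to hit a DB read replica.
		mux.HandleFunc("/data/", stub)
	}
	if cfg.Internal {
		mux.HandleFunc("/internal/builderStatus", stub)
	}
	return mux
}

// serves reports whether mux has a route registered for path.
func serves(mux *http.ServeMux, path string) bool {
	req, err := http.NewRequest(http.MethodGet, path, nil)
	if err != nil {
		return false
	}
	_, pattern := mux.Handler(req)
	return pattern != ""
}

func main() {
	// A proposer-only instance exposes proposer routes and nothing else.
	proposerOnly := newMux(apiConfig{Proposer: true})
	fmt.Println("getHeader served:", serves(proposerOnly, "/proposer/getHeader"))
	fmt.Println("submitNewBlock served:", serves(proposerOnly, "/builder/submitNewBlock"))
}
```

With this shape, each deployment differs only in its configuration, so the proposer, builder, data, and internal instances can be scaled separately behind the load balancer.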
- Logs with level `error` are always system errors and something to investigate (never use the error level for bad request payloads or other user errors).
  - Put differently: if you want to make an error show up in the logs and dashboards, then use the `error` level!
- https://github.com/buger/jsonparser for really fast JSON request body processing
- First, Redis and Postgres have to be ready, as well as the beacon node(s)
- The housekeeper syncs important data from the beacon node and database to Redis
- The API needs access to the data in Redis to operate (i.e. all bids are going through Redis)
The housekeeper updates Redis with important information:
- Active and pending validators (source: beacon node)
- Proposer duties (source: beacon node (duties) + database (validator registrations))
- Validator registrations (source: database)
- Builder status (source: database)
Afterwards, there are important ongoing, regular housekeeper tasks:
- Update known validators and proposer duties in Redis
- Update active validators in database (source: Redis) (TODO)
- Validator registrations are only saved to the database if `feeRecipient` or `gasLimit` changed. If a registration has a newer timestamp but the same `feeRecipient` and `gasLimit`, it is not saved, to avoid filling up the database with unnecessary data. (Some CL clients create a new validator registration every epoch, not just when preferences change, as was the original idea.)
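The save rule for validator registrations can be sketched as a single predicate. The `registration` type and `shouldSave` are simplified, hypothetical names, and ignoring registrations whose timestamp is not newer than the stored one is an assumption added here, not something these notes state.

```go
package main

import "fmt"

// registration holds only the fields relevant to the save decision
// (simplified, hypothetical type).
type registration struct {
	FeeRecipient string
	GasLimit     uint64
	Timestamp    uint64
}

// shouldSave reports whether next should be written to the database,
// given prev, the registration currently stored (nil if there is none).
func shouldSave(prev *registration, next registration) bool {
	if prev == nil {
		return true // first registration seen for this validator
	}
	if next.Timestamp <= prev.Timestamp {
		return false // assumption: stale or replayed registrations are ignored
	}
	// A newer timestamp alone is not enough: the preferences must differ.
	return next.FeeRecipient != prev.FeeRecipient || next.GasLimit != prev.GasLimit
}

func main() {
	stored := &registration{FeeRecipient: "0xabc", GasLimit: 30_000_000, Timestamp: 100}

	resent := registration{FeeRecipient: "0xabc", GasLimit: 30_000_000, Timestamp: 200}
	changed := registration{FeeRecipient: "0xdef", GasLimit: 30_000_000, Timestamp: 200}

	fmt.Println("save resent registration:", shouldSave(stored, resent))
	fmt.Println("save changed registration:", shouldSave(stored, changed))
}
```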
A full infrastructure might include these components:
- Load balancer + Firewall
- 2x proposer API (4 CPU, 1GB RAM)
- 2x builder API (2-4 CPU, 1GB RAM)
- 2x data API (1 CPU, 1GB RAM)
- 2x website (1 CPU, 2GB RAM)
- 1x housekeeper (2 CPU, 1GB RAM)
- Redis (4GB)
- Postgres DB (100GB+)
- Several beacon nodes (3 for redundancy?)
- Block validation EL nodes
For more discussion about running a relay see also https://collective.flashbots.net/t/ideas-for-incentivizing-relays/586
- Use architecture decision records (ADRs) based on this template