Commit
add: discv4 crawling support
dennis-tra committed Oct 18, 2024
1 parent 38e4489 commit cd22ff0
Showing 50 changed files with 1,103 additions and 7,144 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pull_request_main.yml
Original file line number Diff line number Diff line change
@@ -30,7 +30,7 @@ jobs:
- name: Setting up Golang
uses: actions/setup-go@v4
with:
go-version: '1.23.1'
go-version: '1.23'

- name: Running tests
run: make test
2 changes: 1 addition & 1 deletion .github/workflows/push_main.yml
@@ -30,7 +30,7 @@ jobs:
- name: Setting up Golang
uses: actions/setup-go@v4
with:
go-version: '1.23.1'
go-version: '1.23'

- name: Running tests
run: make test
4 changes: 2 additions & 2 deletions Makefile
@@ -30,8 +30,8 @@ docker-push: docker-linux

tools:
go install -tags 'postgres' github.com/golang-migrate/migrate/v4/cmd/[email protected]
go install github.com/volatiletech/sqlboiler/v4@v4.13.0
go install github.com/volatiletech/sqlboiler/v4/drivers/sqlboiler-psql@v4.13.0
go install github.com/volatiletech/sqlboiler/v4@v4.14.1
go install github.com/volatiletech/sqlboiler/v4/drivers/sqlboiler-psql@v4.14.1
go install go.uber.org/mock/[email protected]

database-reset: database-stop databased migrate-up models
62 changes: 51 additions & 11 deletions README.md
@@ -12,28 +12,34 @@ A network agnostic DHT crawler and monitor. The crawler connects to [DHT](https:

- [IPFS](https://ipfs.network) - [_Amino DHT_](https://blog.ipfs.tech/2023-09-amino-refactoring/)
- [Ethereum](https://ethereum.org/en/) - [_Consensus Layer_](https://ethereum.org/uz/developers/docs/networking-layer/#consensus-discovery)
- [Ethereum](https://ethereum.org/en/) - [_Testnet Holesky_](https://github.com/eth-clients/holesky) (alpha)
- [Ethereum](https://ethereum.org/en/) - [_Execution Layer_](https://ethereum.org/uz/developers/docs/networking-layer/#discovery)
- [Filecoin](https://filecoin.io)
- [Polkadot](https://polkadot.network/)
- [Kusama](https://kusama.network/)
- [Rococo](https://substrate.io/developers/rococo-network/)
- [Westend](https://wiki.polkadot.network/docs/maintain-networks#westend-test-network)
- [Avail](https://www.availproject.org/)
- [Celestia](https://celestia.org/) - [_Mainnet_](https://blog.celestia.org/celestia-mainnet-is-live/)
- [Celestia](https://celestia.org/) - [_Arabica_](https://github.com/celestiaorg/celestia-node/blob/9c0a5fb0626ada6e6cdb8bcd816d01a3aa5043ad/nodebuilder/p2p/bootstrap.go#L40)
- [Celestia](https://celestia.org/) - [_Mocha_](https://docs.celestia.org/nodes/mocha-testnet)
- [Pactus](https://pactus.org)

_The crawler was:_
The crawler was:

- 🏆 _awarded a prize in the [DI2F Workshop hackathon](https://research.protocol.ai/blog/2021/decentralising-the-internet-with-ipfs-and-filecoin-di2f-a-report-from-the-trenches/)._ 🏆
- 🎓 _used for the ACM SigCOMM'22 paper [Design and Evaluation of IPFS: A Storage Layer for the Decentralized Web](https://research.protocol.ai/publications/design-and-evaluation-of-ipfs-a-storage-layer-for-the-decentralized-web/trautwein2022.pdf)_ 🎓

📊 [ProbeLab](https://probelab.io) is publishing weekly reports for the IPFS Amino DHT based on the crawl results [here](https://github.com/protocol/network-measurements/tree/master/reports)! 📊
Nebula powers:
- 📊 _the weekly reports for the IPFS Amino DHT [here](https://github.com/protocol/network-measurements/tree/master/reports)!_ 📊
- 🌐 _many graphs on [probelab.io](https://probelab.io) for most of the supported networks above_ 🌐

📺 You can find a demo on YouTube: [Nebula: A Network Agnostic DHT Crawler](https://www.youtube.com/watch?v=QDgvCBDqNMc) 📺

You can find a demo on YouTube: [Nebula: A Network Agnostic DHT Crawler](https://www.youtube.com/watch?v=QDgvCBDqNMc) 📺

![Screenshot from a Grafana dashboard](./docs/grafana-screenshot.png)

<small>_Grafana Dashboard is not part of this repository_</small>

## Table of Contents

- [Table of Contents](#table-of-contents)
@@ -156,6 +162,8 @@ nebula --db-user nebula_test --db-name nebula_test monitor

When Nebula is configured to store its results in a postgres database, it also tracks session information of remote peers. A session is one continuous streak of uptime (see below).

However, this is not implemented for all supported networks. The [ProbeLab](https://probelab.network) team is using the monitoring feature for the IPFS, Celestia, Filecoin, and Avail networks. Most notably, the Ethereum discv4/discv5 monitoring implementation still needs some work.

---

There are a few more command line flags that are documented when you run `nebula --help` and `nebula crawl --help`:
@@ -170,8 +178,6 @@ random `PeerIDs` with common prefix lengths (CPL) that fall into each peer's buckets, and asks for `PeerIDs` that are
closer (XOR distance) to the ones `nebula` just constructed. This will effectively yield a list of all `PeerIDs` that a peer has
in its routing table. The process repeats for all found peers until `nebula` does not find any new `PeerIDs`.

This process is heavily inspired by the `basic-crawler` in [libp2p/go-libp2p-kad-dht](https://github.com/libp2p/go-libp2p-kad-dht/tree/master/crawler) from [@aschmahmann](https://github.com/aschmahmann).
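The lookup procedure above amounts to a breadth-first traversal of the network. The sketch below is a minimal, network-agnostic stand-in for Nebula's actual crawl engine: the `Neighbors` callback abstracts away the FIND_NODE requests for random `PeerIDs` per bucket, and all names here are illustrative, not Nebula's real API.

```go
package main

import "fmt"

// Neighbors asks a peer for all peers it knows that are close to the
// constructed target IDs, i.e. the contents of its routing table.
type Neighbors func(peer string) []string

// crawl visits every reachable peer exactly once, starting from the
// bootstrap set, until no new peers are discovered.
func crawl(bootstrap []string, neighbors Neighbors) []string {
	seen := map[string]bool{}
	queue := append([]string{}, bootstrap...)
	var visited []string
	for len(queue) > 0 {
		p := queue[0]
		queue = queue[1:]
		if seen[p] {
			continue
		}
		seen[p] = true
		visited = append(visited, p)
		// In Nebula, this step sends requests for random PeerIDs that
		// fall into each of the peer's buckets.
		queue = append(queue, neighbors(p)...)
	}
	return visited
}

func main() {
	// A toy network: each peer maps to its routing-table entries.
	net := map[string][]string{
		"boot": {"a", "b"},
		"a":    {"b", "c"},
		"b":    {"boot"},
		"c":    {},
	}
	out := crawl([]string{"boot"}, func(p string) []string { return net[p] })
	fmt.Println(out) // → [boot a b c]
}
```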

If Nebula is configured to store its results in a database, every peer that was visited is written to it. The visit information includes latency measurements (dial/connect/crawl durations), the current set of multi addresses, the current agent version, and the current set of supported protocols. If the peer was dialable, `nebula` will
also create a `session` instance that contains the following information:

@@ -223,7 +229,8 @@ CREATE TABLE sessions (

At the end of each crawl `nebula` persists general statistics about the crawl, such as the total duration, dialable peers, encountered errors, agent versions, etc.

> **Info:** You can use the `crawl` sub-command with the global `--dry-run` option that skips any database operations.
> [!TIP]
> You can use the `crawl` sub-command with the global `--dry-run` option that skips any database operations.

Command line help page:

@@ -296,10 +303,10 @@ OPTIONS:

## Development

To develop this project, you need Go `1.19` and the following tools:
To develop this project, you need Go `1.23` and the following tools:

- [`golang-migrate/migrate`](https://github.com/golang-migrate/migrate) to manage the SQL migration `v4.15.2`
- [`volatiletech/sqlboiler`](https://github.com/volatiletech/sqlboiler) to generate Go ORM `v4.14.2`
- [`volatiletech/sqlboiler`](https://github.com/volatiletech/sqlboiler) to generate Go ORM `v4.14.1`
- `docker` to run a local postgres instance

To install the necessary tools you can run `make tools`. This will use the `go install` command to download and install the tools into your `$GOPATH/bin` directory. So make sure you have it in your `$PATH` environment variable.
@@ -312,7 +319,8 @@ You need a running postgres instance to persist and/or read the crawl results. R
docker run --rm -p 5432:5432 -e POSTGRES_PASSWORD=password -e POSTGRES_USER=nebula_test -e POSTGRES_DB=nebula_test --name nebula_test_db postgres:14
```

> **Info:** You can use the `crawl` sub-command with the global `--dry-run` option that skips any database operations or store the results as JSON files with the `--json-out` flag.
> [!TIP]
> You can use the `crawl` sub-command with the global `--dry-run` option that skips any database operations, or store the results as JSON files with the `--json-out` flag.

The default database settings for local development are:

@@ -350,7 +358,7 @@ migrate create -ext sql -dir pkg/db/migrations -seq some_migration_name
To run the tests you need a running test database instance:

```shell
make database
make database # or make databased (note the d suffix for "daemon") to start the DB in the background
make test
```

@@ -376,6 +384,38 @@ The following presentation shows ways to use Nebula by showcasing crawls of th

[![Nebula: A Network Agnostic DHT Crawler - Dennis Trautwein](https://img.youtube.com/vi/QDgvCBDqNMc/0.jpg)](https://www.youtube.com/watch?v=QDgvCBDqNMc)

## Networks

> [!NOTE]
> This section is work-in-progress and doesn't include information about all networks yet.

The following sections document our experience with crawling the different networks.

### Ethereum Execution (discv4)

Under the hood Nebula uses packages from [`go-ethereum`](https://github.com/ethereum/go-ethereum) to facilitate peer
communication. Mostly, Nebula relies on the [discover package](https://github.com/ethereum/go-ethereum/tree/master/p2p/discover).
However, we made quite a few changes to the implementation that can be found in
our fork of `go-ethereum` [here](https://github.com/probe-lab/go-ethereum/tree/nebula) in the `nebula` branch.

Most notably, the custom changes include:

- export of internal constants, functions, methods and types to customize their behaviour or call them directly
- changes to the response matcher logic. UDP packets won't be forwarded to all matchers. This was required so that
concurrent requests to the same peer don't lead to unhandled packets

Deployment recommendations:

- CPUs: 4 (better 8)
- Memory > 4 GB
- UDP Read Buffer size >1 MiB (better 4 MiB) via the `--udp-buffer-size=4194304` command line flag or corresponding environment variable `NEBULA_UDP_BUFFER_SIZE`.
You might need to adjust the maximum buffer size on Linux, so that the flag takes effect:
```shell
sysctl -w net.core.rmem_max=8388608 # 8MiB
```
- UDP Response timeout of `3s` (default)
- Workers: 3000

## Maintainers

[@dennis-tra](https://github.com/dennis-tra).
18 changes: 14 additions & 4 deletions cmd/nebula/cmd.go
@@ -21,6 +21,7 @@ const (
flagCategoryDatabase = "Database Configuration:"
flagCategoryDebugging = "Debugging Configuration:"
flagCategoryCache = "Cache Configuration:"
flagCategorySystem = "System Configuration:"
flagCategoryNetwork = "Network Specific Configuration:"
)

@@ -55,10 +56,11 @@ var rootConfig = &config.Root{
ProtocolsCacheSize: 100,
ProtocolsSetCacheSize: 200,
},
RawVersion: version,
BuildCommit: commit,
BuildDate: date,
BuiltBy: builtBy,
UDPBufferSize: 1024 * 1024,
RawVersion: version,
BuildCommit: commit,
BuildDate: date,
BuiltBy: builtBy,
}

func main() {
@@ -218,6 +220,14 @@ func main() {
Destination: &rootConfig.Database.DatabaseSSLMode,
Category: flagCategoryDatabase,
},
&cli.IntFlag{
Name: "udp-buffer-size",
Usage: "The rcv/snd buffer size for the UDP sockets (in bytes)",
EnvVars: []string{"NEBULA_UDP_BUFFER_SIZE"},
Value: rootConfig.UDPBufferSize,
Destination: &rootConfig.UDPBufferSize,
Category: flagCategorySystem,
},
&cli.IntFlag{
Name: "agent-versions-cache-size",
Usage: "The cache size to hold agent versions in memory",
57 changes: 35 additions & 22 deletions cmd/nebula/cmd_crawl.go
@@ -29,19 +29,20 @@ import (
)

var crawlConfig = &config.Crawl{
Root: rootConfig,
CrawlWorkerCount: 1000,
WriteWorkerCount: 10,
CrawlLimit: 0,
PersistNeighbors: false,
FilePathUdgerDB: "",
Network: string(config.NetworkIPFS),
BootstrapPeers: cli.NewStringSlice(),
Protocols: cli.NewStringSlice(string(kaddht.ProtocolDHT)),
AddrTrackTypeStr: "public",
AddrDialTypeStr: "public",
KeepENR: false,
CheckExposed: false,
Root: rootConfig,
CrawlWorkerCount: 1000,
WriteWorkerCount: 10,
CrawlLimit: 0,
PersistNeighbors: false,
FilePathUdgerDB: "",
Network: string(config.NetworkIPFS),
BootstrapPeers: cli.NewStringSlice(),
Protocols: cli.NewStringSlice(string(kaddht.ProtocolDHT)),
AddrTrackTypeStr: "public",
AddrDialTypeStr: "public",
KeepENR: false,
CheckExposed: false,
Discv4RespTimeout: 3 * time.Second,
}

// CrawlCommand contains the crawl sub-command configuration.
@@ -183,6 +184,14 @@ var CrawlCommand = &cli.Command{
Destination: &crawlConfig.KeepENR,
Category: flagCategoryNetwork,
},
&cli.DurationFlag{
Name: "udp-response-timeout",
Usage: "ETHEREUM_EXECUTION: The response timeout for UDP requests in the discv4 DHT",
EnvVars: []string{"NEBULA_CRAWL_UDP_RESPONSE_TIMEOUT"},
Value: crawlConfig.Discv4RespTimeout,
Destination: &crawlConfig.Discv4RespTimeout,
Category: flagCategoryNetwork,
},
},
}

@@ -271,15 +280,19 @@ func CrawlAction(c *cli.Context) error {

// configure the crawl driver
driverCfg := &discv4.CrawlDriverConfig{
Version: cfg.Root.Version(),
DialTimeout: cfg.Root.DialTimeout,
TrackNeighbors: cfg.PersistNeighbors,
BootstrapPeers: bpEnodes,
AddrDialType: cfg.AddrDialType(),
AddrTrackType: cfg.AddrTrackType(),
TracerProvider: cfg.Root.TracerProvider,
MeterProvider: cfg.Root.MeterProvider,
LogErrors: cfg.Root.LogErrors,
Version: cfg.Root.Version(),
DialTimeout: cfg.Root.DialTimeout,
CrawlWorkerCount: cfg.CrawlWorkerCount,
TrackNeighbors: cfg.PersistNeighbors,
BootstrapPeers: bpEnodes,
AddrDialType: cfg.AddrDialType(),
AddrTrackType: cfg.AddrTrackType(),
TracerProvider: cfg.Root.TracerProvider,
MeterProvider: cfg.Root.MeterProvider,
LogErrors: cfg.Root.LogErrors,
KeepENR: cfg.KeepENR,
UDPBufferSize: cfg.Root.UDPBufferSize,
UDPRespTimeout: cfg.Discv4RespTimeout,
}

// init the crawl driver
2 changes: 1 addition & 1 deletion cmd/prefix/gen.go
@@ -57,7 +57,7 @@ func main() {
}

fmt.Println("writing prefixes...")
f, err := os.Create("discvx/prefixmap.go")
f, err := os.Create("discv4/prefixmap.go")
if err != nil {
panic(err)
}
2 changes: 1 addition & 1 deletion config/bootstrap.go
@@ -248,7 +248,7 @@ var (
"/dns/bootnode.1.lightclient.mainnet.avail.so/tcp/37000/p2p/12D3KooW9x9qnoXhkHAjdNFu92kMvBRSiFBMAoC5NnifgzXjsuiM",
}

//BootstrapPeersAvailTuringLightClient
// BootstrapPeersAvailTuringLightClient
BootstrapPeersAvailTuringLightClient = []string{
"/dns/bootnode.1.lightclient.turing.avail.so/tcp/37000/p2p/12D3KooWBkLsNGaD3SpMaRWtAmWVuiZg1afdNSPbtJ8M8r9ArGRT",
}
6 changes: 6 additions & 0 deletions config/config.go
@@ -114,6 +114,9 @@ type Root struct {
// TracerProvider is the tracer provider to use when initialising tracing
TracerProvider trace.TracerProvider

// The buffer size of the UDP sockets (applicable to ETHEREUM_{CONSENSUS,EXECUTION})
UDPBufferSize int

// The raw version of Nebula in the form X.Y.Z. Raw, because it's missing, e.g., commit information (set by GoReleaser or in Makefile)
RawVersion string

@@ -278,6 +281,9 @@ type Crawl struct {

// Whether to keep the full enr record alongside all parsed kv-pairs
KeepENR bool

// The UDP response timeout when crawling the discv4 DHT
Discv4RespTimeout time.Duration
}

func (c *Crawl) AddrTrackType() AddrType {
2 changes: 1 addition & 1 deletion core/engine.go
@@ -445,7 +445,7 @@ func (e *Engine[I, R]) handleWriteResult(ctx context.Context, result Result[Writ
"success": result.Value.Error == nil,
"written": e.writeCount,
"duration": result.Value.Duration,
}).Infoln("Handled writer result")
}).Debugln("Handled writer result")
}

// reachedProcessingLimit returns true if the processing limit is configured
2 changes: 1 addition & 1 deletion core/handler_crawl.go
@@ -70,10 +70,10 @@ func (r CrawlResult[I]) PeerInfo() I {

func (r CrawlResult[I]) LogEntry() *log.Entry {
logEntry := log.WithFields(log.Fields{
"crawlerID": r.CrawlerID,
"remoteID": r.Info.ID().ShortString(),
"isDialable": r.ConnectError == nil && r.CrawlError == nil,
"duration": r.CrawlDuration(),
"rtSize": len(r.RoutingTable.Neighbors),
})

if r.ConnectError != nil {
35 changes: 35 additions & 0 deletions db/errors.go
@@ -38,6 +38,24 @@ var KnownErrors = map[string]string{
"connection gated": models.NetErrorConnectionGated, // transient error
"RESOURCE_LIMIT_EXCEEDED (201)": models.NetErrorCantConnectOverRelay, // transient error
"NO_RESERVATION (204)": models.NetErrorCantConnectOverRelay, // permanent error
// devp2p errors
"no good ip address": models.NetErrorNoIPAddress,
"disconnect requested": models.NetErrorDevp2pDisconnectRequested,
"network error": models.NetErrorDevp2pNetworkError,
"breach of protocol": models.NetErrorDevp2pBreachOfProtocol,
"useless peer": models.NetErrorDevp2pUselessPeer,
"too many peers": models.NetErrorDevp2pTooManyPeers,
"already connected": models.NetErrorDevp2pAlreadyConnected,
"incompatible p2p protocol version": models.NetErrorDevp2pIncompatibleP2PProtocolVersion,
"invalid node identity": models.NetErrorDevp2pInvalidNodeIdentity,
"client quitting": models.NetErrorDevp2pClientQuitting,
"unexpected identity": models.NetErrorDevp2pUnexpectedIdentity,
"connected to self": models.NetErrorDevp2pConnectedToSelf,
"read timeout": models.NetErrorDevp2pReadTimeout,
"subprotocol error": models.NetErrorDevp2pSubprotocolError,
"could not negotiate eth protocol": models.NetErrorDevp2pEthprotocolError,
"handshake failed: EOF": models.NetErrorDevp2pHandshakeEOF, // dependent on error string in discv4
"malformed disconnect message": models.NetErrorDevp2pMalformedDisconnectMessage, // dependent on error string in discv4
}

var ErrorStr = map[string]string{}
@@ -79,7 +97,24 @@ var knownErrorsPrecedence = []string{
"Write on stream",
"RESOURCE_LIMIT_EXCEEDED (201)",
"NO_RESERVATION (204)",
"too many peers",
"no good ip address",
"malformed disconnect message",
"handshake did not complete in time",
"disconnect requested",
"network error",
"breach of protocol",
"useless peer",
"already connected",
"incompatible p2p protocol version",
"invalid node identity",
"client quitting",
"unexpected identity",
"connected to self",
"read timeout",
"subprotocol error",
"could not negotiate eth protocol",
"handshake failed: EOF",
}

// NetError extracts the appropriate error type from the given error.
1 change: 1 addition & 0 deletions db/migrations/000028_add_net_errors.down.sql
@@ -0,0 +1 @@
-- no down migration