diff --git a/css/guide.css b/css/guide.css
index faf93c9..8ed93e2 100644
--- a/css/guide.css
+++ b/css/guide.css
@@ -300,6 +300,38 @@ table.defs td:first-child {
   padding-right: 1rem;
 }
 
+table.clock-values {
+  font-size: 13px;
+  margin: 1rem 0 2rem;
+  background-color: #f9d2e9;
+}
+
+table.clock-values td, th {
+  text-align: center;
+  vertical-align: middle;
+  border-right: solid thin #fff;
+}
+
+table.clock-values thead {
+  color: #fff;
+}
+
+table.clock-values tr {
+  border-bottom: solid thin #fff;
+}
+
+table.clock-values thead tr:nth-child(1) {
+  background-color: #ca82c9;
+}
+
+table.clock-values thead tr:nth-child(2) {
+  background-color: #f29bce;
+}
+
+table.clock-values tbody td:first-child {
+  font-weight: bold;
+}
+
 .small {
   font-size: 13px;
 }
@@ -355,7 +387,7 @@ table.defs td:first-child {
   grid-column: right;
 }
 
-.request, .response, .request-arrow, .response-arrow {
+.request, .response, .request-arrow, .response-arrow, .clock-table {
   margin: 12px 0;
 }
 
diff --git a/index.html b/index.html
index c5465d1..af11adb 100644
--- a/index.html
+++ b/index.html
@@ -54,6 +54,7 @@

Contents

Structure
Message format
createHistoryStream
+
EBT replication
@@ -1086,6 +1087,188 @@
Implementations
}
+ +

Epidemic Broadcast Tree (EBT) Replication

+ +

In addition to classic replication using createHistoryStream, some Scuttlebutt clients implement a more efficient form of replication known as Epidemic broadcast tree replication. This is often referred to by the abbreviation EBT. The implementation of EBT used in Scuttlebutt is loosely based on the push-lazy-push multicast tree protocol, more commonly known as the Plumtree protocol [1].

+

Session Initiation

+

An EBT session may be initiated once two peers have completed the secret handshake and have established their respective box streams. The peer who acted as the client during the secret handshake takes on the role of the requester, sending an ["ebt", "replicate"] request to the connected peer.

+
+
+
+ Request number: 1
+ Body type: JSON
+ Stream: Yes
+ End/err: No
+
+
{
+  "name": ["ebt", "replicate"],
+  "type": "duplex",
+  "args": [
+    {
+      "version": 3,
+      "format": "classic"
+    }
+  ]
+}
+
+ +
+

The peer who acted as the server during the secret handshake takes on the role of the responder. After receiving the replicate request, the responder first validates the arguments to ensure that the version is 3 and the format is "classic". If either of those values is incorrect, the responder terminates the stream with an error.
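
The argument check described above can be sketched as follows. This is a minimal illustration rather than the behaviour of any particular client: the function name `validate_replicate_args` and the use of `ValueError` to signal "terminate the stream with an error" are assumptions made for the example.

```python
def validate_replicate_args(args: list) -> None:
    """Raise ValueError if the args of an ["ebt", "replicate"] request are unacceptable.

    Hypothetical helper: a real implementation would end the stream with an
    error response instead of raising a Python exception.
    """
    # The request carries exactly one options object in its "args" array.
    if not isinstance(args, list) or len(args) != 1 or not isinstance(args[0], dict):
        raise ValueError("expected exactly one options object in args")

    options = args[0]
    # The responder accepts only version 3 ...
    if options.get("version") != 3:
        raise ValueError("unsupported EBT version")
    # ... and only the "classic" feed format.
    if options.get("format") != "classic":
        raise ValueError("unsupported feed format")
```

A responder would call this with the `args` array of the incoming request before sending its first vector clock.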

+

Vector Clocks

+

The responder then sends a vector clock (also known as a "note" or "control message") to the requester. The vector clock takes the form of a JSON object with one or more key-value pairs. The key of each pair specifies a Scuttlebutt feed identified by the @-prefixed public key of the author. The value of each pair is a signed integer encoding a replicate flag, a receive flag and a feed sequence number. +

+ +
+
+ Request number: -1
+ Body type: JSON
+ Stream: Yes
+ End/err: No
+
+
{
+  "@qK93G/R9R5J2fiqK+kxV72HqqPUcss+rth8rACcYr4s=.ed25519": 450,
+  "@L/g6qZQE/2FdO2UhSJ0uyDiZb5LjJLatM/d8MN+INSM=.ed25519": 12,
+  "@fGHFd8rUgoznX/qS/1U7HPF3vnirbSyfaaWlS8cCWR0=.ed25519": 1,
+  "@HEqy940T6uB+T+d9Jaa58aNfRzLx9eRWqkZljBmnkmk=.ed25519": -1
+}
+
+
+

The requester terminates the stream with an error if any of the received feed identifiers or encoded values are malformed. If the received vector clock is valid, the requester can proceed with decoding the values.
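
A rough sketch of the validity check is shown below. The guide does not define exactly what counts as malformed, so the rules here are assumptions: a feed identifier is taken to be `@`, followed by the 44-character base64 encoding of a 32-byte Ed25519 public key, followed by `.ed25519`, and an encoded value is taken to be any JSON integer. The names `FEED_ID_PATTERN` and `validate_vector_clock` are hypothetical.

```python
import re

# Assumed shape of a classic feed ID: "@" + 43 base64 characters + "=" padding
# (the base64 encoding of 32 bytes) + the ".ed25519" suffix.
FEED_ID_PATTERN = re.compile(r"@[A-Za-z0-9+/]{43}=\.ed25519")


def validate_vector_clock(clock: object) -> dict:
    """Return the clock unchanged if it looks well formed, else raise ValueError."""
    if not isinstance(clock, dict):
        raise ValueError("vector clock must be a JSON object")
    for feed_id, value in clock.items():
        if not isinstance(feed_id, str) or FEED_ID_PATTERN.fullmatch(feed_id) is None:
            raise ValueError("malformed feed identifier: %r" % (feed_id,))
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError("malformed encoded value for %s" % feed_id)
    return clock
```

Negative values other than -1 are deliberately not rejected, since the text only says that a negative value is usually -1.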

+

The value in each key-value pair of a vector clock encodes a maximum of three data points: a replicate flag, a receive flag and a sequence number. A negative value (usually -1) signals that the responder does not wish to replicate the associated feed, neither sending nor receiving messages. In this scenario, the replicate flag is set to false and both the receive flag and sequence number are irrelevant.

+

A zero or positive value signals that the responder wishes to replicate the associated feed. Such a value is decoded as follows. First, the JSON number is parsed and converted to a signed integer. Then, the rightmost (lowest order) bit of the number is interpreted as a binary flag, with 0 equal to true and 1 equal to false. This flag is referred to as the receive flag. Next, a sign-extending right shift (also called an arithmetic right shift) by 1 bit is performed on the binary number, discarding the rightmost (lowest order) bit. The remaining number is then interpreted as the sequence number for the associated feed.

+

If the receive flag is set to true, the peer who sent the vector clock wishes to receive messages for the associated feed. The decoded sequence number defines the latest message held by the peer for that feed.
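
The decoding steps above can be sketched in a few lines. The names `ClockValue` and `decode_clock_value` are hypothetical, and representing the "irrelevant" fields as `None` is a choice made for this example only.

```python
from typing import NamedTuple, Optional


class ClockValue(NamedTuple):
    replicate: bool
    receive: Optional[bool]   # None when the flag is irrelevant
    sequence: Optional[int]   # None when the sequence is irrelevant


def decode_clock_value(value: int) -> ClockValue:
    """Decode one encoded vector clock value into its three data points."""
    if value < 0:
        # Any negative value (usually -1) means "do not replicate this feed".
        return ClockValue(replicate=False, receive=None, sequence=None)
    # Lowest-order bit: 0 means receive is true, 1 means receive is false.
    receive = (value & 1) == 0
    # Arithmetic right shift by one bit drops the flag and leaves the sequence.
    sequence = value >> 1
    return ClockValue(replicate=True, receive=receive, sequence=sequence)
```

For instance, `decode_clock_value(450)` yields a replicate flag of true, a receive flag of true and a sequence of 225, while `decode_clock_value(3)` yields a receive flag of false and a sequence of 1.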

+

Encoding a vector clock value involves reversing the steps outlined above. If the peer does not wish to replicate a feed, the value is simply set to -1. Otherwise, the latest sequence number of the associated feed is stored as a signed integer and an arithmetic left shift by 1 bit is performed. The rightmost (lowest order) bit is then set according to the receive flag as described previously (0 for true, 1 for false).

Encoded   Decoded
          Replicate flag   Receive flag   Sequence
-1        False            Irrelevant     Irrelevant
0         True             True           0
1         True             False          0
2         True             True           1
3         True             False          1
12        True             True           6
450       True             True           225
+
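
The encoding direction can be sketched in the same spirit and checked against the rows of the table above. The function name `encode_clock_value` and its parameter defaults are assumptions made for illustration.

```python
def encode_clock_value(replicate: bool, receive: bool = True, sequence: int = 0) -> int:
    """Encode a replicate flag, receive flag and sequence number as one integer."""
    if not replicate:
        # Receive flag and sequence are irrelevant when the feed is not replicated.
        return -1
    if sequence < 0:
        raise ValueError("sequence numbers cannot be negative")
    # Shift the sequence left by one bit to make room for the receive flag.
    value = sequence << 1
    # Lowest-order bit: 0 means receive is true, 1 means receive is false.
    if not receive:
        value |= 1
    return value
```

With this sketch, a feed held up to sequence 225 that the peer still wants to receive encodes as `encode_clock_value(True, True, 225)`, which is 450.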

The requester then sends their own vector clock to the responder. At this point, the initial exchange of vector clocks is complete and both peers may begin sending messages at will. Updated vector clocks may continue to be sent by both peers at any point during the session. These updated clocks may reference a subset of the feeds represented in the initial vector clock, or they may reference different feeds entirely. This provides a means for responding to state changes in the local database and follow graph.

+ +
+
+
+ Request number: 1
+ Body type: JSON
+ Stream: Yes
+ End/err: No
+
+
{
+  "@fGHFd8rUgoznX/qS/1U7HPF3vnirbSyfaaWlS8cCWR0=.ed25519": 62,
+  "@qK93G/R9R5J2fiqK+kxV72HqqPUcss+rth8rACcYr4s=.ed25519": -1,
+  "@L/g6qZQE/2FdO2UhSJ0uyDiZb5LjJLatM/d8MN+INSM=.ed25519": 10
+}
+
+ +
+

Messages are sent in exactly the same way as when responding to a createHistoryStream request. +

+ +
+
+ Request number: -1
+ Body type: JSON
+ Stream: Yes
+ End/err: No
+
+
{
+  "previous": "%GvRqbiZFY0cNyGRD4QwjeMQvnrjz9vMPBsT5JdWrcW0=.sha256",
+  "author": "@L/g6qZQE/2FdO2UhSJ0uyDiZb5LjJLatM/d8MN+INSM=.ed25519",
+  "sequence": 5,
+  "timestamp": 1698824093970,
+  "hash": "sha256",
+  "content": {
+    "type": "contact",
+    "contact": "@fGHFd8rUgoznX/qS/1U7HPF3vnirbSyfaaWlS8cCWR0=.ed25519",
+    "blocking": false,
+    "following": true
+  },
+  "signature": "8+VQ7sbv0fnD8ZnbunJ19fyvtcwSpHvhlWUakj32nU4woFNNpI
+                qpvkAJ4GMGJdYHoqc8C7asPXPa+wMzbPR1Cw==.sig.ed25519"
+}
+
+
+
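
Once both clocks have been exchanged, each peer can work out which messages the other side is missing. The sketch below is one plausible way to do this and is not taken from any existing client. It assumes `local_latest` maps each feed the local peer replicates to its latest stored sequence, that `remote_clock` holds the encoded values received from the remote peer, and that feed names such as `@alice` are shortened placeholders rather than real identifiers.

```python
from typing import Dict


def messages_to_send(local_latest: Dict[str, int], remote_clock: Dict[str, int]) -> Dict[str, range]:
    """Return, per feed, the range of sequence numbers the remote peer lacks."""
    to_send: Dict[str, range] = {}
    for feed_id, encoded in remote_clock.items():
        if encoded < 0:
            continue  # remote peer does not replicate this feed
        if encoded & 1:
            continue  # receive flag is false: remote peer does not want messages
        remote_seq = encoded >> 1
        # Feeds the local peer does not hold are treated as sequence 0.
        local_seq = local_latest.get(feed_id, 0)
        if local_seq > remote_seq:
            to_send[feed_id] = range(remote_seq + 1, local_seq + 1)
    return to_send


# Hypothetical state: the local peer holds @alice up to 6 and @bob up to 225.
local_latest = {"@alice": 6, "@bob": 225}
# The remote peer has @alice up to 5, declines @bob, and has @carol up to 31.
remote_clock = {"@alice": 10, "@bob": -1, "@carol": 62}
pending = messages_to_send(local_latest, remote_clock)
```

In this example only sequence 6 of `@alice` is pending: the remote peer declined `@bob`, and the local peer holds nothing newer for `@carol`.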

Session Termination

+

An EBT session may be terminated by either peer at any point, by sending an error response or by closing the stream. If no error has occurred, the stream is closed when a peer wishes to conclude the session (as described in the Source example of the RPC protocol section above).

+

Request Skipping

+

EBT implementations rely on a mechanism known as request skipping to lower bandwidth overhead and increase replication efficiency. Each peer stores the vector clocks they receive from remote peers; these may be held in memory and persisted to disk to allow later retrieval. When a subsequent EBT session is initiated between peers, each peer first checks the stored vector clock of their remote peer before calculating an updated vector clock to be sent. If the latest locally available sequence of a feed matches the sequence recorded for that feed in the stored vector clock of the remote peer, that feed is left out of the new vector clock in the outgoing request (hence the name request skipping). This limits the number of bytes sent over the wire.

+ +

The stored vector clock for the remote peer may differ from their current vector clock. In that case, the remote peer will include the updated feed in their request and the local peer will respond by sending an additional partial vector clock including their sequence for that feed. Once both sides have exchanged their sequence for a particular feed, replication of messages in that feed may occur.
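
Request skipping can be sketched as a filter applied while building the outgoing vector clock. This is an illustration under stated assumptions, not the logic of any specific implementation: `build_outgoing_clock` is a hypothetical name, `local_latest` maps feeds to the latest locally available sequence, `stored_remote_clock` holds the encoded values last received from the remote peer, and the receive flag is always set to true for simplicity.

```python
from typing import Dict


def build_outgoing_clock(local_latest: Dict[str, int], stored_remote_clock: Dict[str, int]) -> Dict[str, int]:
    """Build a vector clock, skipping feeds the remote peer already matches."""
    outgoing: Dict[str, int] = {}
    for feed_id, local_seq in local_latest.items():
        stored = stored_remote_clock.get(feed_id)
        # Skip the feed when the stored remote sequence equals the local one.
        if stored is not None and stored >= 0 and (stored >> 1) == local_seq:
            continue
        # Otherwise include it, encoded with the receive flag set to true (bit 0).
        outgoing[feed_id] = local_seq << 1
    return outgoing


# Hypothetical state: @a is in sync with the stored clock, @b and @c are not.
clock = build_outgoing_clock({"@a": 5, "@b": 7, "@c": 2}, {"@a": 10, "@b": 12})
```

Here `@a` is skipped because the stored value 10 decodes to sequence 5, which matches the local sequence, while `@b` and `@c` are included as 14 and 4.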

+

Vector Clock Partitioning

+

In order to further increase efficiency when connecting to multiple peers, feeds for which the local peer would like to receive updates are only sent to one peer at a time (in the outbound vector clock). A timeout may be used to request the feed from an alternate peer if no updates are available from the initial peer. In this way, the total set of requested feeds is spread across multiple peers.
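
One way to picture this partitioning is a small bookkeeping structure that remembers which peer each feed is currently requested from and moves the feed to another peer after a timeout. The class below is a hypothetical sketch: the name `FeedPartitioner`, the least-loaded selection rule and the explicit `now` timestamps are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple


class FeedPartitioner:
    """Tracks which single peer each wanted feed is currently requested from."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        # feed_id -> (peer the feed is requested from, time of request or last update)
        self.assignments: Dict[str, Tuple[str, float]] = {}

    def _least_loaded(self, candidates: List[str]) -> str:
        load = {peer: 0 for peer in candidates}
        for peer, _ in self.assignments.values():
            if peer in load:
                load[peer] += 1
        # min() keeps list order on ties, so the choice is deterministic.
        return min(candidates, key=lambda peer: load[peer])

    def peer_for(self, feed_id: str, peers: List[str], now: float) -> Optional[str]:
        """Return the one peer whose outbound clock should request this feed."""
        if not peers:
            return None
        candidates = peers
        current = self.assignments.get(feed_id)
        if current is not None:
            peer, since = current
            if peer in peers and now - since < self.timeout_seconds:
                return peer  # still waiting on the current peer
            # Timed out or disconnected: prefer any other connected peer.
            del self.assignments[feed_id]
            candidates = [p for p in peers if p != peer] or peers
        chosen = self._least_loaded(candidates)
        self.assignments[feed_id] = (chosen, now)
        return chosen

    def record_update(self, feed_id: str, now: float) -> None:
        """Reset the timeout when a message for the feed arrives."""
        if feed_id in self.assignments:
            peer, _ = self.assignments[feed_id]
            self.assignments[feed_id] = (peer, now)
```

With two connected peers and a 30 second timeout, two feeds requested at the same moment land on different peers, and a feed that has produced no updates after 30 seconds is moved to the other peer on the next call to `peer_for`.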

+

Fallback Mechanisms

+

EBT is the preferred means for peers to exchange messages. However, not all Scuttlebutt clients support EBT replication. If only one of two connected peers supports EBT, both peers may instead fall back to using createHistoryStream to exchange messages. There are several scenarios which may trigger initiation of createHistoryStream replication:

+ +

[1] Joao Leitao, Jose Pereira and Luis Rodrigues. 2007. Epidemic Broadcast Trees. In 2007 26th IEEE International Symposium on Reliable Distributed Systems (SRDS 2007), 301-310. https://doi.org/10.1109/SRDS.2007.27

Metafeeds