From 7125076f344b75645639d892aef64d1389ac81d9 Mon Sep 17 00:00:00 2001 From: Anna Shaleva Date: Thu, 25 Apr 2024 20:03:58 +0300 Subject: [PATCH] formal-models: extend dBFT with additional post-commit phase Close #111. Signed-off-by: Anna Shaleva --- formal-models/dbft3.0/dbft.tla | 434 ++++++++++++++++++ .../dbft3.0/dbft___AllGoodModel.launch | 42 ++ 2 files changed, 476 insertions(+) create mode 100644 formal-models/dbft3.0/dbft.tla create mode 100644 formal-models/dbft3.0/dbft___AllGoodModel.launch diff --git a/formal-models/dbft3.0/dbft.tla b/formal-models/dbft3.0/dbft.tla new file mode 100644 index 00000000..cc5c47e9 --- /dev/null +++ b/formal-models/dbft3.0/dbft.tla @@ -0,0 +1,434 @@ +-------------------------------- MODULE dbft -------------------------------- + +EXTENDS + Integers, + FiniteSets + +CONSTANTS + \* RM is the set of consensus node indexes starting from 0. + \* Example: {0, 1, 2, 3} + RM, + + \* RMFault is a set of consensus node indexes that are allowed to become + \* FAULT in the middle of every considered behavior and to send any + \* consensus message afterwards. RMFault must be a subset of RM. An empty + \* set means that all nodes are good in every possible behaviour. + \* Examples: {0} + \* {1, 3} + \* {} + RMFault, + + \* RMDead is a set of consensus node indexes that are allowed to die in the + \* middle of every behaviour and do not send any message afterwards. RMDead + \* must be a subset of RM. An empty set means that all nodes are alive and + \* responding in in every possible behaviour. RMDead may intersect the + \* RMFault set which means that node which is in both RMDead and RMFault + \* may become FAULT and send any message starting from some step of the + \* particular behaviour and may also die in the same behaviour which will + \* prevent it from sending any message. + \* Examples: {0} + \* {3, 2} + \* {} + RMDead, + + \* MaxView is the maximum allowed view to be considered (starting from 0, + \* including the MaxView itself). This constraint was introduced to reduce + \* the number of possible model states to be checked. It is recommended to + \* keep this setting not too high (< N is highly recommended). + \* Example: 2 + MaxView + +VARIABLES + \* rmState is a set of consensus node states. It is represented by the + \* mapping (function) with domain RM and range RMStates. I.e. rmState[r] is + \* the state of the r-th consensus node at the current step. + rmState, + + \* msgs is the shared pool of messages sent to the network by consensus nodes. + \* It is represented by a subset of Messages set. + msgs + +\* vars is a tuple of all variables used in the specification. It is needed to +\* simplify fairness conditions definition. +vars == <> + +\* N is the number of validators. +N == Cardinality(RM) + +\* F is the number of validators that are allowed to be malicious. +F == (N - 1) \div 3 + +\* M is the number of validators that must function correctly. +M == N - F + +\* These assumptions are checked by the TLC model checker once at the start of +\* the model checking process. All the input data (declared constants) specified +\* in the "Model Overview" section must satisfy these constraints. +ASSUME + /\ RM \subseteq Nat + /\ N >= 4 + /\ 0 \in RM + /\ RMFault \subseteq RM + /\ RMDead \subseteq RM + /\ Cardinality(RMFault) <= F + /\ Cardinality(RMDead) <= F + /\ Cardinality(RMFault \cup RMDead) <= F + /\ MaxView \in Nat + /\ MaxView <= 2 + +\* RMStates is a set of records where each record holds the node state and +\* the node current view. +RMStates == [ + type: {"initialized", "prepareSent", "commitSent", "cv", "ackSent", "blockAccepted", "bad", "dead"}, + view : Nat + ] + +\* Messages is a set of records where each record holds the message type, +\* the message sender and sender's view by the moment when message was sent. +Messages == [type : {"PrepareRequest", "PrepareResponse", "Commit", "Ack", "ChangeView"}, rm : RM, view : Nat] + +\* -------------- Useful operators -------------- + +\* IsPrimary is an operator defining whether provided node r is primary +\* for the current round from the r's point of view. It is a mapping +\* from RM to the set of {TRUE, FALSE}. +IsPrimary(r) == rmState[r].view % N = r + +\* GetPrimary is an operator defining mapping from round index to the RM that +\* is primary in this round. +GetPrimary(view) == CHOOSE r \in RM : view % N = r + +\* GetNewView returns new view number based on the previous node view value. +\* Current specifications only allows to increment view. +GetNewView(oldView) == oldView + 1 + +\* CountCommitted returns the number of nodes that have sent the Commit message +\* in the current round or in some other round. +CountCommitted(r) == Cardinality({rm \in RM : Cardinality({msg \in msgs : msg.rm = rm /\ msg.type = "Commit"}) /= 0}) + +\* MoreThanFNodesCommitted returns whether more than F nodes have been committed +\* in the current round (as the node r sees it). +\* +\* IMPORTANT NOTE: we intentionally do not add the "lost" nodes calculation to the specification, and here's +\* the reason: from the node's point of view we can't reliably check that some neighbour is completely +\* out of the network. It is possible that the node doesn't receive consensus messages from some other member +\* due to network delays. On the other hand, real nodes can go down at any time. The absence of the +\* member's message doesn't mean that the member is out of the network, we never can be sure about +\* that, thus, this information is unreliable and can't be trusted during the consensus process. +\* What can be trusted is whether there's a Commit message from some member was received by the node. +MoreThanFNodesCommitted(r) == CountCommitted(r) > F + +\* PrepareRequestSentOrReceived denotes whether there's a PrepareRequest +\* message received from the current round's speaker (as the node r sees it). +PrepareRequestSentOrReceived(r) == [type |-> "PrepareRequest", rm |-> GetPrimary(rmState[r].view), view |-> rmState[r].view] \in msgs + +\* -------------- Safety temporal formula -------------- + +\* Init is the initial predicate initializing values at the start of every +\* behaviour. +Init == + /\ rmState = [r \in RM |-> [type |-> "initialized", view |-> 0]] + /\ msgs = {} + +\* RMSendPrepareRequest describes the primary node r broadcasting PrepareRequest. +RMSendPrepareRequest(r) == + /\ rmState[r].type = "initialized" + /\ IsPrimary(r) + /\ rmState' = [rmState EXCEPT ![r].type = "prepareSent"] + /\ msgs' = msgs \cup {[type |-> "PrepareRequest", rm |-> r, view |-> rmState[r].view]} + /\ UNCHANGED <<>> + +\* RMSendPrepareResponse describes non-primary node r receiving PrepareRequest from +\* the primary node of the current round (view) and broadcasting PrepareResponse. +\* This step assumes that PrepareRequest always contains valid transactions and +\* signatures. +RMSendPrepareResponse(r) == + /\ \/ rmState[r].type = "initialized" + \* We do allow the transition from the "cv" state to the "prepareSent" or "commitSent" stage + \* as it is done in the code-level dBFT implementation by checking the NotAcceptingPayloadsDueToViewChanging + \* condition (see + \* https://github.com/nspcc-dev/dbft/blob/31c1bbdc74f2faa32ec9025062e3a4e2ccfd4214/dbft.go#L419 + \* and + \* https://github.com/neo-project/neo-modules/blob/d00d90b9c27b3d0c3c57e9ca1f560a09975df241/src/DBFTPlugin/Consensus/ConsensusService.OnMessage.cs#L79). + \* However, we can't easily count the number of "lost" nodes in this specification to match precisely + \* the implementation. Moreover, we don't need it to be counted as the RMSendPrepareResponse enabling + \* condition specifies only the thing that may happen given some particular set of enabling conditions. + \* Thus, we've extended the NotAcceptingPayloadsDueToViewChanging condition to consider only MoreThanFNodesCommitted. + \* It should be noted that the logic of MoreThanFNodesCommittedOrLost can't be reliable in detecting lost nodes + \* (even with neo-project/neo#2057), because real nodes can go down at any time. See the comment above the MoreThanFNodesCommitted. + \/ /\ rmState[r].type = "cv" + /\ MoreThanFNodesCommitted(r) + /\ \neg IsPrimary(r) + /\ PrepareRequestSentOrReceived(r) + /\ rmState' = [rmState EXCEPT ![r].type = "prepareSent"] + /\ msgs' = msgs \cup {[type |-> "PrepareResponse", rm |-> r, view |-> rmState[r].view]} + /\ UNCHANGED <<>> + +\* RMSendCommit describes node r sending Commit if there's enough PrepareResponse +\* messages. +RMSendCommit(r) == + /\ \/ rmState[r].type = "prepareSent" + \* We do allow the transition from the "cv" state to the "prepareSent" or "commitSent" stage, + \* see the related comment inside the RMSendPrepareResponse definition. + \/ /\ rmState[r].type = "cv" + /\ MoreThanFNodesCommitted(r) + /\ Cardinality({ + msg \in msgs : /\ (msg.type = "PrepareResponse" \/ msg.type = "PrepareRequest") + /\ msg.view = rmState[r].view + }) >= M + /\ PrepareRequestSentOrReceived(r) + /\ rmState' = [rmState EXCEPT ![r].type = "commitSent"] + /\ msgs' = msgs \cup {[type |-> "Commit", rm |-> r, view |-> rmState[r].view]} + /\ UNCHANGED <<>> + +\* RMSendAck describes node r collecting enough Commit messages and sending +\* the Ack message. +RMSendAck(r) == + /\ rmState[r].type /= "bad" + /\ rmState[r].type /= "dead" + /\ rmState[r].type /= "ackSent" + /\ rmState[r].type /= "blockAccepted" + /\ PrepareRequestSentOrReceived(r) + /\ Cardinality({msg \in msgs : msg.type = "Commit" /\ msg.view = rmState[r].view}) >= M + /\ rmState' = [rmState EXCEPT ![r].type = "ackSent"] + /\ msgs' = msgs \cup {[type |-> "Ack", rm |-> r, view |-> rmState[r].view]} + /\ UNCHANGED <<>> + +\* RMAcceptBlock describes node r collecting enough Ack messages and accepting +\* the block. +RMAcceptBlock(r) == + /\ rmState[r].type = "ackSent" + /\ Cardinality({msg \in msgs : msg.type = "Ack" /\ msg.view = rmState[r].view}) >= M + /\ rmState' = [rmState EXCEPT ![r].type = "blockAccepted"] + /\ UNCHANGED <> + +\* RMSendChangeView describes node r sending ChangeView message on timeout. +RMSendChangeView(r) == + /\ \/ (rmState[r].type = "initialized" /\ \neg IsPrimary(r)) + \/ rmState[r].type = "prepareSent" + /\ LET cv == [type |-> "ChangeView", rm |-> r, view |-> rmState[r].view] + IN /\ cv \notin msgs + /\ rmState' = [rmState EXCEPT ![r].type = "cv"] + /\ msgs' = msgs \cup {[type |-> "ChangeView", rm |-> r, view |-> rmState[r].view]} + +\* RMReceiveChangeView describes node r receiving enough ChangeView messages for +\* view changing. +RMReceiveChangeView(r) == + /\ rmState[r].type /= "bad" + /\ rmState[r].type /= "dead" + /\ rmState[r].type /= "blockAccepted" + /\ rmState[r].type /= "commitSent" + /\ rmState[r].type /= "ackSent" + /\ Cardinality({ + rm \in RM : Cardinality({ + msg \in msgs : /\ msg.type = "ChangeView" + /\ msg.rm = rm + /\ GetNewView(msg.view) >= GetNewView(rmState[r].view) + }) /= 0 + }) >= M + /\ rmState' = [rmState EXCEPT ![r].type = "initialized", ![r].view = GetNewView(rmState[r].view)] + /\ UNCHANGED <> + +\* RMBeBad describes the faulty node r that will send any kind of consensus message starting +\* from the step it's gone wild. This step is enabled only when RMFault is non-empty set. +RMBeBad(r) == + /\ r \in RMFault + /\ Cardinality({rm \in RM : rmState[rm].type = "bad"}) < F + /\ rmState' = [rmState EXCEPT ![r].type = "bad"] + /\ UNCHANGED <> + +\* RMFaultySendCV describes sending CV message by the faulty node r. +RMFaultySendCV(r) == + /\ rmState[r].type = "bad" + /\ LET cv == [type |-> "ChangeView", rm |-> r, view |-> rmState[r].view] + IN /\ cv \notin msgs + /\ msgs' = msgs \cup {cv} + /\ UNCHANGED <> + +\* RMFaultyDoCV describes view changing by the faulty node r. +RMFaultyDoCV(r) == + /\ rmState[r].type = "bad" + /\ rmState' = [rmState EXCEPT ![r].view = GetNewView(rmState[r].view)] + /\ UNCHANGED <> + +\* RMFaultySendPReq describes sending PrepareRequest message by the primary faulty node r. +RMFaultySendPReq(r) == + /\ rmState[r].type = "bad" + /\ IsPrimary(r) + /\ LET pReq == [type |-> "PrepareRequest", rm |-> r, view |-> rmState[r].view] + IN /\ pReq \notin msgs + /\ msgs' = msgs \cup {pReq} + /\ UNCHANGED <> + +\* RMFaultySendPResp describes sending PrepareResponse message by the non-primary faulty node r. +RMFaultySendPResp(r) == + /\ rmState[r].type = "bad" + /\ \neg IsPrimary(r) + /\ LET pResp == [type |-> "PrepareResponse", rm |-> r, view |-> rmState[r].view] + IN /\ pResp \notin msgs + /\ msgs' = msgs \cup {pResp} + /\ UNCHANGED <> + +\* RMFaultySendCommit describes sending Commit message by the faulty node r. +RMFaultySendCommit(r) == + /\ rmState[r].type = "bad" + /\ LET commit == [type |-> "Commit", rm |-> r, view |-> rmState[r].view] + IN /\ commit \notin msgs + /\ msgs' = msgs \cup {commit} + /\ UNCHANGED <> + +\* RMFaultySendAck describes sending Ack message by the faulty node r. +RMFaultySendAck(r) == + /\ rmState[r].type = "bad" + /\ LET ack == [type |-> "Ack", rm |-> r, view |-> rmState[r].view] + IN /\ ack \notin msgs + /\ msgs' = msgs \cup {ack} + /\ UNCHANGED <> + +\* RMDie describes node r that was removed from the network at the particular step +\* of the behaviour. After this node r can't change its state and accept/send messages. +RMDie(r) == + /\ r \in RMDead + /\ Cardinality({rm \in RM : rmState[rm].type = "dead"}) < F + /\ rmState' = [rmState EXCEPT ![r].type = "dead"] + /\ UNCHANGED <> + +\* Terminating is an action that allows infinite stuttering to prevent deadlock on +\* behaviour termination. We consider termination to be valid if at least M nodes +\* has the block being accepted. +Terminating == + /\ Cardinality({rm \in RM : rmState[rm].type = "blockAccepted"}) >= M + /\ UNCHANGED <> + +\* TerminatingFourNodesDeadlock describes node r that is in the state where dBFT stucks in a four +\* nodes scenario. Allow infinite stuttering to prevent TLC deadlock recognition. +TerminatingFourNodesDeadlockSameView(r) == + /\ Cardinality({rm \in RM : rmState[rm].type = "dead" /\ rmState[rm].view = rmState[r].view}) = 1 + /\ Cardinality({rm \in RM : rmState[rm].type = "cv" /\ rmState[rm].view = rmState[r].view}) = 2 + /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent" /\ rmState[rm].view = rmState[r].view}) = 1 + /\ UNCHANGED <> + +\* TerminatingFourNodesDeadlock describes node r that is in the state where dBFT stucks in a four +\* nodes scenario and the same view. Allow infinite stuttering to prevent TLC deadlock recognition. +TerminatingFourNodesDeadlock == + /\ Cardinality({rm \in RM : rmState[rm].type = "dead"}) = 1 + /\ Cardinality({rm \in RM : rmState[rm].type = "cv"}) = 2 + /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent"}) = 1 + /\ UNCHANGED <> + +\* Next is the next-state action describing the transition from the current state +\* to the next state of the behaviour. +Next == + \/ Terminating + \/ \E r \in RM: + RMSendPrepareRequest(r) \/ RMSendPrepareResponse(r) \/ RMSendCommit(r) \/ RMSendAck(r) + \/ RMAcceptBlock(r) \/ RMSendChangeView(r) \/ RMReceiveChangeView(r) + \/ RMDie(r) \/ RMBeBad(r) + \/ RMFaultySendCV(r) \/ RMFaultyDoCV(r) \/ RMFaultySendCommit(r) \/ RMFaultySendAck(r) \/ RMFaultySendPReq(r) \/ RMFaultySendPResp(r) + \/ TerminatingFourNodesDeadlock +\* \/ TerminatingFourNodesDeadlockSameView(r) + +\* Safety is a temporal formula that describes the whole set of allowed +\* behaviours. It specifies only what the system MAY do (i.e. the set of +\* possible allowed behaviours for the system). It asserts only what may +\* happen; any behaviour that violates it does so at some point and +\* nothing past that point makes difference. +\* +\* E.g. this safety formula (applied standalone) allows the behaviour to end +\* with an infinite set of stuttering steps (those steps that DO NOT change +\* neither msgs nor rmState) and never reach the state where at least one +\* node is committed or accepted the block. +\* +\* To forbid such behaviours we must specify what the system MUST +\* do. It will be specified below with the help of fairness conditions in +\* the Fairness formula. +Safety == Init /\ [][Next]_vars + +\* -------------- Fairness temporal formula -------------- + +\* Fairness is a temporal assumptions under which the model is working. +\* Usually it specifies different kind of assumptions for each/some +\* subactions of the Next's state action, but the only think that bothers +\* us is preventing infinite stuttering at those steps where some of Next's +\* subactions are enabled. Thus, the only thing that we require from the +\* system is to keep take the steps until it's impossible to take them. +\* That's exactly how the weak fairness condition works: if some action +\* remains continuously enabled, it must eventually happen. +Fairness == WF_vars(Next) + +\* -------------- Specification -------------- + +\* The complete specification of the protocol written as a temporal formula. +Spec == Safety /\ Fairness + +\* -------------- Liveness temporal formula -------------- + +\* For every possible behaviour it's true that eventually (i.e. at least once +\* through the behaviour) block will be accepted. It is something that dBFT +\* must guarantee (an in practice this condition is violated). +TerminationRequirement == <>( + \/ Cardinality({r \in RM : rmState[r].type = "blockAccepted"}) >= M + \/ /\ Cardinality({rm \in RM : rmState[rm].type = "dead"}) = 1 + /\ Cardinality({rm \in RM : rmState[rm].type = "cv"}) = 2 + /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent"}) = 1 +\* \/ \E r \in RM: /\ Cardinality({rm \in RM : rmState[rm].type = "dead" /\ rmState[rm].view = rmState[r].view}) = 1 +\* /\ Cardinality({rm \in RM : rmState[rm].type = "cv" /\ rmState[rm].view = rmState[r].view}) = 2 +\* /\ Cardinality({rm \in RM : rmState[rm].type = "commitSent" /\ rmState[rm].view = rmState[r].view}) = 1 + ) + +\* A liveness temporal formula asserts only what must happen (i.e. specifies +\* what the system MUST do). Any behaviour can NOT violate it at ANY point; +\* there's always the rest of the behaviour that can always make the liveness +\* formula true; if there's no such behaviour than the liveness formula is +\* violated. The liveness formula is supposed to be checked as a property +\* by the TLC model checker. +Liveness == TerminationRequirement + +\* -------------- ModelConstraints -------------- + +\* MaxViewConstraint is a state predicate restricting the number of possible +\* behaviour states. It is needed to reduce model checking time and prevent +\* the model graph size explosion. This formulae must be specified at the +\* "State constraint" section of the "Additional Spec Options" section inside +\* the model overview. +MaxViewConstraint == /\ \A r \in RM : rmState[r].view <= MaxView + /\ \A msg \in msgs : msg.view <= MaxView + +\* -------------- Invariants of the specification -------------- + +\* Model invariant is a state predicate (statement) that must be true for +\* every step of every reachable behaviour. Model invariant is supposed to +\* be checked as an Invariant by the TLC Model Checker. + +\* TypeOK is a type-correctness invariant. It states that all elements of +\* specification variables must have the proper type throughout the behaviour. +TypeOK == + /\ rmState \in [RM -> RMStates] + /\ msgs \subseteq Messages + +\* InvTwoBlocksAccepted states that there can't be two different blocks accepted in +\* the two different views, i.e. dBFT must not allow forks. +InvTwoBlocksAccepted == \A r1 \in RM: + \A r2 \in RM \ {r1}: + \/ rmState[r1].type /= "blockAccepted" + \/ rmState[r2].type /= "blockAccepted" + \/ rmState[r1].view = rmState[r2].view + +\* InvFaultNodesCount states that there can be F faulty or dead nodes at max. +InvFaultNodesCount == Cardinality({ + r \in RM : rmState[r].type = "bad" \/ rmState[r].type = "dead" + }) <= F + +\* This theorem asserts the truth of the temporal formula whose meaning is that +\* the state predicates TypeOK, InvTwoBlocksAccepted and InvFaultNodesCount are +\* the invariants of the specification Spec. This theorem is not supposed to be +\* checked by the TLC model checker, it's here for the reader's understanding of +\* the purpose of TypeOK, InvTwoBlocksAccepted and InvFaultNodesCount. +THEOREM Spec => [](TypeOK /\ InvTwoBlocksAccepted /\ InvFaultNodesCount) + +============================================================================= +\* Modification History +\* Last modified Thu Apr 25 20:03:18 MSK 2024 by anna +\* Last modified Mon Mar 06 15:36:57 MSK 2023 by root +\* Last modified Sat Jan 21 01:26:16 MSK 2023 by rik +\* Created Thu Dec 15 16:06:17 MSK 2022 by anna diff --git a/formal-models/dbft3.0/dbft___AllGoodModel.launch b/formal-models/dbft3.0/dbft___AllGoodModel.launch new file mode 100644 index 00000000..52b9984e --- /dev/null +++ b/formal-models/dbft3.0/dbft___AllGoodModel.launch @@ -0,0 +1,42 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +