Table of Contents generated with DocToc
- Safe: A log that is safe if it has been replicated to a quorum, no matter whether or not the committed flag is set on any replica.
After a replica established leadership, it needs to merge the latest logs from a quorum of replicas, to ensure that the leader replica has all safe logs.
In TRaft, a leader only needs to use the logs from replicas with the max committer((term, id)
).
A log that does not present in any max-committer replicas a leader seen can not be safe.
Proof:
If a log A
becomes safe, it will be seen by the next leader Li
.
Because a leader has to collect logs from a quorum and any two quorum
intersections with each other.
If the next next leader Lj
(j>i
) has seen Li
, it will choose A
.
Otherwise, it will see a replica that has A
.
∴ Any newer leader will choose A
.
∴ TRaft only need to merge logs from replicas with the latest Committer
.
E.g.:
Li: indicates a replica becomes leader
A/B: a log is written
R0 L1 A
R1 A
R2 A
R3 L2 B
R4 L3
------------------------> time
In this digram:
L2 sees A
and then updates its Committer
to L2
, then writes log B
to its
local log. And R3 crashed before forwarding any log out.
-
If L3 established its leadership by contacting R2 and R3, it uses logs from only R3(R3 has the latest committer
L2
), it will seeA
. -
If L3 established its leadership by contacting R0 and R1, it will also see
A
.
都说deadline 是第一生产力. 放假没事, 试试看能不能7天写个raft出来, 当练手了. 2021-02-09 装逼有风险, 万一写不出来也别笑话我.
7天内直播: https://live.bilibili.com/22732677?visit_id=6eoy1n42a1w0
-
Leader election with less conflict. Raft has the issue that in a term, if multiple candidates try to elect theirselves, the conflicts are unacceptable. Because in one term, A voter will only vote for at most one leader.
In our impl,
term
is less strict: A voter is allowed to vote another GREATER leader. -
Out of order commit out of order apply.
We use a bitmap to describe what logs interfere each other.
-
Adopt a more generalized member change algo.
Get rid of single-step member change. Because any of the CORRECT single-step member change algo is equivalent to joint-consensus.
But joint-consensus is a specialization of the TODO (I dont have good name for it yet).
- Leader election
- WAL log
- snapshot: impl with https://github.com/openacid/slim , a static kv-like storage engine supporting protobuf.
- member change with generalized joint consensus.
- Out of order commit/apply if possible.
-
2021-02-10: LOC: +1008 -1
buildMajorityQuorums() import TailBitmap, add Union() TailBitmap.Clone() TailBitmap.Diff()
-
2021-02-11: LOC: +754 -44
refactor buildMajorityQuorums() LeaderId.Cmp() add NewLeaderId add NewBallot() fix: LeaderId.Cmp() accept nil as operand Ballot.CmpLog() to compare only the log-related fields rename Ballot.Accepted to Ballot.AcceptedFrom TailBitmap.Len() refactor design, add ReplicaStatus, remove Ballot. impl Vote Handler, under dev!!! use gogoproto, build clean .pb.go with less code, add serveCluster() to setup a simple cluster for test add NewCmdI64() add NewRecord() add ClusterConfig.GetReplicaInfo() NewTailBitmap() accepts extra bits to set
-
2021-02-12: LOC: +2601 -70
add LeaderId:Clone() refactor TailBitmap.Clone(): use proto draft test of vote rename By to Author, AcceptedFrom to Committer. borrowed concepts from git:DDD add test: HandleVoteReq. add ShortStr() to Cmd, LeaderI, Record and []Record refactor VoteReply: do not use ReplicaStatus to describe log status add readme to record progress readme: impl storage with slim test that voter send back nil log update readme: collect git log TailBitmap: accept second operand to be nil
-
2021-02-13: LOC: +1309 -183
test Replicate, under dev rename Node.Log to Node.Logs add Cmd.Intefering() to check if two commands not be allowed to change execution order add Record.Interfering() update readme NewTailBitmap() support non-64-aligned offset add AddLog() for leader to propose a command use map to store cluster members instead of slice add ClusterConfig.IsQuorum() to check if a set of members is a quorum TRaft.VoteOnce() run one round voting, to establish leadership TRaft actor loop and voting loop
-
2021-02-14: LOC: +891 -229
mainloop as an actor is the only one update traft data. under dev. not passed yet test VoteLoop granted leader merges collected logs after vote, leader merge responded logs add API Propose without replication
-
2021-02-15: LOC: +476 -197
add interface toCmd() to build Cmd from string test Replicate() add TRaft.sleep(): sleep only if it is not stopped.
-
2021-02-16: LOC: +1297 -592
refactor: rename Replicate to LogForward refactor: vote_test: remove unused type, funcs gruop daily log format by date