From 60620f59a699b2fdf5e901cbdb722361d3159dbd Mon Sep 17 00:00:00 2001
From: Christopher Haster
Date: Wed, 23 Oct 2024 15:09:45 -0500
Subject: [PATCH] Rewording more math - Berlekamp-Massey stuff
---
README.md | 152 ++++++++++++++++++++++++++++--------------------------
1 file changed, 79 insertions(+), 73 deletions(-)
diff --git a/README.md b/README.md
index 63d946e..eb27e97 100644
--- a/README.md
+++ b/README.md
@@ -243,33 +243,34 @@ syndromes $S_i$:
-The next step is figuring our the locations of our errors $X_j$.
+The next step is figuring out the error-locations, $X_j$.
-To help with this, we introduce another polynomial, the "error-locator
-polynomial" $\Lambda(x)$:
+To help with this, we introduce a very special polynomial, the
+"error-locator polynomial", $\Lambda(x)$:
This polynomial has some rather useful properties:
-1. For any $X_j$, $\Lambda(X_j^{-1}) = 0$.
+1. For any error-location, $X_j$, $\Lambda(X_j^{-1}) = 0$.
- This is for similar reasons why $P(g^i) = 0$. For any $X_j$:
+ This is for reasons similar to why $P(g^i) = 0$. For any error-location
+ $X_j$:
@@ -282,29 +283,34 @@ This polynomial has some rather useful properties:
- This prevents trivial solutions and is what makes $\Lambda(x)$ useful.
+ This 1 prevents trivial solutions, and is what makes $\Lambda(x)$
+ useful.
-But what's _really_ interesting, is that these two properties allow us to
+But what's _really_ interesting is that these two properties allow us to
solve for $\Lambda(x)$ with only our syndromes $S_i$.
-First note that since $\Lambda(x)$ has $e$ roots, we can expand it into
-an $e$ degree polynomial. We also know that $\Lambda(0) = 1$, so the
-constant term must be 1. If we name the coefficients of this polynomial
-$\Lambda_k$, this gives us another definition of $\Lambda(x)$:
+We know $\Lambda(x)$ has $e$ roots, which means we can expand it into a
+polynomial with $e+1$ terms. We also know that $\Lambda(0) = 1$, so the
+constant term must be 1. Giving the coefficients of this expanded
+polynomial the arbitrary names
+$\Lambda_1, \Lambda_2, \cdots, \Lambda_e$, we end up with another
+definition of $\Lambda(x)$:
-Plugging in $X_j^{-1}$ should still evaluate to zero:
+Note this doesn't change our error-locator, $\Lambda(x)$; it still has
+all of its original properties. For example, plugging in $X_j^{-1}$
+should still evaluate to zero:
@@ -328,8 +334,8 @@ zero:
@@ -337,45 +343,45 @@ Wait a second...
-Aren't these our syndromes? $S_i$?
+Aren't these our syndromes? $S_i = \sum_{j \in E} Y_j X_j^i$?
-We can rearrange this into an equation for $S_i$ using only our
-coefficients and $e$ previously seen syndromes $S_{i-k}$:
+They are! We can rearrange this into an equation for $S_i$ using only our
+coefficients, $\Lambda_k$, and $e$ previously seen syndromes,
+$S_{i-1}, S_{i-2}, \cdots, S_{i-e}$:
-The only problem is this is one equation with $e$ unknowns, our
-coefficients $\Lambda_k$.
-
-But if we repeat this for $e$ syndromes, $S_{e}$ to $S_{n-1}$, we can
-build $e$ equations for $e$ unknowns, and create a system of equations
-that is solvable. This is why we need $n=2e$ syndromes/fixed-points to
-solve for $e$ errors:
+If we repeat this $e$ times, for syndromes
+$S_e, S_{e+1}, \cdots, S_{n-1}$, we end up with $e$ equations and
+$e$ unknowns, a system that is, in theory, solvable:
+This is where the $n=2e$ requirement comes from: we need $n=2e$
+syndromes to solve for $e$ errors at unknown locations.
+
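As a rough illustration of "in theory, solvable", here is a naive sketch that builds the $e$ equations and solves them with Gaussian elimination. It assumes GF(2^8) with the reducing polynomial `0x11d`, where addition and subtraction are both xor. The helper names (`gf_mul`, `gf_pow`, `gf_inv`, `solve_locator`) and the example error values are hypothetical, chosen only for this example.

```python
def gf_mul(a, b):
    # multiply in GF(2^8), assuming the reducing polynomial 0x11d
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r

def gf_pow(a, n):
    # a^n by repeated multiplication
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a):
    # a^254 = a^-1 for any nonzero a in GF(2^8)
    return gf_pow(a, 254)

def solve_locator(S, e):
    # one row per syndrome S_e..S_{2e-1}:
    #   S_i = Λ1*S_{i-1} + Λ2*S_{i-2} + ... + Λe*S_{i-e}
    # stored as an augmented matrix [S_{i-1} ... S_{i-e} | S_i]
    A = [[S[i-k] for k in range(1, e+1)] + [S[i]]
         for i in range(e, 2*e)]
    for c in range(e):
        # find a nonzero pivot, this raises StopIteration if the
        # system is singular (fewer than e errors)
        p = next(r for r in range(c, e) if A[r][c])
        A[c], A[p] = A[p], A[c]
        # normalize the pivot row
        inv = gf_inv(A[c][c])
        A[c] = [gf_mul(inv, x) for x in A[c]]
        # eliminate this column from every other row
        for r in range(e):
            if r != c and A[r][c]:
                f = A[r][c]
                A[r] = [x ^ gf_mul(f, y) for x, y in zip(A[r], A[c])]
    # coefficients lowest-degree first, with the constant term 1
    return [1] + [A[k][e] for k in range(e)]

# two made-up errors as (X_j, Y_j) pairs
errors = [(0x08, 0x55), (0x74, 0xaa)]
e = len(errors)
n = 2*e

# S_i = sum of Y_j X_j^i over all errors
S = [0]*n
for i in range(n):
    for X, Y in errors:
        S[i] ^= gf_mul(Y, gf_pow(X, i))

lam = solve_locator(S, e)
print([hex(c) for c in lam])
```

With $e=2$ the recovered coefficients should match the expanded product $(1 - X_1 x)(1 - X_2 x)$, so $\Lambda_1 = X_1 + X_2$ and $\Lambda_2 = X_1 X_2$. Elimination like this costs roughly $O(e^3)$ field operations and needs $e$ known up front.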
#### Berlekamp-Massey
Ok that's the theory, but solving this system of equations efficiently is
@@ -383,10 +389,10 @@ still quite difficult.
Enter the Berlekamp-Massey algorithm.
-The key observation here by Massey, is that solving for $\Lambda(x)$ is
-equivalent to constructing an LFSR, that when given the initial sequence
-$S_0, S_1, \dots, S_{e-1}$, generates the sequence
-$S_e, S_{e+1}, \dots, S_{n-1}$:
+The key observation by Massey is that solving for $\Lambda(x)$ is
+equivalent to constructing an LFSR that generates the sequence
+$S_e, S_{e+1}, \dots, S_{n-1}$, given the initial state
+$S_0, S_1, \dots, S_{e-1}$:
```
.---- + <- + <- + <- + <--- ... --- + <--.
@@ -394,26 +400,26 @@ $S_e, S_{e+1}, \dots, S_{n-1}$:
| *Λ1 *Λ2 *Λ3 *Λ4 ... *Λe-1 *Λe
| ^ ^ ^ ^ ^ ^
| .-|--.-|--.-|--.-|--.-- --.-|--.-|--.
-'-> |Se-1|Se-2|Se-3|Se-4| ... | S1 | S0 | -> Sn-1 Sn-2 Sn-3 Sn-4 ... S1 S0
+'-> |Se-1|Se-2|Se-3|Se-4| ... | S1 | S0 | -> Sn-1 Sn-2 ... Se+3 Se+2 Se+1 Se Se-1 Se-2 ... S3 S2 S1 S0
'----'----'----'----'-- --'----'----'
```
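The diagram above can be sketched in a few lines of code. This is only a minimal sketch, assuming GF(2^8) with the reducing polynomial `0x11d` and assuming $\Lambda(x)$ is the product of $(1 - X_j x)$ over all errors. The function names and the example error values are hypothetical. It builds $\Lambda(x)$ from known error-locations, then checks that the LFSR, seeded with $S_0, S_1, \dots, S_{e-1}$, regenerates the remaining syndromes.

```python
def gf_mul(a, b):
    # multiply in GF(2^8), assuming the reducing polynomial 0x11d
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return r

def gf_pow(a, n):
    # a^n by repeated multiplication
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def syndromes(errors, n):
    # S_i = sum of Y_j X_j^i, with errors given as (X_j, Y_j) pairs
    S = []
    for i in range(n):
        s = 0
        for X, Y in errors:
            s ^= gf_mul(Y, gf_pow(X, i))
        S.append(s)
    return S

def error_locator(errors):
    # expand the product of (1 - X_j x), coefficients lowest-degree
    # first, subtraction is xor in GF(2^8)
    lam = [1]
    for X, _ in errors:
        nxt = lam + [0]
        for k, c in enumerate(lam):
            nxt[k+1] ^= gf_mul(c, X)
        lam = nxt
    return lam

def lfsr_generate(lam, state, n):
    # extend the initial state to n symbols, each new symbol is
    # Λ1*s_{i-1} + Λ2*s_{i-2} + ... + Λe*s_{i-e}
    e = len(lam) - 1
    s = list(state)
    for i in range(len(s), n):
        x = 0
        for k in range(1, e+1):
            x ^= gf_mul(lam[k], s[i-k])
        s.append(x)
    return s

# two made-up errors as (X_j, Y_j) pairs
errors = [(0x08, 0x55), (0x74, 0xaa)]
e = len(errors)
n = 2*e

S = syndromes(errors, n)
lam = error_locator(errors)
generated = lfsr_generate(lam, S[:e], n)
print(generated == S)
```

Going the other way, from syndromes to the taps $\Lambda_k$, is the part Berlekamp-Massey solves.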
-We can in turn describe this LFSR as a [recurrence relation][recurrence-relation]
-like so:
+Such an LFSR can be described by a [recurrence relation][recurrence-relation]
+that probably looks a bit familiar:
Berlekamp-Massey relies on two key observations:
1. If an LFSR $L$ of size $|L|$ generates the sequence
- $s_0, s_1, \dots, s_{n-1}$, but failed to generate the sequence
- $s_0, s_1, \dots, s_{n-1}, s_n$, than an LFSR $L'$ that generates the
- sequence must have a size of at least:
+ $s_0, s_1, \dots, s_{n-1}$, but fails to generate the sequence
+ $s_0, s_1, \dots, s_{n-1}, s_n$, then an LFSR $L'$ that _does_
+ generate the sequence must have a size of at least:
@@ -439,28 +445,28 @@ Berlekamp-Massey relies on two key observations:
Multiplication is distributive, so we can move our summations around:
-
+
- Note the right summation looks a lot like $L$. If $L$ generates
+ And note that the right summation looks a lot like $L$. If $L$ generates
$s_{n-|L'|}, s_{n-|L'|+1}, \cdots, s_{n-1}$, we can replace it with
$s_{n-k'}$:
-
+
@@ -473,7 +479,7 @@ Berlekamp-Massey relies on two key observations:
>
- So if $L'$ generates $s_n$, $L$ also generates $s_n$.
+ So if $L'$ generates $s_n$, $L$ must also generate $s_n$.
The only way to make $L'$ generate a different $s_n$ would be to make
$|L'| \ge n+1-|L|$ so that $L$ can no longer generate
@@ -507,9 +513,9 @@ Berlekamp-Massey relies on two key observations:
>
- Now, if we have a larger LFSR $L'$ with size $|L'| \gt |L|$, and we
- want to change only the symbol $s'_n$ by $d'$, we can just add
- $d' C(i)$ to it:
+ Now, if we have a larger LFSR $L'$ with size $|L'| \gt |L|$ and we
+ want to change only the symbol $s'_n$ by $d'$, we can add $d' C(i)$
+ and only $s'_n$ will be affected: