We’ve created a lot of classes thus far, including PrivateKey
, S256Point
, and Signature
.
We now need to start thinking about how to transmit these objects to other computers on the network, or even to disk.
This is where serialization comes into play.
We want to communicate or store a S256Point
or a Signature
or a PrivateKey
.
Ideally, we want to do this efficiently, for reasons we’ll see in [chapter_networking].
We’ll start with the S256Point
class, which is the public key class.
Recall that the public key in elliptic curve cryptography is really a coordinate in the form of (x,y).
How can we serialize this data?
It turns out there’s already a standard for serializing ECDSA public keys, called Standards for Efficient Cryptography (SEC)—and as the word "Efficient" in the name suggests, it has minimal overhead. There are two forms of SEC format that we need to be concerned with: uncompressed and compressed. We’ll begin with the former, and look at the compressed format in the next section.
Here is how the uncompressed SEC format for a given point P = (x,y) is generated:
-
Start with the prefix byte, which is
0x04
. -
Next, append the x coordinate in 32 bytes as a big-endian integer.
-
Next, append the y coordinate in 32 bytes as a big-endian integer.
The uncompressed SEC format is shown in Uncompressed SEC format.
Note
|
Big- and Little-Endian
The motivation for big- and little-endian encodings is storing a number on disk. A number under 256 is easy enough to encode, as a single byte (28) is enough to hold it. When it’s bigger than 256, how do we serialize the number to bytes? Arabic numerals are read left to right. A number like 123 is 100 + 20 + 3 and not 1 + 20 + 300. This is what we call big-endian, because the "big end" starts first. Computers can sometimes be more efficient using the opposite order, or little-endian—that is, starting with the little end first. Since computers work in bytes, which have 8 bits, we have to think in base 256.
This means that a number like 500 looks like Unfortunately, some serializations in Bitcoin (like the SEC format x and y coordinates) are big-endian, while others (like the transaction version number in [chapter_tx_parsing]) are little-endian. This book will let you know which ones are big- versus little-endian. |
Creating the uncompressed SEC format serialization is pretty straightforward. The trickiest part is converting a 256-bit number into 32 bytes, big-endian. Here’s how this is done in code:
class S256Point(Point):
...
def sec(self):
'''returns the binary version of the SEC format'''
return b'\x04' + self.x.num.to_bytes(32, 'big') \
+ self.y.num.to_bytes(32, 'big') # (1)
-
In Python 3, you can convert a number to bytes using the
to_bytes
method. The first argument is how many bytes it should take up and the second argument is the endianness (see the preceding note).
Recall that for any x coordinate, there are at most two y coordinates due to the y2 term in the elliptic curve equation (The two possible values for y are where this vertical line intersects the curve).
It turns out that even over a finite field, we have the same symmetry.
This is because for any (x,y) that satisfies y2 = x3 + ax + b, (x,–y) also satisfies the equation. Furthermore, in a finite field, –y % p = (p – y) % p. Or, more accurately, if (x,y) satisfies the elliptic curve equation, (x,p – y) also satisfies the equation. These are the only two solutions for a given x, as shown, so if we know x, we know the y coordinate has to be either y or p – y.
Since p is a prime number greater than 2, we know that p is odd. Thus, if y is even, p – y (odd minus even) will be odd. If y is odd, p – y will be even. In other words, between y and p – y, exactly one will be even and one will be odd. This is something we can use to our advantage to shorten the uncompressed SEC format: we can provide the x coordinate and the evenness of the y coordinate. We call this the compressed SEC format because of how the y coordinate is compressed into a single byte (namely, whether it’s even or odd).
Here is the serialization of the compressed SEC format for a given point P = (x,y):
-
Start with the prefix byte. If y is even, it’s
0x02
; otherwise, it’s0x03
. -
Next, append the x coordinate in 32 bytes as a big-endian integer.
The compressed SEC format is shown in Compressed SEC format.
Again, the procedure is pretty straightforward.
We can update the sec
method to handle compressed SEC keys:
class S256Point(Point):
...
link:code-ch04/ecc.py[role=include]
The big advantage of the compressed SEC format is that it only takes up 33 bytes instead of 65 bytes. This is a big savings when amortized over millions of transactions.
At this point, you may be wondering how you can analytically calculate y given the x coordinate. This requires us to calculate a square root in a finite field.
Stated mathematically:
- Find w such that w2 = v when we know v.
It turns out that if the finite field prime p % 4 = 3, we can do this rather easily. Here’s how.
First, we know:
- p % 4 = 3
which implies:
- (p + 1) % 4 = 0
That is, (p + 1)/4 is an integer.
By definition:
- w2 = v
We are looking for a formula to calculate w. From Fermat’s little theorem:
- wp–1 % p = 1
which means:
- w2 = w2 ⋅ 1 = w2 ⋅ wp–1 = w(p+1)
Since p is odd (recall p is prime), we know we can divide (p+1) by 2 and still get an integer, implying:
- w = w(p+1)/2
Now we can use (p+1)/4 being an integer this way:
- w = w(p+1)/2 = w2(p+1)/4 = (w2)(p+1)/4 = v(p+1)/4
So our formula for finding the square root becomes:
- if w2 = v and p % 4 = 3, w = v(p+1)/4
It turns out that the p used in secp256k1 is such that p % 4 == 3, so we can use this formula:
- w2 = v
- w = v(p+1)/4
That will be one of the two possible w's; the other will be p – w. This is due to taking the square root means that both the positive and negative will work.
We can add this as a general method in the S256Field class:
class S256Field(FieldElement):
...
link:code-ch04/ecc.py[role=include]
When we get a serialized SEC pubkey, we can write a parse
method to figure out which y we need:
class S256Point:
...
link:code-ch04/ecc.py[role=include]
-
The uncompressed SEC format is pretty straightforward.
-
The evenness of the y coordinate is given in the first byte.
-
We take the square root of the right side of the elliptic curve equation to get y.
-
We determine evenness and return the correct point.
Another class that we need to learn to serialize is Signature
.
Much like the SEC format, it needs to encode two different numbers, r
and s
.
Unfortunately, unlike S256Point
, Signature
cannot be compressed as s
cannot be derived solely from r
.
The standard for serializing signatures (and lots of other things, for that matter) is called Distinguished Encoding Rules (DER) format. DER format was used by Satoshi to serialize signatures. This was most likely because the standard was already defined in 2008, was supported in the OpenSSL library (used in Bitcoin at the time), and was easy enough to adopt, rather than creating a new standard.
DER signature format is defined like this:
-
Start with the
0x30
byte. -
Encode the length of the rest of the signature (usually
0x44
or0x45
) and append. -
Append the marker byte,
0x02
. -
Encode
r
as a big-endian integer, but prepend it with the0x00
byte ifr’s first byte ≥ `0x80
. Prepend the resulting length tor
. Add this to the result. -
Append the marker byte,
0x02
. -
Encode
s
as a big-endian integer, but prepend with the0x00
byte ifs’s first byte ≥ `0x80
. Prepend the resulting length tos
. Add this to the result.
The rules for #4 and #6 with the first byte starting with something greater than or equal to 0x80
are because DER is a general encoding and allows for negative numbers to be encoded.
The first bit being 1 means that the number is negative.
All numbers in an ECDSA signature are positive, so we have to prepend with 0x00
if the first bit is 1, which is equivalent to first byte ≥ 0x80
.
The DER format is shown in DER format.
Because we know r
is a 256-bit integer, r
will be at most 32 bytes expressed as big-endian.
It’s also possible the first byte could be ≥ 0x80, so #4 can be at most 33 bytes.
However, if r
is a relatively small number, it could be less than 32 bytes.
The same goes for s
and #6.
Here’s how this is coded in Python:
class Signature:
...
link:code-ch04/ecc.py[role=include]
-
In Python 3, you can convert a list of numbers to the byte equivalents using
bytes([some_integer1, some_integer2])
.
Overall, this is an inefficient way to encode r
and s
as there are at least 6 bytes that aren’t strictly necessary.
In the early days of Bitcoin, bitcoins were assigned to public keys specified in SEC format (uncompressed) and then were redeemed using DER signatures. For reasons we’ll get to in [chapter_script], using this particular very simple script turned out to be both wasteful for storing unspent transaction outputs (UTXOs) and a little less secure than the scripts in more prominent use now. For now, we’ll go through what addresses are and how they are encoded.
In order for Alice to pay Bob, she has to know where to send the money. This is true not just in Bitcoin, but for any method of payment. Since Bitcoin is a digital bearer instrument, the address can be something like a public key in a public key cryptography scheme. Unfortunately, SEC format, especially uncompressed, is a bit long (65 or 33 bytes). Furthermore, the 65 or 33 bytes are in binary format—not something that’s easy to read, at least raw.
There are three major considerations. The first is that the public key be readable (easy to hand-write and not too difficult to mistake, say, over the phone). The second is that it’s short (not so long that it’s cumbersome). The third is that it’s secure (so it’s harder to make mistakes).
So how do we get readability, compression, and security? If we express the SEC format in hexadecimal (4 bits per character), it’s double the length (130 or 66 characters). Can we do better?
We can use something like Base64, which can express 6 bits per character. This results in 87 characters for uncompressed SEC and 44 characters for compressed SEC.
Unfortunately, Base64 is prone to mistakes, as a lot of letters and numbers look similar (0
and O
, l
and I
, -
and _
).
If we remove these characters, we can achieve a result that has good readability and decent compression (around 5.86 bits per character).
Lastly, we can add a checksum at the end to ensure that mistakes are easy to detect.
This construction is called Base58. Instead of hexadecimal (base 16) or Base64, we’re encoding numbers in Base58.
The actual mechanics of doing the Base58 encoding are as follows.
All numbers, uppercase letters, and lowercase letters are utilized, except for the aforementioned 0/O
and l/I
.
That leaves us with 10 + 26 + 26 – 4 = 58.
Each of these characters represents a digit in Base58.
We can encode with a function that does exactly this:
link:code-ch04/helper.py[role=include]
...
link:code-ch04/helper.py[role=include]
-
The purpose of this loop is to determine how many of the bytes at the front are 0 bytes. We want to add them back at the end.
-
This is the loop that figures out what Base58 digit to use.
-
Finally, we prepend all the zeros that we counted at the front, because otherwise they wouldn’t show up as prefixed ones. This annoyingly happens with pay-to-pubkey-hash (p2pkh); more on that in [chapter_script].
This function will take any bytes in Python 3 and convert them to Base58.
Note
|
Why Base58 Is on the Way Out
Base58 has been used for a long time, and while it does make it somewhat easier than something like Base64 to communicate, it’s not really that convenient. Most people prefer to copy and paste the addresses, and if you’ve ever tried to communicate a Base58 address vocally, you know it can be a nightmare. What’s much better is the new Bech32 standard, which is defined in BIP0173.
Bech32 uses a 32-character alphabet that’s just numbers and lowercase letters, except |
The 264 bits from compressed SEC format are still a bit too long, not to mention a bit less secure (see [chapter_script]). To both shorten the address and increase security, we can use the ripemd160 hash.
By not using the SEC format directly, we can go from 33 bytes to 20 bytes, shortening the address significantly. Here is how a Bitcoin address is created:
-
For mainnet addresses, start with the prefix
0x00
, for testnet0x6f
. -
Take the SEC format (compressed or uncompressed) and do a sha256 operation followed by the ripemd160 hash operation, the combination of which is called a hash160 operation.
-
Combine the prefix from #1 and resulting hash from #2.
-
Do a hash256 of the result from #3 and get the first 4 bytes.
-
Take the combination of #3 and #4 and encode it in Base58.
The result of step 4 of this process is called the checksum. We can do steps 4 and 5 in one go this way:
link:code-ch04/helper.py[role=include]
Note
|
What Is Testnet?
Testnet is a parallel Bitcoin network that’s meant to be used by developers. The coins on there are not worth anything and the proof-of-work required to find a block is relatively easy. The mainnet chain as of this writing has around 550,000 blocks, while testnet has significantly more (around 1,450,000 blocks). |
We can implement the hash160 operation in helper.py:
link:code-ch04/helper.py[role=include]
-
Note that
hashlib.sha256(s).digest
does the sha256 and the wrapper around it does the ripemd160.
We can also update S256Point
with hash160
and address
methods:
class S256Point:
...
link:code-ch04/ecc.py[role=include]
The private key in our case is a 256-bit number. Generally, we are not going to need to serialize our secret that often, as it doesn’t get broadcast (that would be a bad idea!). That said, there are instances where you may want to transfer your private key from one wallet to another—for example, from a paper wallet to a software wallet.
For this purpose, you can use Wallet Import Format (WIF). WIF is a serialization of the private key that’s meant to be human-readable. WIF uses the same Base58 encoding that addresses use.
Here is how the WIF format is created:
-
For mainnet private keys, start with the prefix
0x80
, for testnet0xef
. -
Encode the secret in 32-byte big-endian.
-
If the SEC format used for the public key address was compressed, add a suffix of
0x01
. -
Combine the prefix from #1, serialized secret from #2, and suffix from #3.
-
Do a hash256 of the result from #4 and get the first 4 bytes.
-
Take the combination of #4 and #5 and encode it in Base58.
We can now create the wif
method on the PrivateKey
class:
class PrivateKey
...
link:code-ch04/ecc.py[role=include]
It will be very useful to know how big- and little-endian are done in Python, as the next few chapters will be parsing and serializing numbers to and from big-/little-endian quite a bit. In particular, Satoshi used a lot of little-endian for Bitcoin and unfortunately, there’s no easy-to-learn rule for where little-endian is used and where big-endian is used. Recall that SEC format uses big-endian encoding, as do addresses and WIF. From [chapter_tx_parsing] onward, we will use little-endian encoding a lot more. For this reason, we turn to the next two exercises. The last exercise of this section is to create a testnet address for yourself.
Go to a testnet faucet and send some testnet coins to that address (it should start with m
or n
, or else something is wrong).
If you succeeded, congrats!
You’re now the proud owner of some testnet coins!