Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggest new encoding scheme #2686

Closed
antonlerner opened this issue Aug 17, 2021 · 7 comments
Closed

Suggest new encoding scheme #2686

antonlerner opened this issue Aug 17, 2021 · 7 comments

Comments

@antonlerner
Copy link
Contributor

Description

Currently, XDR codec scheme is implemented in a way that is very CPU and memory intensive, causing nodes to allocate a lot of memory and eventually crash.
We want to explore more efficient codecs .
Codecs must meet the following pre requisites:

  1. non ambiguity: every implementation of the spec will give the same encoding
  2. deterministic: same struct will be serialised the same way at all times
  3. cross platform support: since we want to allow implementation in many languages, codec library must be supported by different languages
  4. robust and "battle tested"

In addition to the "must haves" we want our chosen library to be CPU and memory efficient.

@dshulyak
Copy link
Contributor

dshulyak commented Sep 7, 2021

So far I have reviewed several options:

  1. ssz

It is a relatively simple non-self describing binary encoding, designed to replace RLP in ethereum2.
Fixed length items are encoded in the order of fields, with integers as a little endian.
Dynamic length items are written at the offset after all fixed-length items.

Example:

type Example struct {
    Field1 uint32
    Field2 [20]byte
    Field3 []byte // dynamic byte array
    Field4 [32]byte
}

Encoding is a concantenation from: little_endian(field1) || field2 || little_endian(offset(4+20+4+32)) || field4 || field3.
Offset is computed from fixed-length fields, offsets, and length for dynamic fields that precede other dynamic fields.

This is an optimization to support resource-restricted environments so that they can decode any field without allocating memory for the whole struct. The primary target is an EVM, so we may not need it. The pros of this optimization are questionable for go-spacemesh, as SVM (which may have potentially have used this optimization) declares its own encoding.

Besides, basic types (fixed/dynamic byte slices, uint8-64, bool, structs) ssz supports:

  • bitlist, bitvector (packing multiple boolean variables as bits efficiently)
  • union type

Another part of ssz specifies a particular merkleization scheme together with an encoding, so that every object in the protocol can be hashed accordingly.

There are libraries in many languages (go, rust, java, nim, python, js) that are actively supported because they are used in eth2 clients. I have used this one https://github.com/ferranbt/fastssz , it is efficient and well tested but doesn't implement the streaming interface and misses some minor things mostly related to style that we will need.

  1. deterministic cbor

It generally satisfies the properties that we are looking for. But as it is self-describing (includes tags in a manner similar to json) the length of the encoding is somewhat longer. So I decided not to pursue this option.

  1. @tal-m suggested to take a look at https://borsh.io/

It shares a lot of similarities with ssz (excluding merkleization which we won't use).

The most significant difference is that the offset is not used to encode dynamic length fields. So if we consider my example above, encoding with borsch will look like: little_endian(field1) || field2 || little_endian(len(field3)) || field3 || field4.
The resulting length of the encoding is the same, but doesn't support lookup without decoding the whole structure.

Besides that, there are a few minor differences:

  • in borsh map types are supported, must be lexicographically sorted. we don't use them in existing data structures and probably won't need too
  • ssz doesn't support float64/32. we don't use them
  • better encoding for null values (Option type in borsh, Union[null, Struct] in ssz)

Libraries are implemented by near org: https://github.com/orgs/near/repositories?q=borsh&type=&language=&sort=
In particular golang library uses reflection and won't be efficient, but we can write an efficient one easily as the encoding is straightforward.

@tal-m
Copy link
Contributor

tal-m commented Sep 9, 2021

Thanks for the summary! That's very helpful.

Ideally, I think we would be using the same serialization library for any kind of serializing we do in our code, so if there's a single library that can support both our crypto requirements and SVM use-cases (including potential future use-cases) that is an advantage. For example, support for maps might come in under that category...

Have you looked at Cap'n Proto? It seems that the latest versions define a Canonicalized encoding that would make it a viable option (unlike protobufs or flatbuffers, which explictly say they're not deterministic).

@dshulyak
Copy link
Contributor

dshulyak commented Sep 9, 2021

Ideally, I think we would be using the same serialization library for any kind of serializing we do in our code, so if there's a single library that can support both our crypto requirements and SVM use-cases (including potential future use-cases) that is an advantage. For example, support for maps might come in under that category...

I think that it would be good if we can use consistent encoding everywhere if we assume that multiple spacemesh clients may exist in the future. Based on available SMIPs SVM can use any of ssz or borsh, but according to @YaronWittenstein current SVM encoding is more compact. Maybe we can use it in go-spacemesh too.

Have you looked at Cap'n Proto? It seems that the latest versions define a Canonicalized encoding that would make it a viable option (unlike protobufs or flatbuffers, which explictly say they're not deterministic).

In the previous project we considered using it and decided not to use it:

  • it makes it much harder to work with types. instead of language based setters/getters programmer has to use API defined by the generated capnproto type. Pretty much all code needs to be refactored to work using new API
  • at least in golang capnproto doesn't seem to be zero-overhead based on this comparison. i don't know if it is an implementation thing or not. and in general zero overhead promise is negated if data is compressed for the wire.

So, it introduces additional complexity without obvious benefits.

For spacemesh use case, simple binary protocol seems to be a better choice (anything similar to ssz, borsh as they are both very straightforward) as it will be more error-proof, and needs to be concerned only with some security considerations from capnproto.

@dshulyak
Copy link
Contributor

After some thoughts, i am leaning towards borsh:

  • i am not completely satisfied with fastssz library, and writing new library will be more complex in comparison with borsh
  • ssz was designed specifically for eth2 and may evolve in the future in some unexpected way (due to merkleization requirements for example), borsh on the other hand seems to be simpler and finished spec.

i made a lib for golang https://github.com/dshulyak/borsh , it provides a streaming interface and doesn't rely on reflection. for now it defines support for types that we currently use in the go-sm codebase.

@dshulyak
Copy link
Contributor

dshulyak commented Oct 11, 2021

One more option is to stick with xdr, and rewrite the library using the code generation approach. I haven't looked into xdr details before, but it is actually not that different from borsh (xdr uses big endian instead of little endian for integers, but in general encoding schema is the same).

UPD: actually xdr seems to have more overhead in comparison with borsh based on spacemeshos/SMIPS#23

@nkryuchkov
Copy link
Contributor

nkryuchkov commented Dec 27, 2021

The XDR library consumes an excessive amount of memory on parsing of some data: #3014

@nkryuchkov
Copy link
Contributor

#2701 may be caused by a bug in the XDR library, I have no other ideas (needs to be confirmed)

@lrettig lrettig assigned dshulyak and unassigned lrettig Jan 12, 2022
@dshulyak dshulyak removed their assignment Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants