
bug: transaction information comes base64 encoded #1354

Closed
elboletaire opened this issue Jul 17, 2024 · 11 comments · Fixed by #1393
Labels: bug (Something isn't working)

elboletaire (Member) commented Jul 17, 2024

Describe the bug
The transaction information endpoint v2/chain/transactions/{blockHeight}/{transactionIndex} is returning most information base64 encoded.

Due to this base64 encoding, the explorer was trying to decode everything, which caused the error reported in vocdoni/explorer/issues/62

To Reproduce

curl -s https://api-stg.vocdoni.net/v2/chain/transactions/1176690/0 | jq

Current behavior
The returned JSON contains base64-encoded data where base16 (hex) is expected. Many values come back base64-encoded:

{
  "tx": {
    "setProcess": {
      "txtype": "SET_PROCESS_CENSUS",
      "nonce": 1,
      "processId": "sx3/YYFNq5DWw6m2XWyQgwSA5Lda0y50eUICAAAAAAA=",
      "censusRoot": "zUU9BcTLBCnuXuGu/tAW9VO4AmtM7VsMNSkFv6U8foE=",
      "censusURI": "ipfs://bafybeicyfukarcryrvy5oe37ligmxwf55sbfiojori4t25wencma4ymxfa",
      "censusSize": "1000"
    }
  },
  "txInfo": {
    "transactionNumber": 21778,
    "transactionHash": "5fe6780f87d24e990e653552f03861ba50d09857261299606ae0d272b171f843",
    "blockHeight": 1176690,
    "transactionIndex": 0,
    "transactionType": "setProcess"
  },
  "signature": "6ae51220aec68b503ff5dea3c2af6b375b8c6a41816c94c1bdae70617face7796a8fdbb78128279aa71bd892363e51883925595ee92f3977953a6bf44a4e8e671c"
}

Expected behavior
All byte fields inside tx (such as processId and censusRoot) should be returned base16-encoded (hex), consistent with the rest of the API.

p4u (Member) commented Aug 28, 2024

ping @mvdan

mvdan (Contributor) commented Aug 28, 2024

Sorry for the slowness. I looked into this recently but got stuck: these are Go types generated as part of our protobuf models, so the field types are hard-coded and we cannot substitute our own HexBytes type, which overrides the JSON marshaling from base64 to base16.

Copying the types to replace those field types would be a nightmare, since the tx type is a oneof, so we would need to copy over a dozen types.

Protobuf itself does not provide any way to configure how the JSON marshaling happens, because the spec says that bytes MUST be marshaled as base64 in JSON: https://protobuf.dev/programming-guides/proto3/#json

Trying to do some patching via encoding/json and any types is going to be super unreliable, because it's basically impossible to tell for sure if a string like "deadbeef" is meant to be base64 or not. So that's not an option.

The only realistic solution I can think of would be to use Go reflection to recurse the models.Tx value, find all []byte fields, encode as both JSON base64 and base16, and after doing the protojson marshal, do a search-and-replace of all base64 strings from before with their base16 counterparts. This would be very hacky, but I think it would work.
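A minimal, stdlib-only sketch of that idea, using encoding/json and a hypothetical SetProcess stand-in instead of protojson and the real models.Tx (the actual fix may differ):

```go
package main

import (
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// SetProcess is a stand-in for a protobuf-generated message such as
// the ones inside models.Tx; []byte fields marshal to base64 in JSON.
type SetProcess struct {
	ProcessId  []byte `json:"processId"`
	CensusRoot []byte `json:"censusRoot"`
	CensusURI  string `json:"censusURI"`
}

// collectBytes recurses v (pointers, structs, slices) and records
// every []byte field it finds.
func collectBytes(v reflect.Value, out *[][]byte) {
	switch v.Kind() {
	case reflect.Ptr, reflect.Interface:
		if !v.IsNil() {
			collectBytes(v.Elem(), out)
		}
	case reflect.Struct:
		for i := 0; i < v.NumField(); i++ {
			collectBytes(v.Field(i), out)
		}
	case reflect.Slice:
		if v.Type().Elem().Kind() == reflect.Uint8 {
			*out = append(*out, v.Bytes())
		} else {
			for i := 0; i < v.Len(); i++ {
				collectBytes(v.Index(i), out)
			}
		}
	}
}

// hexifyJSON marshals v (json.Marshal here; protojson in the real code),
// then search-and-replaces each base64-encoded []byte field with hex.
func hexifyJSON(v any) string {
	raw, err := json.Marshal(v)
	if err != nil {
		return ""
	}
	s := string(raw)
	var fields [][]byte
	collectBytes(reflect.ValueOf(v), &fields)
	for _, b := range fields {
		b64 := base64.StdEncoding.EncodeToString(b)
		s = strings.ReplaceAll(s, `"`+b64+`"`, `"`+hex.EncodeToString(b)+`"`)
	}
	return s
}

func main() {
	tx := &SetProcess{
		ProcessId:  []byte{0xb3, 0x1d, 0xff},
		CensusRoot: []byte{0xcd, 0x45, 0x3d},
		CensusURI:  "ipfs://bafy",
	}
	// {"processId":"b31dff","censusRoot":"cd453d","censusURI":"ipfs://bafy"}
	fmt.Println(hexifyJSON(tx))
}
```

As noted above this is hacky: a plain search-and-replace could in principle collide with an unrelated string that happens to equal one of the base64 values.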

The only alternative I can think of is to keep base64, and very clearly document that the tx field in this endpoint will contain the protojson encoding of the models protobuf types, and so the encoding will be slightly different than the rest of our API. For instance, with bytes being base64 rather than base16. This would save us from horrible hacks like the above in the backend, but I'm not sure if it would lead to other hacks in the frontend. I hope it would be easy enough for the frontend to remember that this endpoint uses base64 rather than base16.

In the meantime, here are two commits to clean up the code a bit and add a test: #1374

mvdan (Contributor) commented Aug 28, 2024

A third option would be for the REST endpoint to provide a JSON field containing the original protobuf-encoded bytes, in binary format encoded as base64. Then it would be up to the frontend to use the protobuf library to decode those bytes into the models.Tx type. I think this would be clean for both the backend and frontend, although it would make the REST endpoint rather ugly, as we would end up with the tx field having a relatively long blob of base64.

p4u (Member) commented Sep 3, 2024

ping @altergui

mvdan (Contributor) commented Sep 11, 2024

Let me know what the resolution here is, I'm happy to assist on the Go side if any changes are needed.

altergui (Contributor) commented:

> The only alternative I can think of is to keep base64, and very clearly document that the tx field in this endpoint will contain the protojson encoding of the models protobuf types, and so the encoding will be slightly different than the rest of our API. For instance, with bytes being base64 rather than base16. This would save us from horrible hacks like the above in the backend, but I'm not sure if it would lead to other hacks in the frontend. I hope it would be easy enough for the frontend to remember that this endpoint uses base64 rather than base16.

that's not an option because it's basically what we've been doing until now. the problem is that not ALL fields are base64-encoded; some are simply integers (like 1000) that can nonetheless be parsed as base64, but shouldn't be. so it's not enough to say the tx field is base64-encoded; we'd have to specify exactly which fields are (processId and censusRoot in this case), for EACH transaction type. doable, but cumbersome and hard to maintain.

altergui (Contributor) commented:

> The only realistic solution I can think of would be to use Go reflection to recurse the models.Tx value, find all []byte fields, encode as both JSON base64 and base16, and after doing the protojson marshal, do a search-and-replace of all base64 strings from before with their base16 counterparts. This would be very hacky, but I think it would work.

that's clever! i tried using reflection to modify the types of the original object (before passing it to protojson marshal), but that didn't go well. i didn't think about leaving the object untouched, love it.

mvdan (Contributor) commented Sep 12, 2024

Then I think the best remaining option is to do some reflect magic on the Go side to find all []byte values and re-encode them from base64 to hex.

> that's not an option because it's basically what we've been doing until now, the problem is that not ALL fields are base64 encoded, some are simply integers (like 1000) that nonetheless can be parsed as base64, but they shouldn't. so it's not simply saying tx field has base64 encoding, but exactly specifying which fields are (processID and censusRoot in this case), for EACH transaction type. doable, but cumbersome and hard to maintain.

To be clear, that is a problem with either base64 or hex encoding. {"foo": "123"} can be a string, or base64-encoded bytes, or hex-encoded bytes, and any consumer of the JSON cannot tell them apart unless they know which fields are meant to be bytes. So I'm actually not sure why switching from base64 to hex helps here, aside from consistency with the other JSON endpoints.

p4u (Member) commented Sep 16, 2024

ping @altergui @selankon

altergui (Contributor) commented Sep 16, 2024

> So I'm actually not sure why switching from base64 to hex helps here, aside from consistency with the other JSON endpoints.

precisely because of that consistency. yes, please do the reflect magic 🙏

fwiw, i agree "123" can be a string or hex-encoded bytes; you're correct. but the explorer doesn't care: hex-encoded bytes are useful, since they can be treated as a string. the problem is with base64, which is useless unless converted to base16, and that conversion is problematic.

mvdan (Contributor) commented Sep 16, 2024

Done: #1393

4 participants