Signed dictionary #37

Open
jheretic opened this issue Jan 3, 2024 · 3 comments

Comments


jheretic commented Jan 3, 2024

I'm very interested in using bita for a software update project I'm working on, but it would be extremely useful to be able to verify the release image with a cryptographic signature (probably PGP). My understanding is that simply signing the archive file wouldn't be very useful, because you'd need to download the entire archive in order to verify the signature, defeating the point of the incremental downloads.

However, because the dictionary contains cryptographic hashes of all the chunks, I believe it would be sufficient to simply provide a signature to authenticate the dictionary, which would then validate the integrity of all the associated chunks. I would propose using the Sequoia library to generate a <archive filename>.sig file at compression time that contains a detached signature of the dictionary, and having bita/bitar download and use that signature, if present, when fetching the archive in order to authenticate the dictionary.

Does that sound like a reasonable approach?

Owner

oll3 commented Jan 3, 2024

That's an interesting idea, and I too have thought about signing archives from time to time. And yes, I think signing the dictionary (or the full header, or the header checksum) should be a sufficient, and efficient, way of signing an archive.

Using a separate signature file would be one way to do it. My main objection to this approach is that it might give the impression that the signature file covers the full archive and could be validated with some external tool that does not understand the archive format, when it can't. This might just be me though.

Another approach would be to embed the signature in the archive (probably appending it after the header checksum).
I think this would be preferable to me. For one, there are no signature files to keep around. I also think this might be easier to integrate with the bitar library. In the name of silly optimizations, it would potentially result in one less HTTP request. And it should be possible to do in a backwards compatible way.

What do you think, are there any obvious drawbacks with the "embed" approach compared to the separate file approach, as you see it?

> I would propose using the Sequoia

sequoia-openpgp would be the lib to use? I haven't used it before, but I guess it would do the job. It's possibly a bit disturbing that it depends on OpenSSL and has quite a heavy list of other dependencies too.

I don't have any alternative to propose atm, but I think I would prefer a lighter lib without external C libraries, mainly since small is nice and having no external C libs makes building and deploying on different targets simpler. Do you have any alternatives at hand?

Other than that I would like to see this feature added and hope we can find a good way forward!

Contributor

caesay commented Nov 2, 2024

I would also benefit from this feature. Happy to take a stab at a PR? Agree that the signature should be embedded, it is very commonly done that way and means the signature can't be separated or removed as easily.

Given this header structure:

//! Archive header structure.
//!
//! | Offset | Size | Description                                                         |
//! |--------|------|---------------------------------------------------------------------|
//! |      0 |    6 | Archive file magic (BITA1\0).                                       |
//! |      6 |    8 | Dictionary size (u64 le).                                           |
//! |     14 |    n | Protobuf encoded dictionary.                                        |
//! |      n |    8 | Chunk data offset in archive, absolute from archive start (u64 le). |
//! |  n + 8 |   64 | Full header checksum (blake2), from offset 0 to n + 8.              |

I think signing the header contents totally (including the checksum) and then putting the signature after the checksum should suffice? If I'm reading this correctly, the "chunk data offset" points to the chunk data. Therefore, older versions of bita can just skip/ignore this new section. Is that right?
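To make the offsets concrete, here is a minimal sketch of how a reader might locate the end of the header and any extra bytes before the chunk data. It assumes that `n` in the table is the dictionary size, so the chunk data offset field sits at `14 + n`; the names `HeaderLayout`, `header_layout` and `extra_section` are hypothetical and not part of bitar. It does not verify the blake2 checksum.

```rust
/// Offsets derived from the header (hypothetical helper type, not bitar's API).
#[derive(Debug, PartialEq)]
pub struct HeaderLayout {
    pub dictionary_size: u64,
    pub chunk_data_offset: u64,
    /// First byte after the 64-byte header checksum.
    pub header_end: u64,
}

const MAGIC: &[u8; 6] = b"BITA1\0";
const CHECKSUM_LEN: usize = 64;

/// Read a little-endian u64 at `at`, or None if out of bounds.
fn read_u64_le(bytes: &[u8], at: usize) -> Option<u64> {
    let slice = bytes.get(at..at.checked_add(8)?)?;
    let mut buf = [0u8; 8];
    buf.copy_from_slice(slice);
    Some(u64::from_le_bytes(buf))
}

/// Walk the fixed fields of the header. Assumes the chunk data offset
/// field starts right after the dictionary, at offset 14 + n.
pub fn header_layout(archive: &[u8]) -> Option<HeaderLayout> {
    if archive.get(..6)? != &MAGIC[..] {
        return None;
    }
    let dictionary_size = read_u64_le(archive, 6)?;
    if dictionary_size > archive.len() as u64 {
        return None;
    }
    let offset_pos = 14 + dictionary_size as usize;
    let chunk_data_offset = read_u64_le(archive, offset_pos)?;
    let header_end = offset_pos + 8 + CHECKSUM_LEN;
    if archive.len() < header_end {
        return None;
    }
    Some(HeaderLayout {
        dictionary_size,
        chunk_data_offset,
        header_end: header_end as u64,
    })
}

/// Bytes between the header checksum and the chunk data. A reader that
/// only follows `chunk_data_offset` never looks at this region.
pub fn extra_section<'a>(archive: &'a [u8], layout: &HeaderLayout) -> Option<&'a [u8]> {
    if layout.chunk_data_offset < layout.header_end {
        return None;
    }
    let start = layout.header_end as usize;
    let end = layout.chunk_data_offset as usize;
    archive.get(start..end)
}
```

A signature placed in the region returned by `extra_section` would be invisible to a reader that jumps straight to `chunk_data_offset`, which is the backwards-compatibility property being asked about.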

I am not a cryptographic expert, but I have heard that Ed25519 is significantly faster and lighter than PGP, RSA, etc while also being much more secure at smaller key sizes. Additionally, there are a number of small, pure-rust implementations of Ed25519 which would limit dependency bloat.

Do you agree with the approach? If so, I can start on this soon.

Owner

oll3 commented Nov 2, 2024

> I think signing the header contents totally (including the checksum) and then putting the signature after the checksum should suffice? If I'm reading this correctly, the "chunk data offset" points to the chunk data. Therefore, older versions of bita can just skip/ignore this new section. Is that right?

Yes, this is how I think it should be done too. And yes, this would allow for an older version of bita to clone a signed archive ignoring the signature.

I haven't thought this through fully, but since there can be space between the dictionary and the chunk data even when the archive is not signed, we might need some marker so that bita can tell whether a signature is present or not. Or maybe a flag in the dictionary is enough to give that information. Preferably, flags or an 'extra header size' should've been outside the dictionary in the early header, but this is obviously not possible with the layout I created at the time. Oh well.

Anyway, I'm thinking a signature layout could look something like:
| extra header type/magic (e.g. SIG\0) | signature algorithm (e.g. Ed25519\0) | signature length (u32 le) | signature bytes |
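As a rough illustration of that layout, the sketch below encodes and parses such a section using only the standard library. The magic `SIG\0`, the NUL-terminated algorithm name and the u32 little-endian length follow the proposal above; the function and type names are hypothetical, and nothing here performs actual signature verification.

```rust
const SIG_MAGIC: &[u8; 4] = b"SIG\0";

/// Parsed form of the proposed section (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct SignatureSection {
    pub algorithm: String,
    pub signature: Vec<u8>,
}

/// Serialize as: magic | algorithm + NUL | length (u32 le) | signature bytes.
/// Returns None if the algorithm name contains a NUL byte or the
/// signature does not fit in a u32 length.
pub fn encode_signature_section(algorithm: &str, signature: &[u8]) -> Option<Vec<u8>> {
    if algorithm.as_bytes().contains(&0) || signature.len() > u32::MAX as usize {
        return None;
    }
    let mut out = Vec::new();
    out.extend_from_slice(SIG_MAGIC);
    out.extend_from_slice(algorithm.as_bytes());
    out.push(0);
    out.extend_from_slice(&(signature.len() as u32).to_le_bytes());
    out.extend_from_slice(signature);
    Some(out)
}

/// Parse a section from the start of `bytes`. Returns None if the magic
/// is missing (e.g. an unsigned archive with plain padding) or the data
/// is truncated. Trailing bytes after the signature are ignored.
pub fn parse_signature_section(bytes: &[u8]) -> Option<SignatureSection> {
    if bytes.get(..4)? != &SIG_MAGIC[..] {
        return None;
    }
    let rest = &bytes[4..];
    // The algorithm name runs up to the first NUL byte.
    let nul = rest.iter().position(|&b| b == 0)?;
    let algorithm = String::from_utf8(rest[..nul].to_vec()).ok()?;
    let rest = &rest[nul + 1..];
    let mut len_buf = [0u8; 4];
    len_buf.copy_from_slice(rest.get(..4)?);
    let len = u32::from_le_bytes(len_buf) as usize;
    let end = 4usize.checked_add(len)?;
    let signature = rest.get(4..end)?.to_vec();
    Some(SignatureSection { algorithm, signature })
}
```

With a 64-byte Ed25519 signature, the encoded section is 4 + 8 + 4 + 64 = 80 bytes. Checking for the magic first is one way to distinguish a signed archive from one that merely has unused space before the chunk data; a flag in the dictionary, as mentioned above, would be an alternative.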

But any input on this is appreciated!

> I am not a cryptographic expert, but I have heard that Ed25519 is significantly faster and lighter than PGP, RSA, etc while also being much more secure at smaller key sizes. Additionally, there are a number of small, pure-rust implementations of Ed25519 which would limit dependency bloat.
>
> Do you agree with the approach? If so, I can start on this soon.

I agree and I think Ed25519 seems like a good option.
