Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automerge: Keep message size below 4096 bytes to not exceed gossip broadcast limit #11

Open
adzialocha opened this issue Dec 16, 2024 · 5 comments
Labels
crdt Text-CRDT handling, data types, etc.

Comments

@adzialocha
Copy link
Member

adzialocha commented Dec 16, 2024

Automerge gives us fairly large payloads which break the default gossip message size limit by iroh-gossip (4092 bytes) even when using safe_incremental.

This especially happens if another peer edits a document for the first time which already contains some text, even if it's just adding one single character. It's also surprising that the 4kb limit is reached even if the text itself is less than ca. 100 characters long.

I believe we can fix this by understanding better how Automerge's "diffs" are made and possibily using more low-level methods of their library to really only send the actual changed text.

@adzialocha adzialocha added the bug Something isn't working label Dec 16, 2024
@adzialocha
Copy link
Member Author

adzialocha commented Dec 17, 2024

I've bumped the limit to some arbitrary number for now to work around this issue until we looked into Automerge: 69a0d98

@swick
Copy link
Collaborator

swick commented Dec 17, 2024

Can't you chop up the message internally somehow? Requiring us to send messages with a certain size to the "networking" part of the app seems a bit strange.

@adzialocha
Copy link
Member Author

adzialocha commented Dec 17, 2024

Can't you chop up the message internally somehow? Requiring us to send messages with a certain size to the "networking" part of the app seems a bit strange.

Yes, that makes sense if the payload is simply just large (like, very large text documents), but 4kb data should already be a lot of text and that it's caused by 100 characters smells more like a bug / wrong use of the Automerge library. So before we dig into optimizations like breaking the network messages apart I'd make sure we're using Automerge "correctly" first.

@adzialocha
Copy link
Member Author

adzialocha commented Dec 17, 2024

My theory is that automerge can't distinct between authors when calculating the diffs. It takes all the edits of the other peer and adds them into the one of the newly arriving peer, merging the whole edit history of peer A to the small edit (adding one character) of peer B and sending this as a bulk back to peer A.

@adzialocha
Copy link
Member Author

EtherSync applies their own diffing algorithm to identify which parts of the text have really been changed, I assume that's required as Automerge can't figure it out itself? https://github.com/ethersync/ethersync/blob/main/daemon%2Fsrc%2Fdocument.rs#L136-L155

@adzialocha adzialocha added crdt Text-CRDT handling, data types, etc. and removed bug Something isn't working labels Jan 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crdt Text-CRDT handling, data types, etc.
Projects
None yet
Development

No branches or pull requests

2 participants