-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
dekaf: Implement DeletionMode
to allow representing deletions as a Kafka header instead of a tombstone
#1738
Conversation
1ab8273
to
cb3fd03
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
e49688a
to
cb3fd03
Compare
…Kafka header instead of a tombstone
972aecd
to
b391c01
Compare
…_deleted` for deletions Also needed to add this field to the generated Avro schemas
b391c01
to
e337f39
Compare
crates/dekaf/src/read.rs
Outdated
if matches!(self.deletes, DeletionMode::CDC) { | ||
let mut heap_node = HeapNode::from_node(root.get(), &self.alloc); | ||
let foo = DELETION_INDICATOR_PTR | ||
.create_heap_node(&mut heap_node, &self.alloc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a memory leak here -- you're never resetting the bump allocator.
I would recommend not having it as a field of self
and instead creating one on the stack. Use it, and then call reset() on it immediately after each avro::encode so that it's only retaining a single HeapNode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to ask you to take a look at this as I wasn't sure I was doing it correctly. That makes sense to me, thanks 👍
This adds support for customizable deletion handling:
DeleteMode::Default
, which is what you get if you don't specify a deletion mode, operates the same was as before where deletes are mapped into Kafka records with a key, and null value.DeleteMode::Header
changes this behavior to include an_is_deleted
header on every returned Record. Non-deletes get0
, deletes get1
. In addition, the full deletion tombstone stored in the Flow collection is returned for the Record's value instead of null.I tested this out in Tinybird and it looks like this works as expected: setting a username of
{"deletions": "header"}
causes Flow deletion tombstones (op: d
) to cause the corresponding row to get deleted in Tinybird.Question: Should we just go all the way and name this
DeleteMode::Tinybird
? Or does someone have a better suggestion thanHeader
?Also included is a small change to
Read
to make data preview UIs work reliably again. I realized they stopped working consistently after merging #1733, and this fixes that.This change is