kgo: return decompression errors while consuming
Kafka can return partial batches, so decompression errors are common: if I ask for at most 100 bytes and the broker has two 60 byte batches, I receive one valid 60 byte batch and then a partial 40 byte batch. The second, partial batch fails to decompress (see the truncated-gzip sketch below). This is why I previously never returned decompression errors.

However, if a client truly does produce somewhat-valid compressed data that *some* decompressors can process but *others* (Go's) cannot, then the *first* batch received could fail to decompress. The client would fail processing, return an empty batch, and retry consuming at the same offset. The client would spin-loop and the end user would never be aware. Now, if the first error received is a decompression error, we bubble it up to the end user (see the consumer sketch below).

This is hard to test internally, so this was manually hack-tested.

Scenario one:

* I changed the code to ignore crc errors, since those just got in the way.
* I ran a local kfake where the first five bytes of a RecordBatch.Records was overwritten with "aaaaa".
* I consumed _before_ this patch -- the client spin-looped, never progressing and never printing anything.
* I consumed _after_ this patch -- the client immediately received the error.

Scenario two:

* Same crc ignoring.
* I ran a local kfake where, when consuming, all batches AFTER the first had their RecordBatch.Records overwritten with "aaaaa".
* I consumed before and after this patch -- in both cases, the client progressed to the end of the partition and no errors were printed.
* To double-verify that the decompression error was being encountered, I added a println in kgo where the decompression error is generated -- the println was always hit.

Closes #854.
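To make the partial-batch failure mode concrete, here is a minimal standalone sketch (not from this commit) that uses Go's stdlib gzip as a stand-in for a Kafka batch codec: decompressing a stream that was cut off mid-batch fails with `unexpected EOF`, while the full stream decompresses cleanly.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

func main() {
	// Compress a payload standing in for one batch's Records.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(bytes.Repeat([]byte("record payload "), 64))
	zw.Close()
	full := buf.Bytes()

	cases := []struct {
		name string
		data []byte
	}{
		{"full batch", full},
		{"partial batch", full[:len(full)/2]}, // cut off mid-stream, like a partial fetch
	}
	for _, c := range cases {
		zr, err := gzip.NewReader(bytes.NewReader(c.data))
		if err == nil {
			_, err = io.ReadAll(zr)
		}
		fmt.Printf("%s: err=%v\n", c.name, err)
		// full batch: err=<nil>
		// partial batch: err=unexpected EOF
	}
}
```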
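And a sketch of how the bubbled-up error now surfaces to an end user. The broker address and topic name are placeholders for illustration; the point is that a first-batch decompression failure shows up in `fetches.Errors()` instead of the client silently retrying the same offset forever.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	cl, err := kgo.NewClient(
		kgo.SeedBrokers("localhost:9092"), // placeholder broker
		kgo.ConsumeTopics("corrupt-topic"), // placeholder topic
	)
	if err != nil {
		log.Fatal(err)
	}
	defer cl.Close()

	for {
		fetches := cl.PollFetches(context.Background())
		// After this patch, a decompression error on the first batch
		// is returned here rather than causing a silent spin-loop.
		for _, fe := range fetches.Errors() {
			fmt.Printf("fetch error on %s[%d]: %v\n", fe.Topic, fe.Partition, fe.Err)
			return
		}
		fetches.EachRecord(func(r *kgo.Record) {
			fmt.Printf("consumed offset %d\n", r.Offset)
		})
	}
}
```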