Generation of Snapshot Summaries #724

Open
Fokko opened this issue Nov 27, 2024 · 2 comments
Fokko (Contributor) commented Nov 27, 2024

With each snapshot comes a summary map, optional in V1, required in V2 and later:

image

The summary records what kinds of files the snapshot contains (data or delete files) and how the row and byte counts changed. The best way to replicate this metrics collection is to follow the Java implementation: https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/SnapshotSummary.java

This class works closely alongside the SnapshotProducer and tracks what happens with the snapshot.
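
To make the idea concrete, here is a minimal sketch of such a collector. It is not the Java implementation and not a proposed API: `FileInfo` and `SummaryCollector` are hypothetical names, the sketch is in Python purely for brevity, and the property keys follow my reading of `SnapshotSummary.java`, so they should be verified against that class. It also covers only a handful of the counters (no totals, no per-partition metrics, no position or equality delete counts).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for a data or delete file tracked by a snapshot."""
    is_delete_file: bool
    record_count: int
    size_in_bytes: int


class SummaryCollector:
    """Accumulates counters while a snapshot is produced (illustrative only)."""

    def __init__(self) -> None:
        self._counts = {}

    def _bump(self, key: str, amount: int) -> None:
        # Only emit a key once it has a non-zero value.
        if amount:
            self._counts[key] = self._counts.get(key, 0) + amount

    def add_file(self, f: FileInfo) -> None:
        if f.is_delete_file:
            self._bump("added-delete-files", 1)
        else:
            self._bump("added-data-files", 1)
            self._bump("added-records", f.record_count)
        self._bump("added-files-size", f.size_in_bytes)

    def remove_file(self, f: FileInfo) -> None:
        if f.is_delete_file:
            self._bump("removed-delete-files", 1)
        else:
            self._bump("deleted-data-files", 1)
            self._bump("deleted-records", f.record_count)
        self._bump("removed-files-size", f.size_in_bytes)

    def build(self, operation: str) -> dict:
        # The summary is a string-to-string map, so counters are stringified.
        summary = {"operation": operation}
        summary.update({k: str(v) for k, v in sorted(self._counts.items())})
        return summary


collector = SummaryCollector()
collector.add_file(FileInfo(is_delete_file=False, record_count=100, size_in_bytes=2048))
collector.remove_file(FileInfo(is_delete_file=False, record_count=40, size_in_bytes=1024))
print(collector.build("overwrite"))
```

In this sketch the snapshot producer would call `add_file` and `remove_file` as it stages changes, then call `build` once when the snapshot is committed.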

Fokko mentioned this issue Nov 27, 2024 · 28 tasks

barronw (Contributor) commented Nov 29, 2024

Can I pick this up?

c-thiel (Collaborator) commented Nov 30, 2024

@barronw gladly! Assigned the issue to you :)
