Parquet V3 predicate pushdown to S3/Azure object storage? #3047
Replies: 2 comments
-
So we do implement predicate pushdown, but it is unrelated to our backend storage and works in all versions of Parquet. The engine makes a fetch request here: Line 37 in 0542bc3. Each element of the AST is responsible for building its own conditions. Here is where the binary operator creates conditions and pushes them down to the fetch layer: tempo/pkg/traceql/ast_conditions.go Line 34 in 0542bc3. In the fetch layer, if an operation is present, we build an iterator that asserts the operation as close to the data as possible. If no operation is present, we return the data all the way back to the engine and let it decide how to handle the situation. tempo/tempodb/encoding/vparquet3/block_traceql.go Line 1478 in 0542bc3
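A minimal sketch of that flow, with hypothetical types and names (simplified, not Tempo's actual `traceql` API): an AST node extracts the conditions it can push down, and the fetch layer applies them while iterating, so non-matching rows never travel back to the engine.

```go
package main

import "fmt"

// Condition is a predicate the fetch layer can evaluate close to
// the data (hypothetical, loosely modeled on traceql conditions).
type Condition struct {
	Attribute string
	Op        string // e.g. "=", ">"
	Operand   int
}

// Matches evaluates the condition against a single value.
func (c Condition) Matches(v int) bool {
	switch c.Op {
	case "=":
		return v == c.Operand
	case ">":
		return v > c.Operand
	}
	return false
}

// BinaryOp is a simplified AST node. extractConditions mirrors the
// idea in ast_conditions.go: each node contributes the conditions
// it knows how to push down.
type BinaryOp struct {
	Attribute string
	Op        string
	Operand   int
}

func (b BinaryOp) extractConditions() []Condition {
	return []Condition{{Attribute: b.Attribute, Op: b.Op, Operand: b.Operand}}
}

// fetch applies pushed-down conditions while scanning column values,
// standing in for the iterator built in the fetch layer.
func fetch(columns map[string][]int, conds []Condition) []int {
	var out []int
	for _, c := range conds {
		for _, v := range columns[c.Attribute] {
			if c.Matches(v) {
				out = append(out, v)
			}
		}
	}
	return out
}

func main() {
	columns := map[string][]int{"duration": {10, 250, 40, 900}}
	conds := BinaryOp{Attribute: "duration", Op: ">", Operand: 100}.extractConditions()
	fmt.Println(fetch(columns, conds)) // [250 900]
}
```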
-
Thank you for the insight. I actually wanted to link to an article about it. Searching further, it may not necessarily always be better: a similar issue opened in ClickHouse more than 2 years ago drew no interest. You can close the discussion.
-
Both S3 and Azure Blob Storage support features like
Predicate pushdown
and Projection pushdown,
which drastically reduce the amount of data read from Parquet files: http://peter-hoffmann.com/2020/understand-predicate-pushdown-on-rowgroup-level-in-parquet-with-pyarrow-and-python.html. Are these predicate pushdowns considered? How hard would it be to implement in Tempo?
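For context, the row-group-level pushdown the linked article describes can be sketched as follows (hypothetical types, not pyarrow or Tempo code): Parquet stores min/max statistics per row group, so a reader can skip entire groups whose statistics prove no value can match, which is what cuts down the bytes fetched from object storage.

```go
package main

import "fmt"

// RowGroup mimics Parquet row-group metadata: each group carries
// min/max statistics for one column (hypothetical, simplified).
type RowGroup struct {
	Min, Max int
	Values   []int
}

// scanGreaterThan evaluates "value > threshold" with row-group
// pruning: groups whose Max cannot exceed the threshold are skipped
// without reading their values. It returns the matches and the
// number of groups actually read.
func scanGreaterThan(groups []RowGroup, threshold int) (matches []int, read int) {
	for _, g := range groups {
		if g.Max <= threshold { // statistics rule this group out
			continue
		}
		read++
		for _, v := range g.Values {
			if v > threshold {
				matches = append(matches, v)
			}
		}
	}
	return matches, read
}

func main() {
	groups := []RowGroup{
		{Min: 1, Max: 50, Values: []int{1, 20, 50}},
		{Min: 60, Max: 200, Values: []int{60, 150, 200}},
		{Min: 5, Max: 90, Values: []int{5, 90}},
	}
	m, read := scanGreaterThan(groups, 100)
	fmt.Println(m, read) // [150 200] 1
}
```

Against object storage, skipping a group means one fewer byte-range request, which is why the savings matter for S3/Azure in particular.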