I am trying to optimize a query and noticed that the HTTP stats in EXPLAIN ANALYZE output seem to be off. I query a single 10.79 GiB Parquet file, yet the HTTP stats report 32.7 GiB read. I am wondering whether http_state_policy.cpp could be over-counting total_bytes_received, in particular by including the content-length HTTP header values of HEAD requests, which carry no response body.
SET azure_transport_option_type = curl;
SET azure_http_stats = true;
SET threads = 1;
SET azure_read_transfer_concurrency = 1;
SET azure_read_transfer_chunk_size = 1024*1024;
SET azure_read_buffer_size = 1024*1024;
EXPLAIN ANALYZE SELECT col1 FROM 'az://<snip>.blob.core.windows.net/<snip>.parquet' LIMIT 1;
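In the Azure SDK logs I see 3 HEAD requests with content-length : 11583653237 and 349 GET requests with content-length : 1048576. The GET responses add up to about 0.34 GiB, while the three HEAD content-length values alone add up to about 32.4 GiB, which together match the reported figure. So the total input data should be around 0.34 GiB instead of 32.7 GiB.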
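For reference, the arithmetic behind these figures can be checked with a short script. This is only a sketch of the two counting policies (GET bodies only versus GET bodies plus HEAD content-length headers), using the request counts and sizes from the logs above. The function name and the `include_head` flag are made up for illustration and do not reflect how http_state_policy.cpp is actually written.

```python
GIB = 1024 ** 3

# Figures taken from the Azure SDK logs quoted in this issue.
HEAD_COUNT, HEAD_CONTENT_LENGTH = 3, 11_583_653_237
GET_COUNT, GET_CONTENT_LENGTH = 349, 1_048_576


def total_gib(include_head: bool) -> float:
    """Sum content-length values in GiB (hypothetical counting policy)."""
    total = GET_COUNT * GET_CONTENT_LENGTH
    if include_head:
        # HEAD responses have no body; content-length only describes the blob size.
        total += HEAD_COUNT * HEAD_CONTENT_LENGTH
    return total / GIB


print(f"GET bodies only:            {total_gib(False):.2f} GiB")
print(f"GET bodies + HEAD headers:  {total_gib(True):.2f} GiB")
```

Counting only the GET bodies gives roughly 0.34 GiB, while adding the HEAD content-length values gives roughly 32.7 GiB, which is consistent with the number shown in the HTTP stats.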
If this analysis is correct, I can send a small PR to fix it.