You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I noticed some unexpected behavior when using globbing on an ADLSGen2 (Azure Data Lake Storage Gen2; Azure Storage Account).
ADLSGen2 supports blob and dfs endpoint.
According to documentation, globbing should be supported for both.
However, I noticed some queries that work for blob but not for the dfs endpoint:
endpoint
executable?
statement
blob (az)
OK
SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/foo')
blob (az)
OK
SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/*.json')
blob (az)
OK
SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/*')
blob (az)
OK
SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/**')
blob (az)
OK
SELECT * FROM glob('az://some_folder/DAY=2024-09-20/**')
dfs (abfss)
OK
SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/foo.json')
dfs (abfss)
ERROR
SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/*.json')
dfs (abfss)
ERROR
SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/*')
dfs (abfss)
ERROR
SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/**')
dfs (abfss)
OK
SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/**')
ERROR: SQL Error: java.sql.SQLException: Invalid Error: 404 The specified path does not exist. The specified path does not exist.
The text was updated successfully, but these errors were encountered:
After starting duckdb from the commandline and setting up secrets, I get the following errors when using abfss, but not when using az protocol:
D SELECT * FROM 'abfss://<container>/orders/**/*.parquet' limit 10;
Not implemented Error: abfss do not manage recursive lookup patterns, daily-vessel-consumption/**/*.parquet is therefor illegal, only pattern ending by ** are allowed.
But if I try that, I get:
SELECT * FROM 'abfss://<container>/orders/**' limit 10;
Catalog Error: Table with name abfss://<container>/orders/** does not exist!
Did you mean "pg_tablespace"?
LINE 1: SELECT * FROM 'abfss://<container>/orders/...
I should be running the latest release version and am running on ubuntu linux.
$ duckdb --version
v1.1.2 f680b7d08f
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04 LTS
Release: 20.04
Codename: focal
I noticed some unexpected behavior when using globbing on an ADLSGen2 (Azure Data Lake Storage Gen2; Azure Storage Account).
ADLSGen2 supports blob and dfs endpoint.
According to documentation, globbing should be supported for both.
However, I noticed some queries that work for
blob
but not for thedfs
endpoint:az
)SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/foo')
az
)SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/*.json')
az
)SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/*')
az
)SELECT * FROM glob('az://some_folder/DAY=2024-09-20/HOUR=9/**')
az
)SELECT * FROM glob('az://some_folder/DAY=2024-09-20/**')
abfss
)SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/foo.json')
abfss
)SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/*.json')
abfss
)SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/*')
abfss
)SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/HOUR=9/**')
abfss
)SELECT * FROM glob('abfss://some_folder/DAY=2024-09-20/**')
ERROR:
SQL Error: java.sql.SQLException: Invalid Error: 404 The specified path does not exist. The specified path does not exist.
The text was updated successfully, but these errors were encountered: