Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SAS URLs for authentication? #76

Closed
kmatt opened this issue Sep 10, 2024 · 6 comments
Closed

SAS URLs for authentication? #76

kmatt opened this issue Sep 10, 2024 · 6 comments

Comments

@kmatt
Copy link

kmatt commented Sep 10, 2024

DuckDB v1.1.0 CLI

I am able to create a secret and access Parquet files in a storage account with connection string (which contains the account key), but not a SAS URL [1] which can be used to add restrictions or a lifetime.

Attempts to use the SAS URL as a CONNECTION_STRING result in A invalid connection string has been provided. or URL using bad/illegal format or missing URL.

If SAS URLs are supported, which form of authentication secret should be used? https://duckdb.org/docs/extensions/azure.html#authentication

[1] https://learn.microsoft.com/en-us/rest/api/storageservices/delegate-access-with-shared-access-signature

@kmatt
Copy link
Author

kmatt commented Sep 10, 2024

Related to #67 ?

  • Full path to a single Parquet file works, using a wildcard fails with 404 (The specified blob does not exist.)

  • Creating a secret with the account key (not a SAS token) works with a wildcard. Account keys unfortunately provide full access to the storage account at the root directory.

@fdcastel
Copy link

fdcastel commented Sep 12, 2024

I have the exact same problem.

The current documentation is vague about how to properly authenticate using a SAS token.

#67 implies that we should pass the SAS token as the connection string:

CREATE SECRET secret1 (
    TYPE AZURE,
    CONNECTION_STRING 'sp=rcwdl&st=2024-03-20T19:53:11Z&se=2026-09-30T23:56:13Z&spr=https&sv=2022-11-02&sr=c&sig=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'
)

However, trying to run a simple query over a parquet file:

SELECT 
    count(*)
FROM 
    read_parquet('abfss://MY_STORAGE.dfs.core.windows.net/MY_CONTAINER/MY_FOLDER/MY_FILE.parquet');

gives me the following error:

SQL Error: java.sql.SQLException: Invalid Input Error: A invalid connection string has been provided.

So, what is the correct way to read from an Azure endpoint using a SAS token as the authentication method?

Using DuckDB v1.1.0 via DBeaver 24.2.0.202409011551

@fdcastel
Copy link

Update: I got it working with:

CREATE OR REPLACE SECRET secret1 (
    TYPE AZURE,
    CONNECTION_STRING 'AccountName=MY_STORAGE;SharedAccessSignature=MY_SAS_TOKEN'
);

This worked with both forms:

SELECT count(*)
FROM 'az://MY_STORAGE.blob.core.windows.net/MY_CONTAINER/MY_FOLDER/MY_FILE.parquet';

and

SELECT count(*)
FROM read_parquet('abfss://MY_STORAGE.dfs.core.windows.net/MY_CONTAINER/MY_FOLDER/MY_FILE.parquet');

@kmatt
Copy link
Author

kmatt commented Sep 13, 2024

@fdcastel Did you determine that from documentation (link?) or through trial and error?

@fdcastel
Copy link

Both.

The official docs are here.

However, unfortunately, they don't show exactly this scenario (using just AccountName and SharedAccessSignature).

@kmatt
Copy link
Author

kmatt commented Sep 17, 2024

Solved in #76 (comment)

@kmatt kmatt closed this as completed Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants