ADLSgen2 implementation guidance #1

Open
wants to merge 2 commits into
base: main

Conversation

@blueww (Owner) commented Jul 10, 2023

Add a doc for ADLS Gen2 implementation guidance, to better cooperate with the community on implementing ADLS Gen2 in Azurite.

@@ -0,0 +1,134 @@
### Background
Azurite is an open-source Azure Storage API compatible server (emulator). It currently supports the Blob, Queue, and Table services. We have received many customer asks on ADLSGen2 support in Azurite from many channels, include but not limited to [github issues](https://github.com/Azure/Azurite/issues/553), email, and requests from interested teams at Microsoft.

Correct "ADLSGen2" to "ADLS Gen2" or "Azure Data Lake Storage Gen2"

Owner Author

will update.

We have get 2 PRs ([PR1](https://github.com/Azure/Azurite/pull/1933), [PR2](https://github.com/Azure/Azurite/pull/1934)) submitted by the community , try to implement ADLSgen2 in Azurite. However, we can't merge them now since they might not meet our expectation and merge bar.

submitted by the community to implement ADLSgen2 in Azurite. However, we are unable to merge them at this time since they do not meet our expectation and merge bar.

Owner Author

Will update.


Azurite welcome contribution. To better coorporate with community on implement ADLSGen2 in Azurite, this document gives the details of the plan we suggest to implement ADLS Gen 2 in Azurite, and our expectations for community submissions that we can accept as PRs.

Azurite welcomes contributions. To better cooperate with the community on an implementation of ADLS Gen2 in Azurite

Owner Author

Will update

### ADLSGen2 Introduction
It's very important to understand ADLSgen2 feature before implementing it in Azurite.

[Azure Data Lake Storage Gen2](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-introduction) (aka: AdlsGen2) is a set of capabilities dedicated to big data analytics and built on Azure Blob Storage.

Remove en-us/ culture from all URLs


#### FNS vs. HNS
A normal Azure storage account is with Flat namespace (FNS).

Storage accounts can be configured with a flat namespace (FNS) or a hierarchical namespace (HNS). By default, storage accounts are FNS.

Owner Author

will update all links.

1. Don’t need to change data store structure.
2. Don’t need Azurite user to differ HNS/FNS account.

2. The change will add all dfs API interface to Azurite, which can help to support phase II.

capitalize DFS

Owner Author

Will update all DFS.


4. Code change should be split into several small PRs as following:

as follows:

Owner Author

will update

[minor]

Suggested change
We have get 2 PRs ([PR1](https://github.com/Azure/Azurite/pull/1933), [PR2](https://github.com/Azure/Azurite/pull/1934)) submitted by the community , try to implement ADLSgen2 in Azurite. However, we can't merge them now since they might not meet our expectation and merge bar.
We have get 2 PRs ([PR1](https://github.com/Azure/Azurite/pull/1933), [PR2](https://github.com/Azure/Azurite/pull/1934)) submitted by the community, try to implement ADLSgen2 in Azurite. However, we can't merge them now since they might not meet our expectations and merge bar.

Owner Author

Will update.

2. 1 PR to add DFS endpoint
3. Several PRs to implement each dfs API (with credential handler), include testing

5. Need make sure each API behavior is aligned on rest API doc , also aligned with real Azure Server. See more in validation criteria.

@XiaoningLiu XiaoningLiu Jul 11, 2023

Suggested change
5. Need make sure each API behavior is aligned on rest API doc , also aligned with real Azure Server. See more in validation criteria.
5. Need make sure each API behavior is aligned on rest API doc, also aligned with real Azure Storage services. See more in validation criteria.

Owner Author

Will update.

1. E.g. .Net SDK need changes the blob/dfs Uri convert function in [this file](https://github.com/Azure/azure-sdk-for-net/blob/e8c40cc204b8cf750fcc820eab90d11f80612c3a/sdk/storage/Azure.Storage.Files.DataLake/src/DataLakeUriBuilder.cs#L275)
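
As a rough illustration of the blob/DFS URI conversion mentioned above, the following TypeScript sketch handles two URL shapes: a production-style host, where the `.blob.` label can be swapped for `.dfs.`, and an emulator-style IP host, where no such label exists and the service is identified by port. This is not the .NET SDK's implementation. The function name `toDfsUrl` and the DFS port `10004` are hypothetical; only the Blob port `10000` is Azurite's existing default.

```typescript
// Azurite's default Blob service port.
const BLOB_PORT = "10000";
// ASSUMPTION: a DFS port chosen purely for illustration; Azurite defines no such port here.
const ASSUMED_DFS_PORT = "10004";

/**
 * Convert a Blob endpoint URL to the corresponding DFS endpoint URL.
 * Hypothetical helper, not part of any SDK.
 */
function toDfsUrl(blobUrl: string): string {
  const url = new URL(blobUrl);
  if (url.hostname.includes(".blob.")) {
    // Production-style host: myaccount.blob.core.windows.net -> myaccount.dfs.core.windows.net
    url.hostname = url.hostname.replace(".blob.", ".dfs.");
  } else if (url.port === BLOB_PORT) {
    // Emulator-style host: the account name is in the path, so only the port changes.
    url.port = ASSUMED_DFS_PORT;
  }
  return url.toString();
}

console.log(toDfsUrl("https://myaccount.blob.core.windows.net/container/dir/file.txt"));
console.log(toDfsUrl("http://127.0.0.1:10000/devstoreaccount1/container/dir/file.txt"));
```

A host-label swap alone leaves emulator URLs unchanged, which is why an SDK-side conversion function would need a second rule of some kind for IP-style endpoints.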

##### Phase II: implementation Blob/DFS on HNS account
1. Azurite user need configure each Azurite Account type as HNS/FNS when Azurite starts up.

Suggested change
1. Azurite user need configure each Azurite Account type as HNS/FNS when Azurite starts up.
1. Azurite users need to configure each Azurite Account type as HNS/FNS when Azurite starts up.

Owner Author

Will update.

##### Phase II: implementation Blob/DFS on HNS account
1. Azurite user need configure each Azurite Account type as HNS/FNS when Azurite starts up.
1. Need design how to input the config (default should be FNS)
2.How to handle it when user start Azurite with change account type? (Report error? )

I suggest that Azurite not support FNS/HNS migration or account type changes. The storage account type is fixed after creation. A wrong configuration (such as a mismatched HNS/FNS type) should report an error.

Suggested change
2.How to handle it when user start Azurite with change account type? (Report error? )
2. How to handle it when user start Azurite with change account type? (Report error? )

Owner Author

Add to Phase II 1.iii.
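
The behavior discussed in this thread (default to FNS, fix the account type after creation, report an error on a mismatch) could look roughly like the following TypeScript sketch. The names `AccountType` and `resolveAccountType`, and the notion of a persisted type read back from the data store, are assumptions for illustration and not existing Azurite code.

```typescript
// Hypothetical account namespace types for illustration.
type AccountType = "FNS" | "HNS";

/**
 * Decide the account type at startup.
 * - requested: the type given in the startup configuration, if any
 * - persisted: the type recorded when the account was first created, if any
 * Defaults to FNS and rejects any attempt to change an existing account's type.
 */
function resolveAccountType(
  requested: AccountType | undefined,
  persisted: AccountType | undefined
): AccountType {
  const effective: AccountType = requested ?? "FNS";
  if (persisted !== undefined && persisted !== effective) {
    // No FNS/HNS migration: a mismatch is reported as a configuration error.
    throw new Error(
      `Account type mismatch: account was created as ${persisted} but configured as ${effective}.`
    );
  }
  return effective;
}

console.log(resolveAccountType(undefined, undefined)); // new account, no config
console.log(resolveAccountType("HNS", "HNS")); // existing account, matching config
```

Failing fast at startup keeps the emulator from serving requests against data laid out for a different namespace model.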

@blueww (Owner, Author) commented Jul 11, 2023

@schoag-msft, @XiaoningLiu

Thanks for the review!
I have updated the PR per your review comments.
Could you please check whether you have any further comments?

I have also added a draft wiki page for it: https://github.com/blueww/Azurite/wiki/ADLS-Gen2-Implementation-Guidance.md

3 participants