[#6366] improvement(hadoop, authorization): support getting location of schema for gcs hadoop catalog. #6367

yuqi1129 · 2025-01-24T13:09:26Z

What changes were proposed in this pull request?

Support getting the schema location for the GCS Hadoop catalog.

Why are the changes needed?

It's an improvement.

Fix: #6366

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

Unit tests (UT).

tengqm · 2025-01-26T06:49:50Z

core/src/main/java/org/apache/gravitino/authorization/AuthorizationUtils.java

@@ -445,7 +447,7 @@ public static List<String> getMetadataObjectLocation(
                        // The Hive default schema location is Hive warehouse directory
                        String defaultSchemaLocation =
                            getHiveDefaultLocation(ident.name(), catalogObj.name());
-                        if (defaultSchemaLocation != null && !defaultSchemaLocation.isEmpty()) {
+                        if (StringUtils.isNotBlank(defaultSchemaLocation)) {


tengqm · 2025-01-26T06:50:28Z

core/src/main/java/org/apache/gravitino/authorization/AuthorizationUtils.java

@@ -462,6 +464,13 @@ public static List<String> getMetadataObjectLocation(
              if (defaultSchemaLocation != null && !defaultSchemaLocation.isEmpty()) {


Change this as well?
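For reference, the two checks in this hunk are not strictly equivalent: `StringUtils.isNotBlank` also rejects whitespace-only strings, while the `!= null && !isEmpty()` form accepts them. A minimal sketch of the difference follows. The `isNotBlank` helper here is a hypothetical stand-in written for this example so that it runs without commons-lang3, and the sample path is made up.

```java
final class BlankCheckDemo {
  // Hypothetical stand-in for this sketch only; it mirrors the documented
  // behaviour of StringUtils.isNotBlank (false for null, "" and whitespace-only).
  static boolean isNotBlank(String s) {
    return s != null && !s.chars().allMatch(Character::isWhitespace);
  }

  // The check used before the change in the diff.
  static boolean isNotNullOrEmpty(String s) {
    return s != null && !s.isEmpty();
  }

  public static void main(String[] args) {
    // "gs://bucket/schema" is an illustrative value, not taken from the PR.
    String[] samples = {null, "", "   ", "gs://bucket/schema"};
    for (String s : samples) {
      System.out.println(
          "[" + s + "] notNullOrEmpty=" + isNotNullOrEmpty(s) + " notBlank=" + isNotBlank(s));
    }
  }
}
```

Only the whitespace-only sample is treated differently by the two checks, so switching both call sites keeps the behaviour consistent.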

tengqm · 2025-01-26T06:53:22Z

core/src/main/java/org/apache/gravitino/authorization/AuthorizationUtils.java

+            if (Objects.equals(catalogProvider, "hive")
+                || Objects.equals(catalogProvider, "hadoop")) {


Why don't we just do string comparison?

A newcomer can easily hit an NPE with something like stringVariable.equals("value") when stringVariable is null.
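The point about null safety can be sketched as follows. The class and method names are hypothetical and exist only for this illustration; `Objects.equals` is the standard `java.util.Objects` method used in the diff.

```java
import java.util.Objects;

final class ProviderCheckDemo {
  // Null-safe form, as in the diff: Objects.equals(null, "hive") is simply false.
  static boolean isHiveOrHadoop(String catalogProvider) {
    return Objects.equals(catalogProvider, "hive")
        || Objects.equals(catalogProvider, "hadoop");
  }

  // Also null-safe: calling equals on the literal never dereferences the variable.
  static boolean isHiveOrHadoopLiteralFirst(String catalogProvider) {
    return "hive".equals(catalogProvider) || "hadoop".equals(catalogProvider);
  }

  public static void main(String[] args) {
    String provider = null;
    System.out.println(isHiveOrHadoop(provider));
    System.out.println(isHiveOrHadoopLiteralFirst(provider));
    try {
      // Calling equals on the variable itself throws when the variable is null.
      System.out.println(provider.equals("hive"));
    } catch (NullPointerException e) {
      System.out.println("NPE from provider.equals(\"hive\")");
    }
  }
}
```

Either null-safe form works; putting the literal first is a common alternative when pulling in `Objects` feels heavy.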

jerryshao · 2025-01-26T09:31:13Z

core/src/main/java/org/apache/gravitino/authorization/AuthorizationUtils.java

                locations.add(defaultSchemaLocation);
              }
+            } else if (catalogObj.provider().equals("hadoop")) {
+              String catalogLocation = catalogObj.properties().get(HiveConstants.LOCATION);


I'm a little surprised that our core module relies on the catalog-common module, which is not what I want. As its name suggests, catalog-common should only be used by catalog-* modules and by other modules that rely on catalogs. For the core module the dependency direction is the opposite, so we should fix this.

I saw it was introduced by @FANNG1, so we really should fix it.

Besides, why does a hadoop catalog need to get a Hive property definition? It seems weird to me.

Besides, why does a hadoop catalog need to get a Hive property definition? It seems weird to me.

I just wanted to reuse the string constant for location. Indeed, it's not so elegant to use constants from HiveConstants; I will introduce a similar value in HadoopConstants.

We can have a common constant.

The name catalog-common is a little confusing. By design, it contains the constants shared by catalogs, connectors, the Iceberg REST server, etc.

Then these constants should be moved to the common module. The misuse of the modules means the dependency structure is no longer a tree; it's more like a graph, which is really not good. We should bear in mind to avoid such cyclic dependencies.

Besides, for a catalog/connector, depending on catalog-common is reasonable, because it is downstream of the catalog-common module. But the core module is upstream of the catalog-common module, so this is not a good implementation.
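One way the "common constant" idea from this thread could look is sketched below. Everything here is an assumption for illustration: the class name `SharedCatalogProperties`, the key value `"location"`, and the sample path are hypothetical and are not taken from the Gravitino code base.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical holder that would live in a module both core and catalogs may depend on.
final class SharedCatalogProperties {
  // Assumed key name; the real constant and its value may differ.
  static final String LOCATION = "location";

  private SharedCatalogProperties() {}

  // Returns the catalog location if the property is present and not blank.
  static Optional<String> catalogLocation(Map<String, String> properties) {
    String value = properties.get(LOCATION);
    return (value == null || value.trim().isEmpty()) ? Optional.empty() : Optional.of(value);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(LOCATION, "gs://example-bucket/warehouse"); // illustrative value
    System.out.println(catalogLocation(props).orElse("<none>"));
  }
}
```

With a key like this in a shared module, the hadoop branch would not need to reference HiveConstants, and core would not need to depend on catalog-common for it.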

jerryshao · 2025-01-26T09:34:42Z

core/src/main/java/org/apache/gravitino/authorization/AuthorizationUtils.java

@@ -472,19 +480,20 @@ public static List<String> getMetadataObjectLocation(
                    .catalogDispatcher()
                    .loadCatalog(
                        NameIdentifier.of(ident.namespace().level(0), ident.namespace().level(1)));
-            LOG.info("Catalog provider is %s", catalogObj.provider());
-            if (catalogObj.provider().equals("hive")) {
+            LOG.info("Catalog provider is {}", catalogObj.provider());


This LOG is useless; it is only for debugging, so it's better to remove it.

jerqi · 2025-01-26T09:44:54Z

Is this PR a duplicate of #6211?

Add other changes.

yuqi1129 · 2025-01-26T09:47:33Z

Is this PR a duplicate of #6211?

Indeed, I checked and it duplicates #6211. Let's wait for that one to merge and then test the authorization logic for GCS fileset.

yuqi1129 · 2025-01-27T03:04:43Z

Closing it temporarily as it duplicates #6211.

fix

c7968f6

yuqi1129 requested review from xunliu and jerqi and removed request for xunliu January 24, 2025 13:09

yuqi1129 self-assigned this Jan 24, 2025

fix again

a79464a

tengqm reviewed Jan 26, 2025

Resovle comments

0739a44

jerqi previously approved these changes Jan 26, 2025

jerryshao reviewed Jan 26, 2025

yuqi1129 closed this Jan 27, 2025
