[#6366] improvement(hadoop, authorization): support getting location of schema for gcs hadoop catalog. #6367
@@ -445,7 +447,7 @@ public static List<String> getMetadataObjectLocation(
// The Hive default schema location is Hive warehouse directory
String defaultSchemaLocation =
getHiveDefaultLocation(ident.name(), catalogObj.name());
if (defaultSchemaLocation != null && !defaultSchemaLocation.isEmpty()) {
if (StringUtils.isNotBlank(defaultSchemaLocation)) {
💯
@@ -462,6 +464,13 @@ public static List<String> getMetadataObjectLocation(
if (defaultSchemaLocation != null && !defaultSchemaLocation.isEmpty()) {
Change this as well?
Sure
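To illustrate what the switch to `StringUtils.isNotBlank` changes, here is a minimal sketch. It does not call Commons Lang; `isNotBlank` below is a local stand-in written to mirror the documented behavior (null, empty, and whitespace-only strings are all rejected), and the class name and sample path are hypothetical.

```java
final class BlankCheckSketch {
  // The check used before the change: rejects null and "", accepts "   ".
  static boolean oldCheck(String s) {
    return s != null && !s.isEmpty();
  }

  // Local stand-in for StringUtils.isNotBlank (assumption: same semantics):
  // true only if at least one non-whitespace character is present.
  static boolean isNotBlank(CharSequence cs) {
    if (cs == null) {
      return false;
    }
    for (int i = 0; i < cs.length(); i++) {
      if (!Character.isWhitespace(cs.charAt(i))) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical sample values; the last one is an invented location.
    String[] samples = {null, "", "   ", "gs://example-bucket/warehouse"};
    for (String s : samples) {
      System.out.println("[" + s + "] old=" + oldCheck(s) + " new=" + isNotBlank(s));
    }
  }
}
```

The two checks only disagree on whitespace-only input, which the old condition would have added to the location list.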
if (Objects.equals(catalogProvider, "hive")
|| Objects.equals(catalogProvider, "hadoop")) {
Why don't we just do string comparison?
It's easy for a newcomer to hit an NPE with stringVariable.equals("value") if stringVariable is null.
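A minimal sketch of the point above, using only `java.util.Objects`. The class and method names are hypothetical; only the `Objects.equals` condition comes from the diff.

```java
import java.util.Objects;

final class ProviderCheckSketch {
  // Null-safe: Objects.equals(null, "hive") returns false instead of throwing.
  static boolean isHiveOrHadoop(String catalogProvider) {
    return Objects.equals(catalogProvider, "hive")
        || Objects.equals(catalogProvider, "hadoop");
  }

  // Also null-safe: calling equals on the literal, not on the variable.
  static boolean isHiveOrHadoopLiteralFirst(String catalogProvider) {
    return "hive".equals(catalogProvider) || "hadoop".equals(catalogProvider);
  }

  public static void main(String[] args) {
    String provider = null;
    System.out.println(isHiveOrHadoop(provider));
    System.out.println(isHiveOrHadoopLiteralFirst(provider));
    try {
      // The pattern the comment warns about: equals called on a null variable.
      provider.equals("hive");
    } catch (NullPointerException e) {
      System.out.println("NPE from provider.equals(\"hive\")");
    }
  }
}
```

Putting the literal first is an equally safe plain string comparison, so either form addresses the NPE concern.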
locations.add(defaultSchemaLocation);
}
} else if (catalogObj.provider().equals("hadoop")) {
String catalogLocation = catalogObj.properties().get(HiveConstants.LOCATION);
I'm a little surprised that our core module relies on the catalog-common module, which is not what I want. As its name suggests, catalog-common should only be used by the catalog-* modules and by other modules that rely on catalogs. For the core module the dependency runs in the opposite direction, so we should fix this.
I saw it was introduced by @FANNG1, so we really should fix it.
Besides, why does a hadoop catalog need to read a Hive property definition? That seems odd to me.
> Besides, why a hadoop catalog needs to get a Hive property definition? It's weird to me.
I just wanted to reuse the string constant location. Indeed, using a constant from HiveConstants is not very elegant; I will introduce a similar value in HadoopConstants.
We can have a common constant
The name catalog-common is a little confusing. By design, it contains the constants shared by catalogs, connectors, the Iceberg REST server, etc.
Then these constants should be moved to the common module. Misusing the modules this way means the dependency structure is no longer a tree; it is more like a graph, which is not good. We should bear in mind to avoid such cyclic dependencies.
Besides, for a catalog or connector, depending on catalog-common is reasonable, because it is downstream of the catalog-common module. But the core module is upstream of the catalog-common module, so this is not a good implementation.
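One way the suggestion of a common constant could look, as a rough sketch only. `CommonPropertyKeys` and `SchemaLocationLookup` are hypothetical names, not existing classes in the project; the key value "location" is assumed from the comment about reusing the location string constant, and the sample path is invented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder that would live in the shared common module, so that
// core would not need to depend on catalog-common for a property key.
final class CommonPropertyKeys {
  static final String LOCATION = "location"; // assumed key value

  private CommonPropertyKeys() {}
}

// Hypothetical stand-in for the core-side lookup shown in the diff.
final class SchemaLocationLookup {
  static String catalogLocation(Map<String, String> catalogProperties) {
    return catalogProperties.get(CommonPropertyKeys.LOCATION);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(CommonPropertyKeys.LOCATION, "gs://example-bucket/warehouse");
    System.out.println(catalogLocation(props));
  }
}
```

With the key defined in a module that both core and the catalogs already depend on, neither Hive-specific nor Hadoop-specific constants leak into core.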
@@ -472,19 +480,20 @@ public static List<String> getMetadataObjectLocation(
.catalogDispatcher()
.loadCatalog(
NameIdentifier.of(ident.namespace().level(0), ident.namespace().level(1)));
LOG.info("Catalog provider is %s", catalogObj.provider());
if (catalogObj.provider().equals("hive")) {
LOG.info("Catalog provider is {}", catalogObj.provider());
This log statement is only useful for debugging; better to remove it.
Is this PR a duplicate of #6211?
Closing it temporarily, as it duplicates #6211.
What changes were proposed in this pull request?
Support getting the location of a schema for the GCS Hadoop catalog.
Why are the changes needed?
It's an improvement.
Fix: #6366
Does this PR introduce any user-facing change?
N/A.
How was this patch tested?
UT