Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: list_tables method in glue catalog now only return tables. #1258

Merged
merged 2 commits into from
Oct 31, 2024

Conversation

omkenge
Copy link
Contributor

@omkenge omkenge commented Oct 28, 2024

Description:

This pull request addresses an issue where the list_tables function was returning views alongside Iceberg tables. Since views lack the table_type property or have it set differently than "iceberg", they were incorrectly included in the list of tables. This change modifies the filtering logic to ensure that only actual Iceberg tables are returned.

Problem:

In the existing list_tables function, both Iceberg tables and Glue views were being returned. Glue views generally do not contain the table_type parameter or may have it unset, which causes them to be mistakenly included when listing Iceberg tables in a namespace. This behavior makes it challenging for users to retrieve only Iceberg tables as intended.

Solution:

The __is_iceberg_table helper method has been modified to:
Check if the table_type parameter is present in each table’s properties.
Ensure that the table_type parameter is explicitly set to "iceberg".

Testing:

Tests were added to verify that:
Only Iceberg tables are returned by list_tables.
Glue views, which lack the table_type parameter or have it set to a non-Iceberg value, are excluded from the list.

@omkenge
Copy link
Contributor Author

omkenge commented Oct 28, 2024

Hi Team , @kevinjqliu @sungwy
While working I faced this issue

Error Details:

When views are included in the output of list_tables and subsequently processed, the absence of the table_type parameter often raises an error similar to:
NoSuchPropertyException: Property 'table_type' missing, could not determine type
This PR resolves this error by ensuring that only tables with table_type set to "iceberg" are included in the final list.
Plz Check IT

@Fokko
Copy link
Contributor

Fokko commented Oct 30, 2024

Hey @omkenge thanks for pointing this out. I was unaware of this issue. Are you interested in providing a patch to fix this?

@omkenge
Copy link
Contributor Author

omkenge commented Oct 30, 2024

Hi @Fokko
I fix this issue .. now list_tables method only return iceberg tables...
This issue I saw with AWS glue catalog ...
In AWS does not provide table_type for views .. so we can differentiate it on this basis
The __is_iceberg_table helper method has been modified to:
Check if the table_type parameter is present in each table’s properties.
Ensure that the table_type parameter is explicitly set to "iceberg".

Copy link
Contributor

@HonahX HonahX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@omkenge Thanks for reporting the issue and providing the fix!

NoSuchPropertyException: Property 'table_type' missing, could not determine type

I was able to reproduce this issue and verify that this PR can fix it.

Let's get this into 0.8.0!

pyiceberg/catalog/glue.py Outdated Show resolved Hide resolved
@omkenge
Copy link
Contributor Author

omkenge commented Oct 31, 2024

HI @HonahX
Changes are in ...

@HonahX HonahX added this to the PyIceberg 0.8.0 release milestone Oct 31, 2024
@HonahX HonahX merged commit d5b51f9 into apache:main Oct 31, 2024
7 checks passed
@kevinjqliu
Copy link
Contributor

Thanks for the contribution @omkenge !

@omkenge omkenge deleted the improvement/list_tables branch October 31, 2024 18:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants