Geoparquet column 'geometry' does not have geometry types #423

Closed
jutiss opened this issue Oct 4, 2024 · 5 comments · Fixed by duckdb/duckdb#14297

Comments

@jutiss

jutiss commented Oct 4, 2024

After upgrading to 1.1.0, it is no longer possible to open old parquet files created with WKB geometry.

Invalid Input Error: Geoparquet column 'geometry' does not have geometry types

The error occurs even when we do not select geometry columns. I assume that DuckDB is trying to automatically convert geometry columns to GEOMETRY type. Is there any way to disable this? Or maybe you can recommend some other way to handle this?

@carlopi
Contributor

carlopi commented Oct 4, 2024

This looks to me connected to: duckdb/duckdb-node#124.

A very rough workaround is suggested there, but we aim to solve this at the DuckDB library level.

@Maxxen
Member

Maxxen commented Oct 4, 2024

How was the GeoParquet file created? This error occurs because DuckDB detects that the file is a geoparquet file based on the presence of geoparquet key-value metadata in the footer, but the geometry_types REQUIRED field is missing (or is not an array). Theoretically this metadata isn't used by DuckDB anyway, so we could just ignore its presence, but that would technically be against the geoparquet standard.
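
To make the check described above concrete, here is a minimal Python sketch of that kind of validation. It is not DuckDB's actual implementation: the function name `columns_missing_geometry_types` and both sample metadata documents are hypothetical. It assumes the value of the `geo` footer key has already been extracted as a JSON string and follows the GeoParquet layout of a `columns` object keyed by column name.

```python
import json


def columns_missing_geometry_types(geo_json: str) -> list[str]:
    """Return geometry columns whose `geometry_types` field is absent or not an array."""
    meta = json.loads(geo_json)
    columns = meta.get("columns", {})
    bad = []
    for name, spec in columns.items():
        # A column passes only if its spec is an object holding a list under `geometry_types`.
        if not isinstance(spec, dict) or not isinstance(spec.get("geometry_types"), list):
            bad.append(name)
    return bad


# Hypothetical metadata resembling a pre-1.0 draft, assumed here to use
# a singular `geometry_type` key instead of the required array.
old_style = json.dumps({
    "version": "0.4.0",
    "primary_column": "geometry",
    "columns": {"geometry": {"encoding": "WKB", "geometry_type": "Point"}},
})

# Hypothetical metadata that carries the required `geometry_types` array.
new_style = json.dumps({
    "version": "1.0.0",
    "primary_column": "geometry",
    "columns": {"geometry": {"encoding": "WKB", "geometry_types": ["Point"]}},
})

print(columns_missing_geometry_types(old_style))
print(columns_missing_geometry_types(new_style))
```

A file whose metadata looks like `old_style` would be the kind that triggers the error in this issue, since the required field is missing even though the `geo` key is present.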

@jutiss
Author

jutiss commented Oct 7, 2024

@Maxxen Unfortunately, I don't have all the details of how these parquet files were created, as they have been transferred to me. In the metadata, I see that something called parquet-cpp-arrow version 10.0.1 was used and the person processing them told me he used geopandas.

@Maxxen
Member

Maxxen commented Oct 7, 2024

Alright, it might be that the file was based on one of the early drafts of GeoParquet, written before the spec was finalized. Either way, I'll change it on our end so that we don't throw an error if we can't detect all the required GeoParquet metadata; in that case we will fall back to reading the file as a normal Parquet file.

@Maxxen
Member

Maxxen commented Oct 11, 2024

We've now added a setting to control geoparquet conversion, so you can now work around this by doing

SET enable_geoparquet_conversion = false

to keep reading the files as normal parquet, and then perform the WKB conversion manually using ST_GeomFromWKB.
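
Put together, the workaround might look like the following Python sketch. The helper names, the file path, and the column name `geometry` are assumptions for illustration. It also assumes the `duckdb` Python package is installed, that the DuckDB build includes the `enable_geoparquet_conversion` setting, and that the spatial extension (which provides ST_GeomFromWKB) can be installed and loaded.

```python
def workaround_statements(path: str, column: str = "geometry") -> list[str]:
    """Build the SQL statements for reading a file as plain Parquet and decoding WKB by hand."""
    quoted_path = path.replace("'", "''")    # escape single quotes in the string literal
    quoted_col = column.replace('"', '""')   # escape double quotes in the identifier
    query = (
        'SELECT * REPLACE (ST_GeomFromWKB("{c}") AS "{c}") '
        "FROM read_parquet('{p}')"
    ).format(c=quoted_col, p=quoted_path)
    return [
        "INSTALL spatial",
        "LOAD spatial",
        # Turn off the automatic GeoParquet handling so the column is read as raw bytes.
        "SET enable_geoparquet_conversion = false",
        query,
    ]


def read_with_manual_wkb(path: str, column: str = "geometry"):
    """Run the statements above; needs the third-party `duckdb` package."""
    import duckdb  # imported lazily so the statement builder works without it

    con = duckdb.connect()
    *setup, query = workaround_statements(path, column)
    for statement in setup:
        con.execute(statement)
    return con.execute(query).fetchall()
```

Calling `read_with_manual_wkb("old_file.parquet")` (a hypothetical path) would return the rows with the WKB column replaced by its decoded geometry.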
