GeoJSON schemas generation #1349
Hi @MTachon, I like the idea, but I'd rather you move the changes for pydantic 2 into a separate PR. That would make this PR cleaner and keep it within the scope of its title.
The changes related to the migration to pydantic V2 are now moved to PR #1353
Hi @MTachon, thanks for the move. A rebase onto the master branch would have been better and cleaner than a merge. Let's wait for the review from @tomkralidis.
…ors.py Allow to specify custom validators for the fields of the pydantic models to create
…nthough it can be empty
…eature_collection_model
Will clean up the branch.
fdec247 to b511c05
Force-pushed the rebased branch.
+1, let's wait for @tomkralidis's review.
Use 'None' as the default value for the 'bbox' property, as typing.Optional is not JSON serializable. Use Literal[None] instead of Optional[None] for the 'geometry' and 'properties' properties when the create_geojson_feature* functions are called with their 'geom_type' and 'properties' parameters set to 'None', respectively.
Subclass pydantic.json_schema.GenerateJsonSchema, and take care of removing 'default' values for 'bbox' and 'id' properties when generating JSON schema for GeoJSON Feature and GeoJSON FeatureCollection.
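The commit message above describes removing the 'default' entries by subclassing pydantic.json_schema.GenerateJsonSchema. As a dependency-free illustration of the same outcome, the sketch below swaps in a different technique: it post-processes an already generated schema dictionary. The function name `strip_property_defaults` and the sample schema are hypothetical and not taken from the PR; the only pydantic-specific assumption is that nested model definitions live under the `$defs` key, as in pydantic v2 output.

```python
import copy


def strip_property_defaults(schema, names=("bbox", "id")):
    """Return a copy of a JSON schema dict without 'default' on the named properties.

    Hypothetical helper for illustration; the PR itself does this inside a
    GenerateJsonSchema subclass instead of post-processing the dict.
    """
    cleaned = copy.deepcopy(schema)
    # Look at the top-level object and at nested definitions under "$defs",
    # e.g. a Feature model embedded in a FeatureCollection schema.
    targets = [cleaned] + list(cleaned.get("$defs", {}).values())
    for target in targets:
        properties = target.get("properties", {})
        for name in names:
            prop = properties.get(name)
            if isinstance(prop, dict):
                prop.pop("default", None)
    return cleaned


# Hypothetical schema fragment resembling a generated GeoJSON Feature schema.
feature_schema = {
    "type": "object",
    "properties": {
        "type": {"const": "Feature"},
        "id": {"anyOf": [{"type": "string"}, {"type": "integer"}], "default": None},
        "bbox": {"type": "array", "items": {"type": "number"}, "default": None},
    },
    "required": ["type", "geometry", "properties"],
}

cleaned_schema = strip_property_defaults(feature_schema)
```

The input dictionary is left untouched, so the same generated schema can still be inspected with its defaults if needed.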
8c1d205 to d4bfcd3
Hi @MTachon, can we move this to a plugin? (cc @tomkralidis)
Something like pygeoapi/provider/vector_data_validator.py or similar.
@francbartoli and @tomkralidis, some reflections after discussing with @francbartoli:
Some comments:
In summary, I think the above can be realized without using pydantic, in the interest of robustness and long-term sustainability. If this is not possible, this could also be added as a 'validating' provider that other providers can inherit from if they so choose.
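One possible shape for such a 'validating' provider, using only the standard library, is sketched below. Every name here (`InvalidFeatureError`, `ValidatingProviderMixin`, `InMemoryProvider`) is hypothetical, and `InMemoryProvider` is a stand-in rather than pygeoapi's actual `BaseProvider`. The check only covers the members RFC 7946 requires on a Feature (`type`, `geometry`, `properties`).

```python
class InvalidFeatureError(ValueError):
    """Raised when an incoming item is not a structurally valid GeoJSON Feature."""


class ValidatingProviderMixin:
    """Hypothetical mixin: validate an item, then delegate to the real provider."""

    def validate_feature(self, item):
        if not isinstance(item, dict) or item.get("type") != "Feature":
            raise InvalidFeatureError("item must be an object with type 'Feature'")
        # RFC 7946 requires both members; their values may be null.
        for member in ("geometry", "properties"):
            if member not in item:
                raise InvalidFeatureError(f"missing required member '{member}'")

    def create(self, item):
        self.validate_feature(item)
        return super().create(item)


class InMemoryProvider:
    """Stand-in for a concrete provider (assumption, not pygeoapi code)."""

    def __init__(self):
        self.items = []

    def create(self, item):
        self.items.append(item)
        return len(self.items) - 1


class ValidatedInMemoryProvider(ValidatingProviderMixin, InMemoryProvider):
    """Providers opt in to validation by listing the mixin first."""
```

Providers that do not want validation simply leave the mixin out, which matches the "if they so choose" part of the suggestion.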
Thanks for the feedback, @tomkralidis!
I am sure this could be implemented without pydantic. But I think this is a trade-off between not relying on an additional external dependency on the one hand, and a simpler, more compact implementation with more flexible and powerful data validation on the other, IMHO.
I see that I am not checking for clockwise vs. counterclockwise direction for linear rings, as mentioned in https://www.rfc-editor.org/rfc/rfc7946#section-3.1.6, in the default validator functions. I guess I am better off using shapely's validation functions.
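For reference, both checks can be sketched without shapely or pydantic. The example below is not the PR's validator code: the function names are hypothetical, and the winding test uses the planar shoelace formula on raw coordinates, which is an approximation that ignores antimeridian crossings. RFC 7946 also notes that parsers should not reject polygons that break the right-hand rule, so the winding messages are better treated as warnings than as hard failures.

```python
def is_closed_ring(ring):
    """A linear ring has four or more positions, with identical first and last."""
    return len(ring) >= 4 and list(ring[0]) == list(ring[-1])


def signed_area(ring):
    """Planar shoelace formula; positive for counterclockwise rings."""
    total = 0.0
    # Extra elements (e.g. altitude) in a position are ignored.
    for (x1, y1, *_), (x2, y2, *_) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def check_polygon(coordinates):
    """Return a list of problems found in a Polygon's 'coordinates' member."""
    problems = []
    for index, ring in enumerate(coordinates):
        if not is_closed_ring(ring):
            problems.append(f"ring {index}: not a closed linear ring")
            continue
        counterclockwise = signed_area(ring) > 0
        if index == 0 and not counterclockwise:
            problems.append("ring 0: exterior ring is not counterclockwise")
        elif index > 0 and counterclockwise:
            problems.append(f"ring {index}: interior ring is not clockwise")
    return problems
```

A degenerate ring with zero area is reported as not counterclockwise by this sketch; a library such as shapely would give a more complete validity check.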
As per RFC4, this Pull Request has been inactive for 90 days. In order to manage maintenance burden, it will be automatically closed in 7 days.
As per RFC4, this Pull Request has been closed due to there being no activity for more than 90 days.
Overview
This PR builds upon PR #1022 and aims to provide a standard way to generate schemas for providers that publish GeoJSON data. It uses pydantic models, from which the corresponding JSON schemas can be generated. Pydantic models can also validate incoming data (for transactions) beyond what JSON Schema allows (e.g. closed linear rings for a valid GeoJSON Polygon, invalid geometry checks, ...) through custom validator functions.

Several helper functions are implemented, which can be used in the `get_schema()` method of providers:

`pygeoapi/models/geojson.py`:
- `create_geojson_geometry_model()`: creates a pydantic model for a GeoJSON Geometry
- `create_geojson_feature_model()`: creates a pydantic model for a GeoJSON Feature
- `create_geojson_feature_collection_model()`: creates a pydantic model for a GeoJSON FeatureCollection

NOTE: These helper functions return pydantic models that have default validator functions. These validator functions (defined in `pygeoapi/models/validators.py`) check that all GeoJSON geometries of type `Polygon` have closed linear rings in a GeoJSON `Geometry`/`Feature`/`FeatureCollection`. The validator functions are called when the `model_validate()` method of the output pydantic models is called, and can be overridden with the `field_validators` parameter of the `create_geojson_geometry_model()`, `create_geojson_feature_model()` and `create_geojson_feature_collection_model()` functions.

`pygeoapi/schemas.py`:
- `get_geojson_feature_schema()`: creates the JSON schema for a GeoJSON Feature
- `get_geojson_feature_collection_schema()`: creates the JSON schema for a GeoJSON FeatureCollection

NOTE: These are shorthand functions that directly create JSON schemas generated from pydantic models. They create the appropriate pydantic models by calling one of the functions from `pygeoapi/models/geojson.py`, and call their `model_json_schema()` method. In the schema generation process, default values are removed for the `bbox` and `id` properties.

The following code shows how vector data providers can implement their `get_schema()` method:

Ideally, the `get_fields()` method of providers could be extended so that it returns the list of `GeoJSONProperty` directly, or a custom `get_geojson_properties()` method could be used instead. Either should take care of the `nullable` and `required` parameters.

This PR also opens up for defining a `get_data_model(type_: Literal['Feature', 'FeatureCollection'])` abstract method in the `BaseProvider`, which can be implemented in providers. `get_data_model()` would call either `create_geojson_feature_model()` or `create_geojson_feature_collection_model()` and return the appropriate pydantic model. To support feature transactions, the `model_validate()` method of the pydantic models can be called directly in `pygeoapi.api.manage_collection_item()` to accept or reject incoming data. Validation with pydantic models is more flexible and powerful than validation with JSON schemas, as mentioned above.

Related Issue / Discussion
Additional Information
`geojson-pydantic` was considered. It does not seem to play well with pydantic v2 right now. In addition, dumping the JSON schema of a `geojson_pydantic.Feature` instance results in a valid JSON schema for a GeoJSON Feature in general, which cannot be used if we want, for example, to constrain a specific geometry type (e.g. `Point`) and let end users know which geometry type is expected through the OpenAPI document. If this changes and we are willing to add another dependency to pygeoapi, we may refactor the code to use `geojson-pydantic`.

Contributions and Licensing
(as per https://github.com/geopython/pygeoapi/blob/master/CONTRIBUTING.md#contributions-and-licensing)