Upgrade GEOS, Improve documentation, Add/Rename ST_Extent_Agg #402

Merged: 4 commits, Sep 19, 2024
2 changes: 1 addition & 1 deletion deps/CMakeLists.txt
@@ -177,7 +177,7 @@ set(GDAL_DEPENDENCIES ${GDAL_DEPENDENCIES} EXPAT)
# GEOS
ExternalProject_Add(
GEOS
URL ${CMAKE_CURRENT_SOURCE_DIR}/vendor/geos-3.12.1.tar.bz2
URL ${CMAKE_CURRENT_SOURCE_DIR}/vendor/geos-3.13.0.tar.bz2
CONFIGURE_HANDLED_BY_BUILD TRUE
CMAKE_ARGS
# CMake options
Binary file removed deps/vendor/geos-3.12.1.tar.bz2
Binary file not shown.
Binary file added deps/vendor/geos-3.13.0.tar.bz2
Binary file not shown.
60 changes: 46 additions & 14 deletions docs/functions.md
@@ -113,7 +113,8 @@

| Function | Summary |
| --- | --- |
| [`ST_Envelope_Agg`](#st_envelope_agg) | Computes a minimal-bounding-box polygon 'enveloping' the set of input geometries |
| [`ST_Envelope_Agg`](#st_envelope_agg) | Alias for [ST_Extent_Agg](#st_extent_agg). |
| [`ST_Extent_Agg`](#st_extent_agg) | Computes the minimal-bounding-box polygon containing the set of input geometries |
| [`ST_Intersection_Agg`](#st_intersection_agg) | Computes the intersection of a set of geometries |
| [`ST_Union_Agg`](#st_union_agg) | Computes the union of a set of input geometries |

@@ -124,7 +125,7 @@
| [`ST_Drivers`](#st_drivers) | Returns the list of supported GDAL drivers and file formats |
| [`ST_Read`](#st_read) | Read and import a variety of geospatial file formats using the GDAL library. |
| [`ST_ReadOSM`](#st_readosm) | The `ST_ReadOSM()` table function enables reading compressed OpenStreetMap data directly from a `.osm.pbf` file. |
| [`ST_Read_Meta`](#st_read_meta) | Read and the metadata from a variety of geospatial file formats using the GDAL library. |
| [`ST_Read_Meta`](#st_read_meta) | Read the metadata from a variety of geospatial file formats using the GDAL library. |

----

@@ -342,9 +343,9 @@ Returns the "boundary" of a geometry
#### Signatures

```sql
GEOMETRY ST_Buffer (col0 GEOMETRY, col1 DOUBLE)
GEOMETRY ST_Buffer (col0 GEOMETRY, col1 DOUBLE, col2 INTEGER)
GEOMETRY ST_Buffer (col0 GEOMETRY, col1 DOUBLE, col2 INTEGER, col3 VARCHAR, col4 VARCHAR, col5 DOUBLE)
GEOMETRY ST_Buffer (geom GEOMETRY, distance DOUBLE)
GEOMETRY ST_Buffer (geom GEOMETRY, distance DOUBLE, num_triangles INTEGER)
GEOMETRY ST_Buffer (geom GEOMETRY, distance DOUBLE, num_triangles INTEGER, join_style VARCHAR, cap_style VARCHAR, mitre_limit DOUBLE)
```
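
The renamed signatures above are consistent with one list of parameter names being shared by every overload, each overload using as many names as it has arguments. The following is a minimal sketch of that idea, not the documentation generator itself: `FormatSignature` is a hypothetical helper, and the `colN` fallback is an assumption modelled on the old signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a DuckDB or spatial extension API): renders one
// overload, taking argument names from a shared list and falling back to
// "colN" when no name is registered for that position.
std::string FormatSignature(const std::string &return_type, const std::string &function_name,
                            const std::vector<std::string> &arg_types,
                            const std::vector<std::string> &param_names) {
    std::string out = return_type + " " + function_name + " (";
    for (std::size_t i = 0; i < arg_types.size(); i++) {
        if (i > 0) {
            out += ", ";
        }
        // Prefer the registered name; otherwise use the positional placeholder.
        out += (i < param_names.size()) ? param_names[i] : ("col" + std::to_string(i));
        out += " " + arg_types[i];
    }
    return out + ")";
}
```

With the six `ST_Buffer` names registered in this PR, the two-argument overload renders as `GEOMETRY ST_Buffer (geom GEOMETRY, distance DOUBLE)`; with an empty name list it renders with `col0` and `col1`, as in the old signatures.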

#### Description
@@ -716,7 +717,7 @@ DOUBLE ST_Distance_Spheroid (col0 POINT_2D, col1 POINT_2D)

Returns the distance between two geometries in meters using an ellipsoidal model of the Earth's surface

The input geometry is assumed to be in the [EPSG:4326](https://en.wikipedia.org/wiki/World_Geodetic_System) coordinate system (WGS84), with [latitude, longitude] axis order and the distance is returned in meters. This function uses the [GeographicLib](https://geographiclib.sourceforge.io/) library to solve the [inverse geodesic problem](https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid#Solution_of_the_direct_and_inverse_problems), calculating the distance between two points using an ellipsoidal model of the earth. This is a highly accurate method for calculating the distance between two arbitrary points taking the curvature of the earths surface into account, but is also the slowest.
The input geometry is assumed to be in the [EPSG:4326](https://en.wikipedia.org/wiki/World_Geodetic_System) coordinate system (WGS84), with [latitude, longitude] axis order, and the distance limit is expected to be in meters. This function uses the [GeographicLib](https://geographiclib.sourceforge.io/) library to solve the [inverse geodesic problem](https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid#Solution_of_the_direct_and_inverse_problems), calculating the distance between two points using an ellipsoidal model of the Earth. This is a highly accurate method for calculating the distance between two arbitrary points, taking the curvature of the Earth's surface into account, but it is also the slowest.

#### Example

@@ -1885,12 +1886,12 @@ Returns true if geom1 "touches" geom2
#### Signatures

```sql
BOX_2D ST_Transform (col0 BOX_2D, col1 VARCHAR, col2 VARCHAR)
BOX_2D ST_Transform (col0 BOX_2D, col1 VARCHAR, col2 VARCHAR, col3 BOOLEAN)
POINT_2D ST_Transform (col0 POINT_2D, col1 VARCHAR, col2 VARCHAR)
POINT_2D ST_Transform (col0 POINT_2D, col1 VARCHAR, col2 VARCHAR, col3 BOOLEAN)
GEOMETRY ST_Transform (col0 GEOMETRY, col1 VARCHAR, col2 VARCHAR)
GEOMETRY ST_Transform (col0 GEOMETRY, col1 VARCHAR, col2 VARCHAR, col3 BOOLEAN)
BOX_2D ST_Transform (geom BOX_2D, source_crs VARCHAR, target_crs VARCHAR)
BOX_2D ST_Transform (geom BOX_2D, source_crs VARCHAR, target_crs VARCHAR, always_xy BOOLEAN)
POINT_2D ST_Transform (geom POINT_2D, source_crs VARCHAR, target_crs VARCHAR)
POINT_2D ST_Transform (geom POINT_2D, source_crs VARCHAR, target_crs VARCHAR, always_xy BOOLEAN)
GEOMETRY ST_Transform (geom GEOMETRY, source_crs VARCHAR, target_crs VARCHAR)
GEOMETRY ST_Transform (geom GEOMETRY, source_crs VARCHAR, target_crs VARCHAR, always_xy BOOLEAN)
```

#### Description
@@ -2201,7 +2202,38 @@ GEOMETRY ST_Envelope_Agg (col0 GEOMETRY)

#### Description

Computes a minimal-bounding-box polygon 'enveloping' the set of input geometries
Alias for [ST_Extent_Agg](#st_extent_agg).

Computes the minimal-bounding-box polygon containing the set of input geometries.

#### Example

```sql
SELECT ST_Extent_Agg(geom) FROM UNNEST([ST_Point(1,1), ST_Point(5,5)]) AS _(geom);
-- POLYGON ((1 1, 1 5, 5 5, 5 1, 1 1))
```

----

### ST_Extent_Agg


#### Signature

```sql
GEOMETRY ST_Extent_Agg (col0 GEOMETRY)
```

#### Description

Computes the minimal-bounding-box polygon containing the set of input geometries

#### Example

```sql
SELECT ST_Extent_Agg(geom) FROM UNNEST([ST_Point(1,1), ST_Point(5,5)]) AS _(geom);
-- POLYGON ((1 1, 1 5, 5 5, 5 1, 1 1))
```
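
The aggregation described above can be pictured as a running bounding box that grows with each input and can be merged with other partial boxes. The following is a minimal standalone sketch under stated assumptions: `BoxState`, `Update` and `Combine` are hypothetical names loosely modelled on the `is_set`/`xmin`/`xmax` state visible in this PR, and it handles plain points only, not the extension's geometry type.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the aggregate state; not the extension's type.
struct BoxState {
    bool is_set = false;
    double xmin = 0, xmax = 0, ymin = 0, ymax = 0;
};

// Grow the box so that it also covers the point (x, y).
void Update(BoxState &state, double x, double y) {
    if (!state.is_set) {
        // First input: the box collapses to the point itself.
        state.is_set = true;
        state.xmin = state.xmax = x;
        state.ymin = state.ymax = y;
        return;
    }
    state.xmin = std::min(state.xmin, x);
    state.xmax = std::max(state.xmax, x);
    state.ymin = std::min(state.ymin, y);
    state.ymax = std::max(state.ymax, y);
}

// Merge a partial result into another, as a parallel aggregate would.
void Combine(const BoxState &source, BoxState &target) {
    if (!source.is_set) {
        return; // nothing to merge
    }
    if (!target.is_set) {
        target = source;
        return;
    }
    target.xmin = std::min(target.xmin, source.xmin);
    target.xmax = std::max(target.xmax, source.xmax);
    target.ymin = std::min(target.ymin, source.ymin);
    target.ymax = std::max(target.ymax, source.ymax);
}
```

Feeding the points (1, 1) and (5, 5) through `Update`, in one state or in two states merged with `Combine`, gives x in [1, 5] and y in [1, 5], which are the corners of the polygon in the SQL example.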

----

@@ -2371,7 +2403,7 @@ ST_Read_Meta (col0 VARCHAR[])

#### Description

Read and the metadata from a variety of geospatial file formats using the GDAL library.
Read the metadata from a variety of geospatial file formats using the GDAL library.

The `ST_Read_Meta` table function accompanies the `ST_Read` table function, but instead of reading the contents of a file, this function scans the metadata instead.
Since the data model of the underlying GDAL library is quite flexible, most of the interesting metadata is within the returned `layers` column, which is a somewhat complex nested structure of DuckDB `STRUCT` and `LIST` types.
4 changes: 2 additions & 2 deletions spatial/include/spatial/core/functions/aggregate.hpp
@@ -8,11 +8,11 @@ namespace core {
struct CoreAggregateFunctions {
public:
static void Register(DatabaseInstance &db) {
RegisterStEnvelopeAgg(db);
RegisterStExtentAgg(db);
}

private:
static void RegisterStEnvelopeAgg(DatabaseInstance &db);
static void RegisterStExtentAgg(DatabaseInstance &db);
};

} // namespace core
2 changes: 2 additions & 0 deletions spatial/include/spatial/core/index/rtree/rtree_node.hpp
@@ -184,7 +184,9 @@ struct alignas(RTreeEntry) RTreeNode {
private:
uint32_t count;

public:
// We got 20 bytes for the future
// make this public so compiler stops warning about unused fields
uint8_t _unused[20] = {};
};

2 changes: 2 additions & 0 deletions spatial/include/spatial/doc_util.hpp
@@ -24,6 +24,8 @@ struct DocUtil {
}
AddDocumentation(db, function_name, description, example, tag_map);
}

static void AddFunctionParameterNames(duckdb::DatabaseInstance &db, const char *function_name, duckdb::vector<duckdb::string> names);
};

} // namespace spatial
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
set(EXTENSION_SOURCES
${EXTENSION_SOURCES}
${CMAKE_CURRENT_SOURCE_DIR}/st_envelope_agg.cpp
${CMAKE_CURRENT_SOURCE_DIR}/st_extent_agg.cpp
PARENT_SCOPE
)
Original file line number Diff line number Diff line change
@@ -11,7 +11,7 @@ namespace spatial {

namespace core {

struct EnvelopeAggState {
struct ExtentAggState {
bool is_set;
double xmin;
double xmax;
@@ -22,7 +22,7 @@ struct EnvelopeAggState {
//------------------------------------------------------------------------
// ENVELOPE AGG
//------------------------------------------------------------------------
struct EnvelopeAggFunction {
struct ExtentAggFunction {
template <class STATE>
static void Initialize(STATE &state) {
state.is_set = false;
@@ -90,24 +90,37 @@ struct EnvelopeAggFunction {
//------------------------------------------------------------------------------
static constexpr DocTag DOC_TAGS[] = {{"ext", "spatial"}, {"category", "construction"}};
static constexpr const char *DOC_DESCRIPTION = R"(
Computes a minimal-bounding-box polygon 'enveloping' the set of input geometries
Computes the minimal-bounding-box polygon containing the set of input geometries
)";
static constexpr const char *DOC_EXAMPLE = R"(
SELECT ST_Extent_Agg(geom) FROM UNNEST([ST_Point(1,1), ST_Point(5,5)]) AS _(geom);
-- POLYGON ((1 1, 1 5, 5 5, 5 1, 1 1))
)";

static constexpr const char* DOC_ALIAS_DESCRIPTION = R"(
Alias for [ST_Extent_Agg](#st_extent_agg).

Computes the minimal-bounding-box polygon containing the set of input geometries.
)";

//------------------------------------------------------------------------
// Register
//------------------------------------------------------------------------
void CoreAggregateFunctions::RegisterStEnvelopeAgg(DatabaseInstance &db) {
void CoreAggregateFunctions::RegisterStExtentAgg(DatabaseInstance &db) {

auto function = AggregateFunction::UnaryAggregate<ExtentAggState, geometry_t, geometry_t, ExtentAggFunction>(
GeoTypes::GEOMETRY(), GeoTypes::GEOMETRY());

// Register the function
function.name = "ST_Extent_Agg";
ExtensionUtil::RegisterFunction(db, function);
DocUtil::AddDocumentation(db, "ST_Extent_Agg", DOC_DESCRIPTION, DOC_EXAMPLE, DOC_TAGS);

AggregateFunctionSet st_envelope_agg("ST_Envelope_Agg");
st_envelope_agg.AddFunction(
AggregateFunction::UnaryAggregate<EnvelopeAggState, geometry_t, geometry_t, EnvelopeAggFunction>(
GeoTypes::GEOMETRY(), GeoTypes::GEOMETRY()));
// Also add an alias with the name ST_Envelope_Agg
function.name = "ST_Envelope_Agg";
ExtensionUtil::RegisterFunction(db, function);
DocUtil::AddDocumentation(db, "ST_Envelope_Agg", DOC_ALIAS_DESCRIPTION, DOC_EXAMPLE, DOC_TAGS);

ExtensionUtil::RegisterFunction(db, st_envelope_agg);
DocUtil::AddDocumentation(db, "ST_Envelope_Agg", DOC_DESCRIPTION, DOC_EXAMPLE, DOC_TAGS);
}

} // namespace core
2 changes: 1 addition & 1 deletion spatial/src/spatial/gdal/functions/st_read_meta.cpp
@@ -211,7 +211,7 @@ static void Scan(ClientContext &context, TableFunctionInput &input, DataChunk &o
static constexpr DocTag DOC_TAGS[] = {{"ext", "spatial"}};

static constexpr const char *DOC_DESCRIPTION = R"(
Read and the metadata from a variety of geospatial file formats using the GDAL library.
Read the metadata from a variety of geospatial file formats using the GDAL library.

The `ST_Read_Meta` table function accompanies the `ST_Read` table function, but instead of reading the contents of a file, this function scans the metadata instead.
Since the data model of the underlying GDAL library is quite flexible, most of the interesting metadata is within the returned `layers` column, which is a somewhat complex nested structure of DuckDB `STRUCT` and `LIST` types.
Original file line number Diff line number Diff line change
@@ -38,7 +38,7 @@ static void GeodesicPoint2DFunction(DataChunk &args, ExpressionState &state, Vec
static constexpr const char *DOC_DESCRIPTION = R"(
Returns the distance between two geometries in meters using an ellipsoidal model of the Earth's surface

The input geometry is assumed to be in the [EPSG:4326](https://en.wikipedia.org/wiki/World_Geodetic_System) coordinate system (WGS84), with [latitude, longitude] axis order and the distance is returned in meters. This function uses the [GeographicLib](https://geographiclib.sourceforge.io/) library to solve the [inverse geodesic problem](https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid#Solution_of_the_direct_and_inverse_problems), calculating the distance between two points using an ellipsoidal model of the earth. This is a highly accurate method for calculating the distance between two arbitrary points taking the curvature of the earths surface into account, but is also the slowest.
The input geometry is assumed to be in the [EPSG:4326](https://en.wikipedia.org/wiki/World_Geodetic_System) coordinate system (WGS84), with [latitude, longitude] axis order, and the distance limit is expected to be in meters. This function uses the [GeographicLib](https://geographiclib.sourceforge.io/) library to solve the [inverse geodesic problem](https://en.wikipedia.org/wiki/Geodesics_on_an_ellipsoid#Solution_of_the_direct_and_inverse_problems), calculating the distance between two points using an ellipsoidal model of the Earth. This is a highly accurate method for calculating the distance between two arbitrary points, taking the curvature of the Earth's surface into account, but it is also the slowest.
)";

static constexpr const char *DOC_EXAMPLE = R"(
2 changes: 2 additions & 0 deletions spatial/src/spatial/geos/functions/scalar/st_buffer.cpp
@@ -131,6 +131,8 @@ void GEOSScalarFunctions::RegisterStBuffer(DatabaseInstance &db) {

ExtensionUtil::RegisterFunction(db, set);
DocUtil::AddDocumentation(db, "ST_Buffer", DOC_DESCRIPTION, DOC_EXAMPLE, DOC_TAGS);
DocUtil::AddFunctionParameterNames(db, "ST_Buffer", {"geom", "distance", "num_triangles", "join_style", "cap_style",
"mitre_limit"});
}

} // namespace geos
1 change: 1 addition & 0 deletions spatial/src/spatial/proj/functions.cpp
@@ -511,6 +511,7 @@ void ProjFunctions::Register(DatabaseInstance &db) {

ExtensionUtil::RegisterFunction(db, set);
DocUtil::AddDocumentation(db, "ST_Transform", DOC_DESCRIPTION, DOC_EXAMPLE, DOC_TAGS);
DocUtil::AddFunctionParameterNames(db, "ST_Transform", {"geom", "source_crs", "target_crs", "always_xy"});

GenerateSpatialRefSysTable::Register(db);
}
24 changes: 24 additions & 0 deletions spatial/src/spatial_extension.cpp
@@ -84,6 +84,30 @@ void spatial::DocUtil::AddDocumentation(duckdb::DatabaseInstance &db, const char
}
}

void spatial::DocUtil::AddFunctionParameterNames(duckdb::DatabaseInstance &db, const char *function_name,
duckdb::vector<duckdb::string> names) {
auto &system_catalog = Catalog::GetSystemCatalog(db);
auto data = CatalogTransaction::GetSystemTransaction(db);
auto &schema = system_catalog.GetSchema(data, DEFAULT_SCHEMA);
auto catalog_entry = schema.GetEntry(data, CatalogType::SCALAR_FUNCTION_ENTRY, function_name);
if (!catalog_entry) {
// Try to get an aggregate function
catalog_entry = schema.GetEntry(data, CatalogType::AGGREGATE_FUNCTION_ENTRY, function_name);
if (!catalog_entry) {
// Try to get a table function
catalog_entry = schema.GetEntry(data, CatalogType::TABLE_FUNCTION_ENTRY, function_name);
if (!catalog_entry) {
throw duckdb::InvalidInputException("Function with name \"%s\" not found in DocUtil::AddFunctionParameterNames",
function_name);
}
}
}

auto &func_entry = catalog_entry->Cast<FunctionEntry>();
func_entry.parameter_names = std::move(names);
}
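
The lookup in `AddFunctionParameterNames` tries the scalar, aggregate and table function entries in turn and throws if none matches. That fallback chain can be sketched in isolation as a loop over candidate registries. This is an illustration only: `Entry`, `Registry` and `FindFunction` are hypothetical stand-ins built on `std::map`, not DuckDB catalog types.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a catalog entry; not a DuckDB type.
struct Entry {
    std::vector<std::string> parameter_names;
};
using Registry = std::map<std::string, Entry>;

// Search each registry in order and return the first entry with this name.
Entry &FindFunction(const std::vector<Registry *> &registries, const std::string &name) {
    for (Registry *registry : registries) {
        auto it = registry->find(name);
        if (it != registry->end()) {
            return it->second;
        }
    }
    // No registry knows the name: report it, as the nested version does.
    throw std::invalid_argument("Function with name \"" + name + "\" not found");
}
```

Passing the registries as an ordered list keeps the priority (scalar first, then aggregate, then table) explicit without nesting one `if` per function kind.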


namespace duckdb {

static void LoadInternal(DatabaseInstance &instance) {