Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #134 merge partitions with inconsistent grids amongst partitions #216

Merged
merged 27 commits into from
Feb 14, 2024
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
65c34b7
Start working on issue, deactivate some breaking validation, add some…
JoerivanEngelen Feb 9, 2024
4062710
Merge branch 'main' into issue_134_merge_partitions_1D2D
JoerivanEngelen Feb 12, 2024
92ab064
* Add ``sizes`` property
JoerivanEngelen Feb 12, 2024
9866cb2
Update changelog
JoerivanEngelen Feb 12, 2024
e2bd159
* Support merging partitions with different grids per partition
JoerivanEngelen Feb 12, 2024
ba6e259
Format
JoerivanEngelen Feb 12, 2024
151f416
Update changelog
JoerivanEngelen Feb 12, 2024
68b2d94
Add test
JoerivanEngelen Feb 12, 2024
8ab89ad
Support merging partitions with inconsistent grids
JoerivanEngelen Feb 13, 2024
30d12b9
Fix comments and drop mesh1d_nEdges in paritition as well.
JoerivanEngelen Feb 13, 2024
5815063
Fix validation
JoerivanEngelen Feb 13, 2024
147f3d4
Add test
JoerivanEngelen Feb 13, 2024
edfa12b
Add validation if vars in all data objects and ensure all other_vars …
JoerivanEngelen Feb 13, 2024
1570561
Add extra tests and adapt some tests
JoerivanEngelen Feb 13, 2024
92fd23f
format
JoerivanEngelen Feb 13, 2024
0c85d0b
Update changelog
JoerivanEngelen Feb 13, 2024
22d93a2
Rename max_connectivity_dimensions to max_connectivity_sizes and let …
JoerivanEngelen Feb 14, 2024
8c465bc
Remove duplicate message
JoerivanEngelen Feb 14, 2024
694c31c
Add type annotations and add return None
JoerivanEngelen Feb 14, 2024
c925667
format
JoerivanEngelen Feb 14, 2024
0eb60f4
type annotate separate_variables and fix comment
JoerivanEngelen Feb 14, 2024
96f814b
Add type annotation
JoerivanEngelen Feb 14, 2024
d9b3494
Remove useless filter loop
JoerivanEngelen Feb 14, 2024
4b262e4
Type annotate merge_partitions
JoerivanEngelen Feb 14, 2024
365ee19
Simplify logic with reviewer's suggestions
JoerivanEngelen Feb 14, 2024
20181c3
format
JoerivanEngelen Feb 14, 2024
a96d298
Fix typo in docstring
JoerivanEngelen Feb 14, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 9 additions & 0 deletions docs/changelog.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ Fixed
copies).
- Fixed bug in :meth:`xugrid.Ugrid1d.merge_partitions`, which caused
``ValueError: indexes must be provided for attrs``.
- :func:`xugrid.merge_partitions` merges partitions with grids not contained in
other partitions.
JoerivanEngelen marked this conversation as resolved.
Show resolved Hide resolved

Added
~~~~~
Expand All @@ -31,12 +33,19 @@ Added
UGRID topologies from "intervals": the (M + 1, N + 1) vertex coordinates for N faces.
- :meth:`xugrid.UgridDataArrayAccessor.from_structured` now takes ``x`` and ``y``
arguments to specify which coordinates to use as the UGRID x and y coordinates.
- :attr:`xugrid.UgridDataset.sizes` as an alternative to :attr:`xugrid.UgridDataset.dimensions`
JoerivanEngelen marked this conversation as resolved.
Show resolved Hide resolved
- :attr:`xugrid.Ugrid2d.max_face_node_dimension` which returns the dimension
name designating nodes per face.
- :attr:`xugrid.AbstractUgrid.max_connectivity_dimensions` which returns all
JoerivanEngelen marked this conversation as resolved.
Show resolved Hide resolved
maximum connectivity dimensions and their corresponding size.

Changed
~~~~~~~

- :meth:`xugrid.Ugrid2d.from_structured` now takes ``x`` and ``y`` arguments instead
of ``x_bounds`` and ``y_bounds`` arguments.
- :func:`xugrid.merge_partitions` allows merging partitions with different grids
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, this is added under fixed and changed? Maybe just mention it only for "fixed"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit in doubt:

I added it to changed, because it is a change of behavior: Prior to this changeset, we test explicitly if all grids occur in every partitions and throw a error if this is not the case. Therefore this changeset doesn't really "fix" a bug, it alters behavior.

On the other hand, the changeset brings xugrid closer to supporting UGRID conventions, so if the premise of xugrid is to support every UGRID file (at least: those without mistakes), you can argue this is a "fix".

For now, I just kept the text under "changed".

per partition.

[0.8.1] 2024-01-19
------------------
Expand Down
75 changes: 60 additions & 15 deletions tests/test_partitioning.py
Original file line number Diff line number Diff line change
Expand Up @@ -139,12 +139,15 @@ def test_merge_partitions__errors(self):

grid1 = partitions[1].ugrid.grid
partitions[1]["extra"] = (grid1.face_dimension, np.ones(grid1.n_face))
with pytest.raises(ValueError, match="These variables are present"):
with pytest.raises(
ValueError,
match="'extra' does not occur not in all partitions with 'mesh2d'",
):
pt.merge_partitions(partitions)

partitions = self.uds.ugrid.partition(n_part=2)
partitions[1]["face_z"] = partitions[1]["face_z"].expand_dims("layer", axis=0)
with pytest.raises(ValueError, match="Dimensions for face_z do not match"):
with pytest.raises(ValueError, match="Dimensions for 'face_z' do not match"):
pt.merge_partitions(partitions)

uds = self.uds.copy()
Expand Down Expand Up @@ -188,13 +191,29 @@ def test_merge_partitions(self):
merged = pt.merge_partitions(self.datasets)
assert isinstance(merged, xu.UgridDataset)
assert len(merged.ugrid.grids) == 2
# In case of non-UGRID data, it should default to the first partition:
assert merged["c"] == 0
# In case of non-UGRID data, it should default to the last partition:
assert merged["c"] == 1

def test_merge_partitions__errors(self):
assert len(merged["first_nFaces"]) == 6
assert len(merged["second_nFaces"]) == 20

def test_merge_partitions__unique_grid_per_partition(self):
pa = self.datasets[0][["a"]]
pb = self.datasets[1][["b"]]
with pytest.raises(ValueError, match="Expected 2 UGRID topologies"):
merged = pt.merge_partitions([pa, pb])

assert isinstance(merged, xu.UgridDataset)
assert len(merged.ugrid.grids) == 2

assert len(merged["first_nFaces"]) == 3
assert len(merged["second_nFaces"]) == 10

def test_merge_partitions__errors(self):
pa = self.datasets[0][["a"]] * xr.DataArray([1.0, 1.0], dims=("error_dim",))
pb = self.datasets[1][["a"]]
with pytest.raises(
ValueError, match="Dimensions for 'a' do not match across partitions: "
):
pt.merge_partitions([pa, pb])

grid_a = self.datasets[1].ugrid.grids[0].copy()
Expand Down Expand Up @@ -248,7 +267,7 @@ def setup(self):

ds_expected = xu.UgridDataset(grids=[grid])
ds_expected["a"] = ((grid.edge_dimension), np.concatenate(values_parts))
ds_expected["c"] = 0
ds_expected["c"] = 1
# Assign coordinates also added during merge_partitions
coords = {grid.edge_dimension: np.arange(grid.n_edge)}
ds_expected = ds_expected.assign_coords(**coords)
Expand All @@ -260,8 +279,9 @@ def test_merge_partitions(self):
merged = pt.merge_partitions(self.datasets_partitioned)
assert isinstance(merged, xu.UgridDataset)
assert len(merged.ugrid.grids) == 1
# In case of non-UGRID data, it should default to the first partition:
assert merged["c"] == 0
# In case of non-UGRID data, it should default to the last partition of
# the grid that's checked last.
assert merged["c"] == 1

assert self.dataset_expected.ugrid.grid.equals(merged.ugrid.grid)
assert self.dataset_expected["a"].equals(merged["a"])
Expand Down Expand Up @@ -289,16 +309,22 @@ def setup(self):
ds["a"] = ((part_a.face_dimension), values_a)
ds["b"] = ((part_b.edge_dimension), values_b)
ds["c"] = i
datasets_parts.append(ds)

coords = {
part_a.face_dimension: values_a,
part_b.edge_dimension: values_b,
}

datasets_parts.append(ds.assign_coords(**coords))

ds_expected = xu.UgridDataset(grids=[grid_a, grid_b])
ds_expected["a"] = ((grid_a.face_dimension), np.concatenate(values_parts_a))
ds_expected["b"] = ((grid_b.edge_dimension), np.concatenate(values_parts_b))
ds_expected["c"] = 0
ds_expected["c"] = 1
# Assign coordinates also added during merge_partitions
coords = {
grid_a.face_dimension: np.arange(grid_a.n_face),
grid_b.edge_dimension: np.arange(grid_b.n_edge),
grid_a.face_dimension: np.concatenate(values_parts_a),
grid_b.edge_dimension: np.concatenate(values_parts_b),
}
ds_expected = ds_expected.assign_coords(**coords)

Expand All @@ -309,7 +335,26 @@ def test_merge_partitions(self):
merged = pt.merge_partitions(self.datasets_parts)
assert isinstance(merged, xu.UgridDataset)
assert len(merged.ugrid.grids) == 2
# In case of non-UGRID data, it should default to the first partition:
assert merged["c"] == 0
# In case of non-UGRID data, it should default to the last partition of
# the grid that's checked last.
assert merged["c"] == 1

assert self.dataset_expected.equals(merged)

def test_merge_partitions__inconsistent_grid_types(self):
self.datasets_parts[0] = self.datasets_parts[0].drop_vars(
["b", "mesh1d_nEdges"]
)
b = self.dataset_expected["b"].isel(mesh1d_nEdges=[0, 1, 2])
self.dataset_expected = self.dataset_expected.drop_vars(["b", "mesh1d_nEdges"])
self.dataset_expected["b"] = b
self.dataset_expected["c"] = 1

merged = pt.merge_partitions(self.datasets_parts)
assert isinstance(merged, xu.UgridDataset)
assert len(merged.ugrid.grids) == 2
# In case of non-UGRID data, it should default to the last partition of
# the grid that's checked last.
assert merged["c"] == 1

assert self.dataset_expected.equals(merged)
Loading
Loading