Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concat_models #49

Closed
wants to merge 8 commits into from
Closed

concat_models #49

wants to merge 8 commits into from

Conversation

d2hydro
Copy link
Contributor

@d2hydro d2hydro commented Jan 9, 2024

@d2hydro d2hydro requested a review from visr January 9, 2024 11:19
@d2hydro d2hydro self-assigned this Jan 9, 2024
@d2hydro
Copy link
Contributor Author

d2hydro commented Jan 9, 2024

@visr, some things to consider solving in the Model.merge_model method (

# update node_id column if exists
if "node_id" in model.network.node.df.columns:
model.network.node.df["node_id"] = model.network.node.df.index
# fix edge-fid if exists
model.network.edge.df.reset_index(inplace=True, drop=True)
)?

@SouthEndMusic
Copy link

@d2hydro nice that you are already using model merging. However, why do you need the above index modifications? In Ribasim python we don't use those indices anywhere

@d2hydro
Copy link
Contributor Author

d2hydro commented Jan 9, 2024

@d2hydro nice that you are already using model merging. However, why do you need the above index modifications? In Ribasim python we don't use those indices anywhere

@SouthEndMusic, I read and merge a list of models iteratively. Issues arise while writing the merged model to filepath. Two issues arise:

  1. One model has a column 'node_id', the others not. The merged model has has nan-values in the column node_id. When with 'mode.write(filepath)' the validator gives a validation-error on validing the node-id column (due to nan-values)
  2. If 1 is fixed, pyogrio doesn't want to write edges as the index in edges is not unique. That is correct; model.network.edge.df.index.duplicated().any() gives True.

You can reproduce the above behavior with these two models: https://we.tl/t-sscLwd3Vhw

@SouthEndMusic
Copy link

One model has a column 'node_id', the others not. The merged model has has nan-values in the column node_id. When with 'mode.write(filepath)' the validator gives a validation-error on validing the node-id column (due to nan-values)
If 1 is fixed, pyogrio doesn't want to write edges as the index in edges is not unique. That is correct; model.network.edge.df.index.duplicated().any() gives True.

You can reproduce the above behavior with these two models: https://we.tl/t-sscLwd3Vhw

Thanks for the response. I fixed the second point (in an upcoming commit), but I'm unsure about the first. node_id is not a standard column of the Node table, not sure why that doesn't raise validation errors. The error you mentioned is still triggered because the method which extracts node IDs from a table only looks for a node_id column , even if it shouldn't be there.

Since Deltares/Ribasim#794 extra columns are prefixed with meta_, but maybe that does not happen when a model is read into ribasim Python from disk. @evetion?

@SouthEndMusic
Copy link

@d2hydro the first problem will not be fixed in my PR, I made a new issue: Deltares/Ribasim#934.

@evetion
Copy link
Member

evetion commented Jan 11, 2024

One model has a column 'node_id', the others not. The merged model has has nan-values in the column node_id. When with 'mode.write(filepath)' the validator gives a validation-error on validing the node-id column (due to nan-values)

I can't replicate this validation error on writing. Note that with the new PR, the node_id will be renamed meta_node_id automatically.

@d2hydro
Copy link
Contributor Author

d2hydro commented Jan 11, 2024

@SouthEndMusic and @evetion, that is all OK, and I am looking forward to the meta_{colomn_name} implementation. Is it possible to merge #ribasim:914 and release it to a higher ribasim-version asap? I want to update the toml in this branch before merging to main so that the concat-function works.

@evetion
Copy link
Member

evetion commented Jan 11, 2024

@SouthEndMusic and @evetion, that is all OK, and I am looking forward to the meta_{colomn_name} implementation. Is it possible to merge #ribasim:914 and release it to a higher ribasim-version asap? I want to update the toml in this branch before merging to main so that the concat-function works.

We're working on that now, I expect a new release tomorrow.

@SouthEndMusic
Copy link

@d2hydro I can now merge and write your models without problems on the merging_models branch due to the latest fix by @evetion:

model1 = ribasim.Model.read(datadir / "HHNK_Ribasim/HHNK.toml")
model2 = ribasim.Model.read(datadir / "rijkswateren/ribasim.toml")
model1.smart_merge(model2)

model1.write(datadir / "merged/ribasim.toml")

@d2hydro
Copy link
Contributor Author

d2hydro commented Jan 11, 2024

smart_merge

@SouthEndMusic and @evetion, in concat I give every merged model Node and Edge some extra attributes (for now waterbeheerder and zoom_level). I have to make that meta_waterbeheerder and meta_zoom_level in order not to crash:

def add_attributes(model, idx):
if attributes is not None:
for k in attributes.keys():
model.network.node.df[f"meta_{k}"] = attributes[k][idx]
model.network.edge.df[f"meta_{k}"] = attributes[k][idx]
return model

I don't know if you want to fix this, but this will give an Exception:

model1 = ribasim.Model.read(datadir / "HHNK_Ribasim/HHNK.toml")
model1.network.node.df["foo"] = "bar"
model2 = ribasim.Model.read(datadir / "rijkswateren/ribasim.toml")
model2.network.node.df["foo"] = "bar"

model1.smart_merge(model2)

@d2hydro d2hydro closed this Feb 19, 2024
@d2hydro d2hydro deleted the concat_models branch February 19, 2024 07:22
@d2hydro d2hydro removed their assignment Mar 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

concat Ribasim-Models
4 participants