Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistent parsing of subsystems #803

Open
mihai-sysbio opened this issue Dec 6, 2023 · 9 comments
Open

inconsistent parsing of subsystems #803

mihai-sysbio opened this issue Dec 6, 2023 · 9 comments
Assignees
Labels
feature New feature/functionality minor Easy fix with low priority, likely just cosmetic quality improves maintainability and code clarity
Milestone

Comments

@mihai-sysbio
Copy link

Minimal code example to reproduce the problem

It looks like for some file formats the subsystem field might not be parsed. I've spotted this from two different models in SBML format, so I checked with the core E.coli.

using COBREXA
!isfile("e_coli_core.mat") && download("http://bigg.ucsd.edu/static/models/e_coli_core.mat", "e_coli_core.mat");
!isfile("e_coli_core.json") && download("http://bigg.ucsd.edu/static/models/e_coli_core.json", "e_coli_core.json");
!isfile("e_coli_core.xml") && download("http://bigg.ucsd.edu/static/models/e_coli_core.xml", "e_coli_core.xml");
json = load_model(StandardModel, "e_coli_core.json")
xml = load_model(StandardModel, "e_coli_core.xml")
mat = load_model(StandardModel, "e_coli_core.mat")
foreach(model -> println(length(filter(!isnothing, map(r -> reaction_subsystem(model, r), reactions(model))))), [json, xml, mat])

Expected result

I would expect the same number of subsystems present in the model:

95
95
95

Actual behavior

Only the JSON format returns 95 entries, for the others it's 0:

95
0
0
@exaexa
Copy link
Collaborator

exaexa commented Dec 6, 2023

Hi,

good point, this is a problem. Unfortunately I don't see how to fix this for the SBML models (SBML doesn't specify subsystems, maybe there would be a way to parse them from MIRIAM-style annotations or so?). But the .mat format indeed contains the subsystems directly in field subSystem. We will add the functionality.

Extra TODO: mirror the resulting patch to here: https://github.com/COBREXA/MATFBCModels.jl (we're slowly flipping&reorganizing everything over to the separate github org)

Thanks for the report!
-mk

@exaexa exaexa added quality improves maintainability and code clarity feature New feature/functionality minor Easy fix with low priority, likely just cosmetic labels Dec 6, 2023
@exaexa exaexa self-assigned this Dec 6, 2023
@exaexa
Copy link
Collaborator

exaexa commented Dec 6, 2023

@stelmo btw should we add subsystems to AbstractFBC interface? I recall we talked about that and it might be the case that I killed the idea with some argument that I don't recall anymore 😅 likely about the subsystems not being proper annotations and reactions not being able to have multiple subsystems or so.... 😅 💦

EDIT: my emojis don't seem to work, weird.

@mihai-sysbio
Copy link
Author

how to fix this for the SBML models (SBML doesn't specify subsystems, maybe there would be a way to parse them from MIRIAM-style annotations or so?)

While the subsystem concept does not have the same type of encoding as the compartment, it seems it iss captured in the SBML file via <groups:group sboTerm="SBO:0000633".

@mihai-sysbio
Copy link
Author

A bit of a random idea, have you considered introducing tests to see if models from trusted sourced imported in different formats yield the same StandardModel?

@exaexa
Copy link
Collaborator

exaexa commented Dec 6, 2023

how to fix this for the SBML models (SBML doesn't specify subsystems, maybe there would be a way to parse them from MIRIAM-style annotations or so?)

While the subsystem concept does not have the same type of encoding as the compartment, it seems it iss captured in the SBML file via <groups:group sboTerm="SBO:0000633".

Ah nice, is there a known model with this? (I don't see it in the ec core).

A bit of a random idea, have you considered introducing tests to see if models from trusted sourced imported in different formats yield the same StandardModel?

Well, yes, and it unfortunately failed quite hard. Maybe we should retry now with the AbstractFBC interface in place which defines a much less opinionated (and thus more realizable) standard...

@mihai-sysbio
Copy link
Author

Ah nice, is there a known model with this? (I don't see it in the ec core).

I actually found the SBO term by looking for glycolysis in Human-GEM. I would expect this to be present in yeast-GEM too. The SBO number didn't mean much to me, but seeing that it's about "subsystem group", things start to make sense.

@exaexa
Copy link
Collaborator

exaexa commented Dec 6, 2023

Uuuh good I see it in YeastGEM now. That sounds very vital, thanks!

For the SBML import/export, can we assume that

  • only the groups with the SBO 633 are subsystems? (I'd say we can safely ignore the others)
  • there aren't going to be overlaps of groups? (I'd say no here, eventually someone will do this. :D )

@exaexa exaexa added this to the v2.0.x milestone Dec 9, 2023
@exaexa
Copy link
Collaborator

exaexa commented Dec 11, 2023

tracking:
SBML.jl support here: LCSB-BioCore/SBML.jl#260

@mihai-sysbio
Copy link
Author

For the SBML import/export, can we assume that

* only the groups with the SBO 633 are subsystems? (I'd say we can safely ignore the others)

Yes, this is the SBO term for subsystem https://identifiers.org/SBO:0000633

* there aren't going to be overlaps of groups? (I'd say no here, eventually someone will do this. :D )

Yeast-GEM has shifted from one-many to one-one when it comes to the reaction-subsystem annotation SysBioChalmers/yeast-GEM#307

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature/functionality minor Easy fix with low priority, likely just cosmetic quality improves maintainability and code clarity
Projects
None yet
Development

No branches or pull requests

2 participants