Add find_active_reactions.py to find active reactions quicker #841

shjchan · 2019-04-19T06:45:40Z

Add find_active_reactions.py and a test_find_active_reactions.py.
I had some discussion with @ChristianLieven and @Midnighter earlier, and we thought it might be nice to add these functions for finding active reactions (primarily for finding reactions in loops), which solve an MILP or a few LPs instead of doing FVA. In tests on a few different models, they were 3 - 5x faster than cobra.flux_analysis.find_blocked_reactions.py. Suggestions for streamlining the code and improving speed would be welcome. I have been using cobrapy from time to time but am not very familiar with its infrastructure. The test may need changes too.

codecov-io · 2019-04-19T15:14:09Z

Codecov Report

Merging #841 into devel will decrease coverage by 0.46%.
The diff coverage is 75.86%.

@@            Coverage Diff             @@
##            devel     #841      +/-   ##
==========================================
- Coverage   84.39%   83.93%   -0.47%     
==========================================
  Files          47       48       +1     
  Lines        4205     4433     +228     
  Branches      978     1044      +66     
==========================================
+ Hits         3549     3721     +172     
- Misses        423      458      +35     
- Partials      233      254      +21

Impacted Files	Coverage Δ
cobra/flux_analysis/helpers.py	`66.66% <100%> (+16.66%)`	⬆️
cobra/flux_analysis/loopless.py	`78.43% <67.1%> (-11.7%)`	⬇️
cobra/flux_analysis/find_active_reactions.py	`79.6% <79.6%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9d1987c...67f2863.

cdiener · 2019-04-19T17:37:27Z

It seems that you describe a full loop-removal procedure in the paper. If that does not have the same drawbacks as CycleFreeFlux it may be worthwhile to switch to yours. Could you use it for loopless FVA as well?

cdiener

Good style, some minor initial comments for now 😄

cdiener · 2019-04-19T17:41:54Z

cobra/flux_analysis/find_active_reactions.py

+
+def find_active_reactions(model, bigM=10000, zero_cutoff=None,
+                          relax_bounds=True, solve="lp", verbose=False):
+    """


Rather than using a verbose flag and print, use the logging module as done in other modules and output the additional messages with INFO or DEBUG. That allows users to set the verbosity globally rather than individually for each function.

Also the docstring is not in the right format. It needs a small description right after the opening """.

cdiener · 2019-04-19T17:43:12Z

cobra/flux_analysis/find_active_reactions.py

+    else:
+        eps = normalize_cutoff(model, zero_cutoff)
+
+    eps = max(eps, model.solver.configuration.tolerances.feasibility * 100)


I think it may still be good to use `model.tolerance` here. Not all tolerances (integrality, feasibility, optimality) are implemented for all solvers (for instance, GLPK has no optimality tolerance; it uses different measures).

cdiener · 2019-04-19T17:45:07Z

cobra/flux_analysis/find_active_reactions.py

+
+
+def find_reactions_in_cycles(model, bigM=10000, zero_cutoff=1e-1,
+                             relax_bounds=True, solve="lp", verbose=False):


Since it is exported it should get a docstring.

cdiener · 2019-04-19T17:45:47Z

cobra/flux_analysis/find_active_reactions.py

+    return active_rxns
+
+
+def find_reactions_in_cycles(model, bigM=10000, zero_cutoff=1e-1,


Why does that zero_cutoff not default to model.tolerance as before?

cdiener · 2019-04-19T17:46:51Z

cobra/test/test_flux_analysis/test_find_active_reactions.py

+    large_model.solver = all_solvers
+    # solve LPs
+    rxns_in_cycles_lp = find_reactions_in_cycles(large_model)
+    # solving MILP or Fast-SNP may not be feasible for some solvers


You can make tests conditional on the available solvers. We can help you with that or you could look through some of the other tests for some examples.

Could you refer me to one of the examples? Have made other changes.

all_solvers is a fixture that will test with all available solvers. Should be good as is.

shjchan · 2019-04-19T21:08:51Z

> It seems that you describe a full loop-removal procedure in the paper. If that does not have the same drawbacks as CycleFreeFlux it may be worthwhile to switch to yours. Could you use it for loopless FVA as well?

From what I understand, CycleFreeFlux is a post-optimization loop-removal procedure, which does not guarantee optimality. I guess that is particularly true for the reactions in loops. The method in my paper guarantees optimality in the same way as the original loop-law constraints, usually with far fewer binary variables, and is thus faster. The functions here cover only the first step of the whole localized-loopless-FVA procedure, i.e., finding a minimal feasible null space. The rest of the implementation will need some significant coding, which I hope I can do some time later.

shjchan · 2019-04-20T04:48:31Z

Any idea why the test is stuck at test_fastcc.py? I remember that when I just made the pull request, it ran through it.

Midnighter

I didn't have the time yet to go through the entire code. Out of curiosity, how is your algorithm affected by (how does it treat) reactions that are unbounded in one direction? Are they then also intended to be bounded by big M?

Midnighter · 2019-04-20T16:08:14Z

cobra/flux_analysis/find_active_reactions.py

+
+    Notes
+    -----
+    The optmization problem solved is as follow:


It'd be really neat to use the Sphinx directives so that this is rendered as proper math. See here.

shjchan · 2019-04-20T22:41:56Z

> I didn't have the time yet to go through the entire code. Out of curiosity, how is your algorithm affected by (how does it treat) reactions that are unbounded in one direction? Are they then also intended to be bounded by big M?

They are bounded by either big M (if relax_bounds=True) or the original bounds (if relax_bounds=False). The additional constraints for irreversible reactions are actually similar to fastcc while for the reversible reactions are a little different.

shjchan · 2019-04-21T05:01:58Z

> It seems that you describe a full loop-removal procedure in the paper. If that does not have the same drawbacks as CycleFreeFlux it may be worthwhile to switch to yours. Could you use it for loopless FVA as well?

> From what I understand, CycleFreeFlux is a post-optimization loop-removal procedure, which does not guarantee optimality. I guess that is particularly true for the reactions in loops. The method in my paper guarantees optimality in the same way as the original loop-law constraints, usually with far fewer binary variables, and is thus faster. The functions here cover only the first step of the whole localized-loopless-FVA procedure, i.e., finding a minimal feasible null space. The rest of the implementation will need some significant coding, which I hope I can do some time later.

Good news: Fast-SNP, which I also implemented in this pull request, can already be used to greatly reduce the time for loopless FVA. I reorganized the newly added functions and added a method option to flux_variability_analysis (just a few lines). Tested on the E. coli core model, the results are exactly the same. Tested on some reactions in iJO1366, the time is reduced around 40x (16xx sec --> 40 sec).

cdiener · 2019-04-23T17:33:09Z

> Good news: Fast-SNP, which I also implemented in this pull request, can already be used to greatly reduce the time for loopless FVA. I reorganized the newly added functions and added a method option to flux_variability_analysis (just a few lines). Tested on the E. coli core model, the results are exactly the same. Tested on some reactions in iJO1366, the time is reduced around 40x (16xx sec --> 40 sec).

This sounds great. If I find the time I will give it a go with the problematic example from #698. If it handles that well I think the way to go would be to drop CycleFreeFlux in favor of your method.

shjchan · 2019-04-23T18:39:18Z

@cdiener great. I am pretty sure that it will work fine. By the way, if you have time, could you advise me on what is going wrong in the test? It just froze after test_fastcc. I couldn't figure out why. I don't think I have changed anything related to that, and the run finishes when I run pytest locally.

Midnighter

Still need to read your paper in detail but made some comments on the code already. Looks promising 😃

Midnighter · 2019-04-26T11:36:45Z

cobra/flux_analysis/find_active_reactions.py

+                switch_constrs.append(prob.Constraint(r.flux_expression +
+                                      coeff_neg * z_neg[r], ub=-eps))
+
+        model.add_cons_vars([z for r, z in z_pos.items()])


Suggested change

model.add_cons_vars([z for r, z in z_pos.items()])

model.add_cons_vars([z for z in z_pos.values()])

Midnighter · 2019-04-26T11:37:03Z

cobra/flux_analysis/find_active_reactions.py

+                                      coeff_neg * z_neg[r], ub=-eps))
+
+        model.add_cons_vars([z for r, z in z_pos.items()])
+        model.add_cons_vars([z for r, z in z_neg.items()])


Suggested change

model.add_cons_vars([z for r, z in z_neg.items()])

model.add_cons_vars([z for z in z_neg.values()])

Midnighter · 2019-04-26T11:37:59Z

cobra/flux_analysis/find_active_reactions.py

+        model.add_cons_vars([z for r, z in z_neg.items()])
+        model.add_cons_vars(switch_constrs)
+        model.objective = prob.Objective(Zero, sloppy=True, direction="min")
+        model.objective.set_linear_coefficients({z: 1.0 for r, z in


As above you can use the dict.values method.
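
A minimal sketch of the `dict.values` suggestion with plain dictionaries. The mapping below uses placeholder strings; in the PR the keys are reactions and the values are solver variables.

```python
# Placeholder mapping from reaction IDs to variable names.
z_pos = {"ACKr": "z_pos_ACKr", "PTAr": "z_pos_PTAr"}

# Original style: unpack the key only to discard it.
from_items = [z for r, z in z_pos.items()]

# Suggested style: iterate over the values directly.
from_values = list(z_pos.values())

# The same idea applies when building objective coefficients.
coefficients = {z: 1.0 for z in z_pos.values()}
```

Both lists contain the same elements in the same order, so the change is purely about readability.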

Midnighter · 2019-04-26T11:38:12Z

cobra/flux_analysis/find_active_reactions.py

+        model.objective = prob.Objective(Zero, sloppy=True, direction="min")
+        model.objective.set_linear_coefficients({z: 1.0 for r, z in
+                                                z_pos.items()})
+        model.objective.set_linear_coefficients({z: 1.0 for r, z in


Midnighter · 2019-04-26T11:41:52Z

cobra/flux_analysis/find_active_reactions.py

+    if solve == "milp":
+        try:
+            # ensure bigM*z << eps at integrality tolerance limit
+            eps = max(eps, model.solver.configuration.tolerances.integrality *


Suggested change

eps = max(eps, model.solver.configuration.tolerances.integrality *

eps = max(eps, model.tolerance *

Thanks. But here it is important for eps to be at least 10x larger than the integrality tolerance when solving the MILP. Otherwise the solver could violate integrality to satisfy the constraint, so that some reactions would not necessarily carry flux.
And I was not sure whether it is always defined for every solver (i.e., whether model.solver.configuration.tolerances.integrality is always an attribute), so I added try/except here, hoping that it might help pass the test. But it was probably not the issue and is not needed, correct? Is model.tolerance not less than the integrality tolerance if it is used? Any suggestion? I just want to make sure eps is large enough when solving the MILP and integrality is used.

Ah, I see. We have simplified the tolerance interface in cobrapy. So setting the model.tolerance attempts to set all available tolerances given by the solver. Thus it is also expected that the model.tolerance value is the same one for integrality, feasibility, etc. So yes, using model.tolerance should do the right thing and also not raise any exceptions.
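
The concern about eps can be illustrated with a small arithmetic sketch. It assumes a switch constraint of the form `v + big_m * z >= eps` with binary `z`, which is one reading of the diff and its code comment rather than the PR's exact formulation. The multiplier in the PR is cut off in the hunk shown, so the helper name, the use of `big_m`, and the default factor of 10 are all illustrative assumptions.

```python
def min_safe_eps(eps, tolerance, big_m, factor=10.0):
    """Return an eps that exceeds the slack a near-zero binary can create.

    A solver may accept z = tolerance as "zero". In a constraint such as
    v + big_m * z >= eps this leaves a slack of big_m * tolerance, so eps
    has to be clearly larger than that for z = 0 to imply a nonzero flux v.
    """
    return max(eps, factor * big_m * tolerance)


# With big_m = 1e4 and a tolerance of 1e-7 the slack is 1e-3, so an eps of
# 1e-3 would not be enough and is raised by the (assumed) safety factor.
eps = min_safe_eps(eps=1e-3, tolerance=1e-7, big_m=1e4)
```

If the requested eps is already large enough, the helper returns it unchanged.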

Midnighter · 2019-04-26T11:50:07Z

cobra/flux_analysis/loopless.py

+            {r.reverse_variable: 1.0 for r in model.reactions})
+
+        iter = 0
+        while True:


This is a potentially infinite loop. It'd be safer to introduce a function parameter max_iterations (or similar name) with a default value and then run a loop instead.

Suggested change

while True:

for i in range(max_iterations):
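
A minimal sketch of the bounded-loop pattern. The callable `step` and the helper name are hypothetical; the default of 1000 is only a placeholder, not a value proposed in this PR.

```python
def iterate_until_converged(step, max_iterations=1000):
    """Call ``step`` until it reports convergence or the budget runs out.

    ``step`` is a hypothetical zero-argument callable that returns True
    once the iteration has converged.
    """
    for i in range(max_iterations):
        if step():
            return i + 1  # number of iterations actually used
    # Make the failure explicit instead of looping forever.
    raise RuntimeError(
        "No convergence after {} iterations.".format(max_iterations)
    )
```

Raising at the end keeps the "did not converge" case visible to the caller, which a silent fall-through after the loop would hide.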

Midnighter · 2019-04-26T11:52:16Z

cobra/flux_analysis/loopless.py

+            sol = model.optimize()
+
+            if sol.status == "optimal":
+                x = sol.fluxes.to_numpy()


Maybe the following also works rather than making a copy?

Suggested change

x = sol.fluxes.to_numpy()

x = sol.fluxes.values

Midnighter · 2019-04-26T11:52:36Z

cobra/flux_analysis/loopless.py

+            sol = model.optimize()
+
+            if sol.status == "optimal":
+                y = sol.fluxes.to_numpy()


Same as above.

Midnighter · 2019-04-26T11:53:12Z

cobra/flux_analysis/loopless.py

+            constr_proj.ub = bigM
+            constr_proj.lb = eps
+            constr_proj.ub = None
+            sol = model.optimize()


model.optimize() may raise an OptimizationError, should that be handled?

Midnighter · 2019-04-26T11:59:11Z

cobra/test/test_flux_analysis/test_find_active_reactions.py

+    large_model.solver = all_solvers
+    # solve LPs
+    rxns_in_cycles_lp = find_reactions_in_cycles(large_model)
+    # solving MILP or Fast-SNP may not be feasible for some solvers


all_solvers is a fixture that will test with all available solvers. Should be good as is.

shjchan · 2019-05-01T15:28:30Z

I think I have addressed all comments and have restructured the code so that it is more pythonic.

cdiener · 2019-06-21T17:08:47Z

Sorry for the delay, just saw it and will go through it again. Will probably take me a little since it is a lot of code changes. From a quick look I don't see any changes to flux_variability_analysis as mentioned above and we will need more tests.

cdiener

Sorry for the long delay. Most of the remaining work is making the code compliant with pep8 and the numpy docstring format, as well as rebasing on the latest devel branch. However, the method also does not seem to pass its own tests. The tests should also compare your implementation to the reference implementation (method="original" in add_loopless) for a smaller model.

Some functionality seems to overlap with the already existing fastcc function, in particular the find_active_reactions function. They seem to do very similar things.

However, the in_cycles version looks very useful and it would be great to add it to FVA. Is it guaranteed to remove all cycles from the solution?

I can help out with fixing the style issues if you want but would leave the methods to you if that is all right.

cdiener · 2019-08-04T21:13:38Z

cobra/flux_analysis/loopless.py

+
+
+def fastSNP(model, bigM=1e4, zero_cutoff=None, eps=1e-3, N=None):
+    """


This should follow the numpy docstring format. So it needs a short imperative description after """.
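
For reference, a small function documented in the numpy docstring format, with a short imperative summary on the first line. The function itself is a hypothetical example and not part of this PR.

```python
def clip_flux(value, big_m=1e4):
    """Clip a flux value to the interval [-big_m, big_m].

    Parameters
    ----------
    value : float
        The flux value to clip.
    big_m : float, optional
        The largest magnitude allowed (default 1e4).

    Returns
    -------
    float
        ``value`` limited to the range from ``-big_m`` to ``big_m``.

    """
    return max(-big_m, min(big_m, value))
```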

cdiener · 2019-08-04T21:14:27Z

cobra/flux_analysis/loopless.py

+            x, y = None, None
+            try:
+                sol = model.optimize()
+            except OptimizationError:


OptimizationError has not been imported. Also the handling does not seem to be correct. If there is no feasible solution you should probably raise an error as well.
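
A minimal sketch of failing loudly when no optimal solution is found. To stay runnable without cobrapy it defines a stand-in exception; in cobrapy the class would be imported with `from cobra.exceptions import OptimizationError`. The `optimize` callable and its `(status, fluxes)` return value are hypothetical simplifications of `model.optimize()`.

```python
class OptimizationError(Exception):
    """Stand-in so the sketch runs without cobrapy installed."""


def solve_checked(optimize):
    """Run ``optimize`` and return its fluxes only if the status is optimal.

    ``optimize`` is a hypothetical zero-argument callable returning a
    ``(status, fluxes)`` pair.
    """
    status, fluxes = optimize()
    if status != "optimal":
        # Surface the failure instead of continuing with empty results.
        raise OptimizationError(
            "Solver returned status '{}'.".format(status)
        )
    return fluxes
```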

cdiener · 2019-08-04T21:17:55Z

cobra/flux_analysis/find_active_reactions.py

@@ -0,0 +1,486 @@
+
+"""
+Find all active reactions by solving a single MILP problem


Needs to specify what "active" means. There is already a find_blocked_reactions but this module does not seem to do the opposite (identifying any reactions that can carry flux).

cdiener · 2019-08-04T21:18:27Z

cobra/flux_analysis/find_active_reactions.py

+LOGGER = logging.getLogger(__name__)
+
+
+def find_active_reactions(model, bigM=10000, zero_cutoff=None,


Naming might be confusing. Maybe find_loopless_reactions or something.

cdiener · 2019-08-04T21:19:45Z

cobra/flux_analysis/find_active_reactions.py

+
+def find_active_reactions(model, bigM=10000, zero_cutoff=None,
+                          relax_bounds=True, solve="lp", verbose=False):
+    """


Also the docstring is not in the right format. It needs a small description right after the opening """.

cdiener · 2019-08-04T21:22:55Z

cobra/flux_analysis/find_active_reactions.py

+    model.add_cons_vars(constr_min_flux)
+
+    n_lp_solved = 3
+    feas_tol = model.tolerance


Variable is never used.

cdiener · 2019-08-04T21:23:10Z

cobra/flux_analysis/find_active_reactions.py

+    constr_min_flux.lb = -min_flux
+    constr_min_flux.ub = -min_flux
+    constr_min_flux.lb = None
+    feas_tol = model.tolerance


Variable is never used.

cdiener · 2019-08-04T21:25:19Z

cobra/flux_analysis/loopless.py

+            constr_proj.lb = None
+            try:
+                sol = model.optimize()
+            except OptimizationError:


cdiener · 2019-08-04T21:30:00Z

cobra/flux_analysis/loopless.py

@@ -249,3 +270,156 @@ def loopless_fva_iter(model, reaction, solution=False, zero_cutoff=None):
            best = reaction.flux
    model.objective.direction = objective_dir
    return best
+
+
+def fastSNP(model, bigM=1e4, zero_cutoff=None, eps=1e-3, N=None):


I think this should also live in find_active_reactions.py.

cdiener · 2019-08-04T21:51:10Z

cobra/test/test_flux_analysis/test_find_active_reactions.py

+    benchmark(find_reactions_in_cycles, model)
+
+
+def test_find_active_reactions(model, all_solvers):


This test fails for me with all solvers:

    def test_find_reactions_in_cycles(large_model, all_solvers):
        """Test find_reactions_in_cycles."""
        large_model.solver = all_solvers
        # solve LPs
        rxns_in_cycles_lp = find_reactions_in_cycles(large_model)
        rxns_in_cycles = ['ABUTt2pp', 'ACCOAL', 'ACKr', 'ACS', 'ACt2rpp',
                          'ACt4pp', 'ADK1', 'ADK3', 'ADNt2pp', 'ADNt2rpp',
                          'ALATA_L', 'ALAt2pp', 'ALAt2rpp', 'ALAt4pp',
                          'ASPt2pp', 'ASPt2rpp', 'CA2t3pp', 'CAt6pp',
                          'CRNDt2rpp', 'CRNt2rpp', 'CRNt8pp', 'CYTDt2pp',
                          'CYTDt2rpp', 'FOMETRi', 'GLBRAN2', 'GLCP', 'GLCP2',
                          'GLCS1', 'GLCtex', 'GLCtexi', 'GLDBRAN2', 'GLGC',
                          'GLUABUTt7pp', 'GLUt2rpp', 'GLUt4pp', 'GLYCLTt2rpp',
                          'GLYCLTt4pp', 'GLYt2pp', 'GLYt2rpp', 'GLYt4pp',
                          'HPYRI', 'HPYRRx', 'ICHORS', 'ICHORSi', 'INDOLEt2pp',
                          'INDOLEt2rpp', 'INSt2pp', 'INSt2rpp', 'NAt3pp',
                          'NDPK1', 'PPAKr', 'PPCSCT', 'PPKr', 'PPM',
                          'PROt2rpp', 'PROt4pp', 'PRPPS', 'PTA2', 'PTAr',
                          'R15BPK', 'R1PK', 'SERt2rpp', 'SERt4pp', 'SUCOAS',
                          'THFAT', 'THMDt2pp', 'THMDt2rpp', 'THRt2rpp',
                          'THRt4pp', 'TRSARr', 'URAt2pp', 'URAt2rpp',
                          'URIt2pp', 'URIt2rpp', 'VALTA', 'VPAMTr']
>       assert set(rxns_in_cycles_lp) == set(rxns_in_cycles)
E       AssertionError: assert {'ABUTt2pp', ...'ACt4pp', ...} == {'ABUTt2pp', '...'ACt4pp', ...}
E         Extra items in the left set:
E         'ICHORS_copy1'
E         'INSt2pp_copy2'
E         'ALAt2pp_copy1'
E         'ICHORS_copy2'
E         'GLYt2pp_copy2'
E         'GLCtex_copy2'...
E
E         ...Full output truncated (43 lines hidden), use '-vv' to show
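
When a set comparison like this fails, listing both differences separately can make the cause easier to spot (the extra IDs shown in the output carry a `_copy` suffix). A minimal sketch; `describe_mismatch` is a hypothetical helper, not part of cobrapy or this PR, and the inputs are a few IDs taken from the output above.

```python
def describe_mismatch(found, expected):
    """Return the IDs that are unexpected and the IDs that are missing."""
    found, expected = set(found), set(expected)
    return sorted(found - expected), sorted(expected - found)


# Tiny illustration with a handful of reaction IDs.
extra, missing = describe_mismatch(
    ["ICHORS", "ICHORS_copy1", "ICHORS_copy2"],
    ["ICHORS", "ICHORSi"],
)
```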

cdiener requested changes Apr 19, 2019

cdiener added the WIP work in progress label Apr 19, 2019

Midnighter reviewed Apr 20, 2019

Midnighter requested changes Apr 26, 2019

shjchan added 17 commits May 1, 2019 02:13

Add find_active_reactions.py and test

2051105

Simplify test

b2b30f5

Small fix

2ec5247

Use logger instead of print

a964fde

Change zero_cutoff

f8f8b1d

Add documentation for find_reactions_in_cycles

c4a83ec

Fix math in doc, reorganize files and add loopless FVA options

1a6beef

Add doc for new options in add_loopless

bf4a04c

Fix small bugs, update docs and __init__.py

6b0526a

Use np.ndarray instead matrix

7b6ac4d

Update test_find_active_reactions.py

f1d5983

Add exception for OptimizationError

a02539f

Make the program flow more pythonic

315e286

Handle unexpected zero optimal solutions

0e21250

Handle empty nullspace

ee03892

Test also solving MILP and Fast-SNP

410cc68

Fix how tolerance is used

900f132

shjchan force-pushed the implement_find_act_rxns branch from 3ef066f to 900f132 Compare May 1, 2019 08:14

shjchan closed this May 1, 2019

shjchan reopened this May 1, 2019

Fix formatting

67f2863

cdiener self-assigned this Jun 21, 2019

cdiener added ready Finished PR that requires review and merge. and removed WIP work in progress labels Jun 21, 2019

cdiener requested changes Aug 4, 2019

cdiener removed the ready Finished PR that requires review and merge. label Aug 4, 2019

cdiener added the WIP work in progress label Mar 13, 2020

Midnighter added the stale The issue or pull request lacks activity. label Jul 16, 2020


