Gapfilling fails in step 1 #232

cdiener · 2024-08-28T07:06:27Z

Related to #228, opening a new bug since I can't reopen the old one and this is a new example.

We are still seeing failures in gapfilling for Archaea with version 1.3.1. I attached an example from MGnify here.

Environment

From here.

gapseq version: 1.3.1
Sequence DB md5sum: 17e92c9 (2023-12-12, Bacteria)
Sequence DB md5sum: 8bb9575 (2023-12-12, Archaea)

Commands

We use the gut medium bundled with gapseq. The script we run:

#!/bin/bash -ue
cp /home/isilon/users/o_diener/micom_dbs/recipes/uhgg/data/gut.csv medium.csv
gapseq fill -m MGYG000000522-draft.RDS -n medium.csv -c MGYG000000522-rxnWeights.RDS -b 100 -g MGYG000000522-rxnXgenes.RDS > MGYG000000522.log
gzip MGYG000000522.xml

Which will result in:

INFO [2024-08-27 15:30:20] Set BLAS to 1 thread.
Loading required package: glpkAPI
using GLPK version 5.0
Warning message:
glpkAPI is used but cplexAPI is recommended because it is much faster
Warning in gapfill4(mod.orig = mod.orig, mod.full = mod, rxn.weights = copy(rxn.weights),  :
  Final model MGYG000000522 cannot produce enough target even when all candidate reactions are added! obj=0 lp_stat=5
Wrote file ./MGYG000000522.xml
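
Because the run still writes an XML file, the failure is easy to miss in a pipeline. A minimal sketch for flagging it, assuming the captured output contains the warning line exactly as shown above (the function name and the regular expression are illustrative and not part of gapseq; the wording may differ between versions):

```python
import re

# Pattern derived from the warning line in the log above (an assumption:
# other gapseq versions may phrase the warning differently).
WARNING_RE = re.compile(
    r"cannot produce enough target.*?obj=(?P<obj>[-+0-9.eE]+)\s+lp_stat=(?P<stat>\d+)"
)

def find_gapfill_failures(log_text):
    """Return one (objective, lp_stat) tuple per matching warning line."""
    failures = []
    for match in WARNING_RE.finditer(log_text):
        failures.append((float(match.group("obj")), int(match.group("stat"))))
    return failures

example_log = """\
Warning in gapfill4(mod.orig = mod.orig, mod.full = mod, rxn.weights = copy(rxn.weights),  :
  Final model MGYG000000522 cannot produce enough target even when all candidate reactions are added! obj=0 lp_stat=5
Wrote file ./MGYG000000522.xml
"""

failures = find_gapfill_failures(example_log)
if failures:
    print("gapfilling reported failures:", failures)
```

A non-empty result could be used to mark the model as failed even though an output file exists.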

MRP

See the input files here: uhgg_example.zip


- Option for minimum required growth rate was already part of the "gapseq fill" module.
- However, growth rate was not always achieved, since it was not added as constraint in gapfill step 1.
- refers to #232

Waschina · 2024-08-28T14:35:03Z

Thanks again for posting the issue.

I tried to find the underlying problem but am not entirely there yet. What I found so far:

- The issue is unrelated to #228 ("Gapfilling occasionally fails silently with GLPK") in the sense that here the issue also occurs with CPLEX as the solver.
- In the MRP, an infeasible flux distribution is predicted in the first gapfilling step, where all reactions from the biochemistry database are considered. However, both GLPK and CPLEX returned the solution as feasible. When the metabolic network is reduced to those reactions predicted to carry a non-zero flux, FBA predicts that no biomass formation is possible.
- Unrelated to that, I noticed that we have an option "--min.obj.val" in the gapseq fill module to set a minimum growth rate that gap-filling should achieve; yet it was not enforced, and models with a growth rate below the specified value were commonly returned. I fixed this with commit fb7fbcb. The example with MGYG000000522 now works, but the issue from the previous point remains.
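
The second point above (a solution reported as feasible that is not) can be checked independently of the solver status. The following is a minimal sketch, not gapseq code: it assumes a dense stoichiometric matrix given as a list of rows, and the function name, toy network and tolerance are illustrative.

```python
def check_flux_solution(S, v, lower, upper, tol=1e-6):
    """Check a flux vector against steady state (S v = 0) and flux bounds.

    S is a list of rows (one per metabolite), v/lower/upper are per-reaction lists.
    Returns (max mass-balance residual, max bound violation, is_feasible).
    """
    max_residual = max(
        (abs(sum(s_ij * v_j for s_ij, v_j in zip(row, v))) for row in S),
        default=0.0,
    )
    max_bound_violation = max(
        (max(lo - v_j, v_j - up, 0.0) for v_j, lo, up in zip(v, lower, upper)),
        default=0.0,
    )
    feasible = max_residual <= tol and max_bound_violation <= tol
    return max_residual, max_bound_violation, feasible

# Toy chain: uptake -> A, A -> B, B -> biomass (columns are the three reactions).
S = [
    [1, -1, 0],   # metabolite A
    [0, 1, -1],   # metabolite B
]
lower = [0.0, 0.0, 0.0]
upper = [10.0, 10.0, 10.0]

# A consistent solution: every reaction carries the same flux.
res_good, viol_good, ok_good = check_flux_solution(S, [1.0, 1.0, 1.0], lower, upper)

# A "leaky" solution: small downstream flux without any uptake.
res_bad, viol_bad, ok_bad = check_flux_solution(S, [0.0, 1e-4, 1e-4], lower, upper)
print(ok_good, ok_bad, res_bad)
```

In the leaky case the bounds are respected but metabolite A is consumed without being produced, which is the kind of inconsistency that disappears once the network is reduced to the flux-carrying reactions.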

cdiener · 2024-08-29T08:19:37Z

Do you mean the initial MIP is feasible but then the model with non-zero indicators is infeasible?
We had similar issues with the gapfilling in cobrapy, and what helped was lowering the integrality tolerance, because the default will otherwise sometimes allow flux through reactions whose indicator is zero. That happened to us with GLPK and CPLEX as well. Gurobi and HiGHS seem to be a bit less sensitive there.
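
A small arithmetic sketch of why the integrality tolerance matters in an indicator (big-M) formulation such as `v <= M * y`. The values of `M` and the tolerances are assumptions chosen for illustration, and the helper is hypothetical rather than a cobrapy function:

```python
def max_leak_flux(indicator_value, big_m, int_tol):
    """Largest flux permitted by v <= big_m * y when y is accepted as integer 0.

    Raises ValueError if y is too far from 0 to pass the integrality check.
    """
    if abs(indicator_value) > int_tol:
        raise ValueError("indicator would not be accepted as 0 at this tolerance")
    return big_m * indicator_value

big_m = 1000.0  # assumed flux bound used as big-M

# With a loose tolerance, y = 1e-5 counts as "0" but still permits a visible flux.
leak_loose = max_leak_flux(1e-5, big_m, int_tol=1e-5)

# With a tighter tolerance, the largest accepted y (and hence the leak) shrinks.
leak_tight = max_leak_flux(1e-9, big_m, int_tol=1e-9)

print(leak_loose, leak_tight)
```

The leak scales with `M * int_tol`, so a reaction that is formally switched off can still carry a flux of roughly 0.01 in the loose case, which may be enough to satisfy a small growth requirement.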

Waschina · 2024-08-29T08:46:42Z

> Do you mean the initial MIP is feasible but then the model with non-zero indicators is infeasible?

Yes, except that in gapseq the initial problem is formulated as an LP, not as a MIP (Eq. 1 in the gapseq publication).

> We had similar issues with the gapfilling in cobrapy and what helped is to lower the integrality tolerance because the default will sometimes allow flux through zero indicator reactions otherwise. That happened with GLPK and CPLEX to us as well. Gurobi and HIGHS seem to be a bit less sensitive there.

That is good to know; thank you! I will try to lower tolerance in glpk and cplex.

cdiener · 2024-08-29T08:49:49Z

Oh sorry, never mind then. Integrality tolerance has no effect on LPs, obviously.

Waschina · 2024-08-29T09:36:37Z

But it was a good hint – there is a simplex parameter in GLPK that controls tolerance for bound violations. I guess cplex has this, too. A first quick test with GLPK indicates that this value might need to be reduced for the initial LP. I'll do some further tests.

- Code was slightly modified to make sure that the initial gap-fill pFBA solution is actually a feasible solution.
- In case infeasible solutions are returned, the LP is solved again with lowered bound violation tolerance (refers to #232).

Some further changes:
- During gapfilling, the reported number of added reactions no longer counts added exchange reactions.
- More consistent checking of whether a returned FBA solution is actually feasible.

Waschina · 2024-08-30T13:32:31Z

The gapfilling algorithm was updated on the cobrar branch. The key was indeed to reduce the bound violation tolerance in cases where an optimal solution returned by the initial pFBA was not feasible and then re-run the optimization.
The CPLEX documentation also recommends this procedure:

> You can also lower this tolerance after finding an optimal solution if there is any doubt that the solution is truly optimal.

"cobrar" needs to be updated to the current development version to work with the latest version of gapseq from the cobrar branch.

I will run a larger set of reconstructions with the new updates to see if something unexpected happens.
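
The re-solve strategy described in this comment can be sketched as a generic loop. This is not the gapseq implementation: `solve` and `is_feasible` are hypothetical callables supplied by the caller, the tolerance schedule is illustrative, and the mock solver below only stands in for a real LP solver.

```python
def solve_with_tightening(solve, is_feasible, tolerances=(1e-7, 1e-8, 1e-9)):
    """Call solve(tol) with progressively tighter bound-violation tolerances.

    Returns (solution, tol) for the first solution that passes is_feasible,
    or raises RuntimeError if none of the tolerances yields one.
    """
    for tol in tolerances:
        solution = solve(tol)
        if is_feasible(solution):
            return solution, tol
    raise RuntimeError("no feasible solution found at any tested tolerance")

# Hypothetical stand-in for an LP solver: the returned solution violates
# bounds by exactly the tolerance it was given.
def mock_solve(tol):
    return {"biomass": 0.1, "max_violation": tol}

# Independent check that is stricter than the loosest solver setting.
def mock_is_feasible(solution):
    return solution["max_violation"] <= 1e-8

solution, used_tol = solve_with_tightening(mock_solve, mock_is_feasible)
print(used_tol, solution)
```

Starting from the loosest tolerance keeps the common case fast and only pays for extra solves when the independent check rejects the first answer.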

cdiener · 2024-09-02T14:04:55Z

Okay, that makes sense. When we had similar problems in the CORDA port (a somewhat similar idea with a linearized cost optimization), another gotcha was the threshold for the absolute flux value (`|v| > eps` -> include). But it looks like you are already checking that by removing the reaction and ensuring the objective is maintained.
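
The two ideas mentioned here, thresholding on `|v| > eps` and confirming each candidate by removing it, can be sketched as follows. This is an illustration only and is taken from neither gapseq nor CORDA; the reaction names, `eps`, and the toy objective function are assumptions.

```python
def select_active(fluxes, eps=1e-6):
    """Keep reactions whose absolute flux exceeds eps (the threshold gotcha)."""
    return [rxn for rxn, v in fluxes.items() if abs(v) > eps]

def prune_candidates(candidates, objective_with, min_objective):
    """Drop every candidate whose removal keeps the objective >= min_objective."""
    kept = list(candidates)
    for rxn in list(candidates):
        trial = [r for r in kept if r != rxn]
        if objective_with(trial) >= min_objective:
            kept = trial  # reaction was not needed
    return kept

# Toy data: R3 is numerical noise, R4 carries flux but is not essential.
fluxes = {"R1": 0.5, "R2": -2e-3, "R3": 1e-9, "R4": 0.2}

# Hypothetical objective: growth is only possible if R1 and R2 are both present.
def toy_objective(reactions):
    return 1.0 if {"R1", "R2"} <= set(reactions) else 0.0

active = select_active(fluxes)
essential = prune_candidates(active, toy_objective, min_objective=0.5)
print(active, essential)
```

The removal test makes the result less dependent on the exact choice of `eps`, since a reaction is only kept if the objective drops without it.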

cdiener · 2024-09-04T07:05:04Z

Just as an FYI: for me, the provided example still fails gapfilling silently (final growth rate of 0), even after updating to the latest master branch.

Waschina · 2024-09-04T13:40:30Z

Yes, the master branch still relies on the sybil packages. Since the fix involved quite some changes in the gapfilling algorithm, I just made the changes to the "cobrar" branch of gapseq, which will be merged into the master branch hopefully soon.

cdiener · 2024-09-06T10:45:34Z

Sorry I meant in relation to:

> Unrelated to that, I noticed that we have an option "--min.obj.val" in the gapseq fill module to set a minimum growth rate that gap-filling should achieve; yet, it was not enforced, and models with a growth rate less than the specified value are commonly returned. I fixed this with commit fb7fbcb. The example with MGYG000000522 now works, but the issue from the previous point remains.

Even with commit fb7f MGYG000000522 did not work for me.

I will rebuild the image with the cobrar branch and test that one a bit more. Thanks for all the work on this!

Waschina self-assigned this Aug 28, 2024

Waschina mentioned this issue Aug 28, 2024

Enforce minimum required growth rate in gapfill step 1 #233

Merged
