Cargando…

Efficient Minimum Flow Decomposition via Integer Linear Programming

Minimum flow decomposition (MFD) is an NP-hard problem asking to decompose a network flow into a minimum set of paths (together with associated weights). Variants of it are powerful models in multiassembly problems in Bioinformatics, such as RNA assembly. Owing to its hardness, practical multiassemb...

Descripción completa

Detalles Bibliográficos
Autores principales: Dias, Fernando H.C., Williams, Lucia, Mumey, Brendan, Tomescu, Alexandru I.
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Mary Ann Liebert, Inc., publishers 2022
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9700332/
https://www.ncbi.nlm.nih.gov/pubmed/36260412
http://dx.doi.org/10.1089/cmb.2022.0257
_version_ 1784839288153702400
author Dias, Fernando H.C.
Williams, Lucia
Mumey, Brendan
Tomescu, Alexandru I.
author_facet Dias, Fernando H.C.
Williams, Lucia
Mumey, Brendan
Tomescu, Alexandru I.
author_sort Dias, Fernando H.C.
collection PubMed
description Minimum flow decomposition (MFD) is an NP-hard problem asking to decompose a network flow into a minimum set of paths (together with associated weights). Variants of it are powerful models in multiassembly problems in Bioinformatics, such as RNA assembly. Owing to its hardness, practical multiassembly tools either use heuristics or solve simpler, polynomial time-solvable versions of the problem, which may yield solutions that are not minimal or do not perfectly decompose the flow. Here, we provide the first fast and exact solver for MFD on acyclic flow networks, based on Integer Linear Programming (ILP). Key to our approach is an encoding of all the exponentially many solution paths using only a quadratic number of variables. We also extend our ILP formulation to many practical variants, such as incorporating longer or paired-end reads, or minimizing flow errors. On both simulated and real-flow splicing graphs, our approach solves any instance in <13 seconds. We hope that our formulations can lie at the core of future practical RNA assembly tools. Our implementations are freely available on Github.
format Online
Article
Text
id pubmed-9700332
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Mary Ann Liebert, Inc., publishers
record_format MEDLINE/PubMed
spelling pubmed-97003322023-11-01 Efficient Minimum Flow Decomposition via Integer Linear Programming Dias, Fernando H.C. Williams, Lucia Mumey, Brendan Tomescu, Alexandru I. J Comput Biol Research Articles Minimum flow decomposition (MFD) is an NP-hard problem asking to decompose a network flow into a minimum set of paths (together with associated weights). Variants of it are powerful models in multiassembly problems in Bioinformatics, such as RNA assembly. Owing to its hardness, practical multiassembly tools either use heuristics or solve simpler, polynomial time-solvable versions of the problem, which may yield solutions that are not minimal or do not perfectly decompose the flow. Here, we provide the first fast and exact solver for MFD on acyclic flow networks, based on Integer Linear Programming (ILP). Key to our approach is an encoding of all the exponentially many solution paths using only a quadratic number of variables. We also extend our ILP formulation to many practical variants, such as incorporating longer or paired-end reads, or minimizing flow errors. On both simulated and real-flow splicing graphs, our approach solves any instance in <13 seconds. We hope that our formulations can lie at the core of future practical RNA assembly tools. Our implementations are freely available on Github. Mary Ann Liebert, Inc., publishers 2022-11-01 2022-11-08 /pmc/articles/PMC9700332/ /pubmed/36260412 http://dx.doi.org/10.1089/cmb.2022.0257 Text en © Fernando H.C. Dias, et al., 2022. Published by Mary Ann Liebert, Inc. https://creativecommons.org/licenses/by/4.0/This Open Access article is distributed under the terms of the Creative Commons License (http://creativecommons.org/licenses/by/4.0 (https://creativecommons.org/licenses/by/4.0/) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.
spellingShingle Research Articles
Dias, Fernando H.C.
Williams, Lucia
Mumey, Brendan
Tomescu, Alexandru I.
Efficient Minimum Flow Decomposition via Integer Linear Programming
title Efficient Minimum Flow Decomposition via Integer Linear Programming
title_full Efficient Minimum Flow Decomposition via Integer Linear Programming
title_fullStr Efficient Minimum Flow Decomposition via Integer Linear Programming
title_full_unstemmed Efficient Minimum Flow Decomposition via Integer Linear Programming
title_short Efficient Minimum Flow Decomposition via Integer Linear Programming
title_sort efficient minimum flow decomposition via integer linear programming
topic Research Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9700332/
https://www.ncbi.nlm.nih.gov/pubmed/36260412
http://dx.doi.org/10.1089/cmb.2022.0257
work_keys_str_mv AT diasfernandohc efficientminimumflowdecompositionviaintegerlinearprogramming
AT williamslucia efficientminimumflowdecompositionviaintegerlinearprogramming
AT mumeybrendan efficientminimumflowdecompositionviaintegerlinearprogramming
AT tomescualexandrui efficientminimumflowdecompositionviaintegerlinearprogramming