Cargando…

Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it

MOTIVATION: The evolution of complexity is one of the most fascinating and challenging problems in modern biology, and tracing the evolution of complex traits is an open problem. In bacteria, operons and gene blocks provide a model of tractable evolutionary complexity at the genomic level. Gene bloc...

Descripción completa

Detalles Bibliográficos
Autores principales: Nguyen, Huy N, Markin, Alexey, Friedberg, Iddo, Eulenstein, Oliver
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773486/
https://www.ncbi.nlm.nih.gov/pubmed/33381825
http://dx.doi.org/10.1093/bioinformatics/btaa794
_version_ 1783630055480492032
author Nguyen, Huy N
Markin, Alexey
Friedberg, Iddo
Eulenstein, Oliver
author_facet Nguyen, Huy N
Markin, Alexey
Friedberg, Iddo
Eulenstein, Oliver
author_sort Nguyen, Huy N
collection PubMed
description MOTIVATION: The evolution of complexity is one of the most fascinating and challenging problems in modern biology, and tracing the evolution of complex traits is an open problem. In bacteria, operons and gene blocks provide a model of tractable evolutionary complexity at the genomic level. Gene blocks are structures of co-located genes with related functions, and operons are gene blocks whose genes are co-transcribed on a single mRNA molecule. The genes in operons and gene blocks typically work together in the same system or molecular complex. Previously, we proposed a method that explains the evolution of orthologous gene blocks (orthoblocks) as a combination of a small set of events that take place in vertical evolution from common ancestors. A heuristic method was proposed to solve this problem. However, no study was done to identify the complexity of the problem. RESULTS: Here, we establish that finding the homologous gene block problem is NP-hard and APX-hard. We have developed a greedy algorithm that runs in polynomial time and guarantees an [Formula: see text] approximation. In addition, we formalize our problem as an integer linear program problem and solve it using the PuLP package and the standard CPLEX algorithm. Our exploration of several candidate operons reveals that our new method provides more optimal results than the results from the heuristic approach, and is significantly faster. AVAILABILITY AND IMPLEMENTATION: The software and data accompanying this paper are available under the GPLv3 and CC0 license respectively on: https://github.com/nguyenngochuy91/Relevant-Operon.
format Online
Article
Text
id pubmed-7773486
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-77734862021-01-05 Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it Nguyen, Huy N Markin, Alexey Friedberg, Iddo Eulenstein, Oliver Bioinformatics Genomes MOTIVATION: The evolution of complexity is one of the most fascinating and challenging problems in modern biology, and tracing the evolution of complex traits is an open problem. In bacteria, operons and gene blocks provide a model of tractable evolutionary complexity at the genomic level. Gene blocks are structures of co-located genes with related functions, and operons are gene blocks whose genes are co-transcribed on a single mRNA molecule. The genes in operons and gene blocks typically work together in the same system or molecular complex. Previously, we proposed a method that explains the evolution of orthologous gene blocks (orthoblocks) as a combination of a small set of events that take place in vertical evolution from common ancestors. A heuristic method was proposed to solve this problem. However, no study was done to identify the complexity of the problem. RESULTS: Here, we establish that finding the homologous gene block problem is NP-hard and APX-hard. We have developed a greedy algorithm that runs in polynomial time and guarantees an [Formula: see text] approximation. In addition, we formalize our problem as an integer linear program problem and solve it using the PuLP package and the standard CPLEX algorithm. Our exploration of several candidate operons reveals that our new method provides more optimal results than the results from the heuristic approach, and is significantly faster. AVAILABILITY AND IMPLEMENTATION: The software and data accompanying this paper are available under the GPLv3 and CC0 license respectively on: https://github.com/nguyenngochuy91/Relevant-Operon. Oxford University Press 2020-12-29 /pmc/articles/PMC7773486/ /pubmed/33381825 http://dx.doi.org/10.1093/bioinformatics/btaa794 Text en © The Author(s) 2020. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) ), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Genomes
Nguyen, Huy N
Markin, Alexey
Friedberg, Iddo
Eulenstein, Oliver
Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title_full Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title_fullStr Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title_full_unstemmed Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title_short Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
title_sort finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it
topic Genomes
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773486/
https://www.ncbi.nlm.nih.gov/pubmed/33381825
http://dx.doi.org/10.1093/bioinformatics/btaa794
work_keys_str_mv AT nguyenhuyn findingorthologousgeneblocksinbacteriathecomputationalhardnessoftheproblemandnovelmethodstoaddressit
AT markinalexey findingorthologousgeneblocksinbacteriathecomputationalhardnessoftheproblemandnovelmethodstoaddressit
AT friedbergiddo findingorthologousgeneblocksinbacteriathecomputationalhardnessoftheproblemandnovelmethodstoaddressit
AT eulensteinoliver findingorthologousgeneblocksinbacteriathecomputationalhardnessoftheproblemandnovelmethodstoaddressit