Cargando…

Finding orthologous gene blocks in bacteria: the computational hardness of the problem and novel methods to address it

MOTIVATION: The evolution of complexity is one of the most fascinating and challenging problems in modern biology, and tracing the evolution of complex traits is an open problem. In bacteria, operons and gene blocks provide a model of tractable evolutionary complexity at the genomic level. Gene bloc...

Descripción completa

Detalles Bibliográficos
Autores principales: Nguyen, Huy N, Markin, Alexey, Friedberg, Iddo, Eulenstein, Oliver
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Oxford University Press 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7773486/
https://www.ncbi.nlm.nih.gov/pubmed/33381825
http://dx.doi.org/10.1093/bioinformatics/btaa794
Descripción
Sumario:MOTIVATION: The evolution of complexity is one of the most fascinating and challenging problems in modern biology, and tracing the evolution of complex traits is an open problem. In bacteria, operons and gene blocks provide a model of tractable evolutionary complexity at the genomic level. Gene blocks are structures of co-located genes with related functions, and operons are gene blocks whose genes are co-transcribed on a single mRNA molecule. The genes in operons and gene blocks typically work together in the same system or molecular complex. Previously, we proposed a method that explains the evolution of orthologous gene blocks (orthoblocks) as a combination of a small set of events that take place in vertical evolution from common ancestors. A heuristic method was proposed to solve this problem. However, no study was done to identify the complexity of the problem. RESULTS: Here, we establish that finding the homologous gene block problem is NP-hard and APX-hard. We have developed a greedy algorithm that runs in polynomial time and guarantees an [Formula: see text] approximation. In addition, we formalize our problem as an integer linear program problem and solve it using the PuLP package and the standard CPLEX algorithm. Our exploration of several candidate operons reveals that our new method provides more optimal results than the results from the heuristic approach, and is significantly faster. AVAILABILITY AND IMPLEMENTATION: The software and data accompanying this paper are available under the GPLv3 and CC0 license respectively on: https://github.com/nguyenngochuy91/Relevant-Operon.