Cargando…

Efficient algorithms for reconstructing gene content by co-evolution

BACKGROUND: In a previous study we demonstrated that co-evolutionary information can be utilized for improving the accuracy of ancestral gene content reconstruction. To this end, we defined a new computational problem, the Ancestral Co-Evolutionary (ACE) problem, and developed algorithms for solving...

Descripción completa

Detalles Bibliográficos
Autores principales:	Birin, Hadas, Tuller, Tamir
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	BioMed Central 2011
Materias:	Proceedings
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3283311/ https://www.ncbi.nlm.nih.gov/pubmed/22151715 http://dx.doi.org/10.1186/1471-2105-12-S9-S12

_version_	1782224182513238016
author	Birin, Hadas Tuller, Tamir
author_facet	Birin, Hadas Tuller, Tamir
author_sort	Birin, Hadas
collection	PubMed
description	BACKGROUND: In a previous study we demonstrated that co-evolutionary information can be utilized for improving the accuracy of ancestral gene content reconstruction. To this end, we defined a new computational problem, the Ancestral Co-Evolutionary (ACE) problem, and developed algorithms for solving it. RESULTS: In the current paper we generalize our previous study in various ways. First, we describe new efficient computational approaches for solving the ACE problem. The new approaches are based on reductions to classical methods such as linear programming relaxation, quadratic programming, and min-cut. Second, we report new computational hardness results related to the ACE, including practical cases where it can be solved in polynomial time. Third, we generalize the ACE problem and demonstrate how our approach can be used for inferring parts of the genomes of non-ancestral organisms. To this end, we describe a heuristic for finding the portion of the genome ('dominant set’) that can be used to reconstruct the rest of the genome with the lowest error rate. This heuristic utilizes both evolutionary information and co-evolutionary information. We implemented these algorithms on a large input of the ACE problem (95 unicellular organisms, 4,873 protein families, and 10, 576 of co-evolutionary relations), demonstrating that some of these algorithms can outperform the algorithm used in our previous study. In addition, we show that based on our approach a ’dominant set’ cab be used reconstruct a major fraction of a genome (up to 79%) with relatively low error-rate (e.g. 0.11). We find that the ’dominant set’ tends to include metabolic and regulatory genes, with high evolutionary rate, and low protein abundance and number of protein-protein interactions. CONCLUSIONS: The ACE problem can be efficiently extended for inferring the genomes of organisms that exist today. In addition, it may be solved in polynomial time in many practical cases. Metabolic and regulatory genes were found to be the most important groups of genes necessary for reconstructing gene content of an organism based on other related genomes.
format	Online Article Text
id	pubmed-3283311
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-32833112012-02-22 Efficient algorithms for reconstructing gene content by co-evolution Birin, Hadas Tuller, Tamir BMC Bioinformatics Proceedings BACKGROUND: In a previous study we demonstrated that co-evolutionary information can be utilized for improving the accuracy of ancestral gene content reconstruction. To this end, we defined a new computational problem, the Ancestral Co-Evolutionary (ACE) problem, and developed algorithms for solving it. RESULTS: In the current paper we generalize our previous study in various ways. First, we describe new efficient computational approaches for solving the ACE problem. The new approaches are based on reductions to classical methods such as linear programming relaxation, quadratic programming, and min-cut. Second, we report new computational hardness results related to the ACE, including practical cases where it can be solved in polynomial time. Third, we generalize the ACE problem and demonstrate how our approach can be used for inferring parts of the genomes of non-ancestral organisms. To this end, we describe a heuristic for finding the portion of the genome ('dominant set’) that can be used to reconstruct the rest of the genome with the lowest error rate. This heuristic utilizes both evolutionary information and co-evolutionary information. We implemented these algorithms on a large input of the ACE problem (95 unicellular organisms, 4,873 protein families, and 10, 576 of co-evolutionary relations), demonstrating that some of these algorithms can outperform the algorithm used in our previous study. In addition, we show that based on our approach a ’dominant set’ cab be used reconstruct a major fraction of a genome (up to 79%) with relatively low error-rate (e.g. 0.11). We find that the ’dominant set’ tends to include metabolic and regulatory genes, with high evolutionary rate, and low protein abundance and number of protein-protein interactions. CONCLUSIONS: The ACE problem can be efficiently extended for inferring the genomes of organisms that exist today. In addition, it may be solved in polynomial time in many practical cases. Metabolic and regulatory genes were found to be the most important groups of genes necessary for reconstructing gene content of an organism based on other related genomes. BioMed Central 2011-10-05 /pmc/articles/PMC3283311/ /pubmed/22151715 http://dx.doi.org/10.1186/1471-2105-12-S9-S12 Text en Copyright ©2011 Birin and Tuller; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Proceedings Birin, Hadas Tuller, Tamir Efficient algorithms for reconstructing gene content by co-evolution
title	Efficient algorithms for reconstructing gene content by co-evolution
title_full	Efficient algorithms for reconstructing gene content by co-evolution
title_fullStr	Efficient algorithms for reconstructing gene content by co-evolution
title_full_unstemmed	Efficient algorithms for reconstructing gene content by co-evolution
title_short	Efficient algorithms for reconstructing gene content by co-evolution
title_sort	efficient algorithms for reconstructing gene content by co-evolution
topic	Proceedings
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3283311/ https://www.ncbi.nlm.nih.gov/pubmed/22151715 http://dx.doi.org/10.1186/1471-2105-12-S9-S12
work_keys_str_mv	AT birinhadas efficientalgorithmsforreconstructinggenecontentbycoevolution AT tullertamir efficientalgorithmsforreconstructinggenecontentbycoevolution

Efficient algorithms for reconstructing gene content by co-evolution

Ejemplares similares