Co-evolutionary Models for Reconstructing Ancestral Genomic Sequences: Computational Issues and Biological Examples

The inference of ancestral genomes is a fundamental problem in molecular evolution. Due to the statistical nature of this problem, the most likely or the most parsimonious ancestral genomes usually include considerable error rates. In general, these errors cannot be abolished by utilizing more exhau...


Bibliographic Details
Main authors: Tuller, Tamir; Birin, Hadas; Kupiec, Martin; Ruppin, Eytan
Format: Online Article Text
Language: English
Published: 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7120581/
http://dx.doi.org/10.1007/978-3-642-04744-2_14
author Tuller, Tamir
Birin, Hadas
Kupiec, Martin
Ruppin, Eytan
collection PubMed
description The inference of ancestral genomes is a fundamental problem in molecular evolution. Due to the statistical nature of this problem, the most likely or the most parsimonious ancestral genomes usually include considerable error rates. In general, these errors cannot be abolished by utilizing more exhaustive computational approaches, by using longer genomic sequences, or by analyzing more taxa. In recent studies we showed that co-evolution is an important force that can be used for significantly improving the inference of ancestral genome content. In this work we formally define a computational problem for the inference of ancestral genome content by co-evolution. We show that this problem is NP-hard and present both a Fixed Parameter Tractable (FPT) algorithm, and heuristic approximation algorithms for solving it. The running time of these algorithms on simulated inputs with hundreds of protein families and hundreds of co-evolutionary relations was fast (up to four minutes) and it achieved an approximation ratio < 1.3. We use our approach to study the ancestral genome content of the Fungi. To this end, we implement our approach on a dataset of 33,931 protein families and 20,317 co-evolutionary relations. Our algorithm added and removed hundreds of proteins from the ancestral genomes inferred by maximum likelihood (ML) or maximum parsimony (MP) while slightly affecting the likelihood/parsimony score of the results. A biological analysis revealed various pieces of evidence that support the biological plausibility of the new solutions.
format Online
Article
Text
id pubmed-7120581
institution National Center for Biotechnology Information
language English
publishDate 2009
record_format MEDLINE/PubMed
spelling pubmed-7120581 2020-04-06 Comparative Genomics Article
2009 /pmc/articles/PMC7120581/ http://dx.doi.org/10.1007/978-3-642-04744-2_14 Text en © Springer-Verlag Berlin Heidelberg 2009 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Co-evolutionary Models for Reconstructing Ancestral Genomic Sequences: Computational Issues and Biological Examples
topic Article