Phylogenetic and Functional Assessment of Orthologs Inference Projects and Methods

Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been...

Bibliographic Details
Main Authors: Altenhoff, Adrian M., Dessimoz, Christophe
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612752/
https://www.ncbi.nlm.nih.gov/pubmed/19148271
http://dx.doi.org/10.1371/journal.pcbi.1000262
author Altenhoff, Adrian M.
Dessimoz, Christophe
collection PubMed
description Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. 
Second, it introduces new methodology to verify orthology. And third, it sets performance standards for current and future approaches.
format Text
id pubmed-2612752
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2612752 2009-01-16 Phylogenetic and Functional Assessment of Orthologs Inference Projects and Methods Altenhoff, Adrian M. Dessimoz, Christophe PLoS Comput Biol Research Article Public Library of Science 2009-01-16 /pmc/articles/PMC2612752/ /pubmed/19148271 http://dx.doi.org/10.1371/journal.pcbi.1000262 Text en Altenhoff, Dessimoz. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Phylogenetic and Functional Assessment of Orthologs Inference Projects and Methods
topic Research Article