A Phylogeny-Based Benchmarking Test for Orthology Inference Reveals the Limitations of Function-Based Validation

Bibliographic Details
Main authors: Trachana, Kalliopi; Forslund, Kristoffer; Larsson, Tomas; Powell, Sean; Doerks, Tobias; von Mering, Christian; Bork, Peer
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4219706/
https://www.ncbi.nlm.nih.gov/pubmed/25369365
http://dx.doi.org/10.1371/journal.pone.0111122
collection PubMed
description Accurate orthology prediction is crucial for many applications in the post-genomic era. The lack of broadly accepted benchmark tests precludes a comprehensive analysis of orthology inference. So far, functional annotation between orthologs has served as a performance proxy. However, this violates the fundamental principle of orthology as an evolutionary definition, and it is often not applicable because experimental evidence is limited for most species. Therefore, we constructed high-quality "gold standard" orthologous groups that can serve as a benchmark set for orthology inference in bacterial species. Herein, we use this dataset to demonstrate 1) why a manually curated, phylogeny-based dataset is more appropriate for benchmarking orthology than other popular practices and 2) how it guides database design and parameterization through careful error quantification. More specifically, we illustrate how function-based tests often fail to identify false assignments, misjudging the true performance of orthology inference methods. We also examine how our dataset can guide the selection of a "core" species repertoire to improve detection accuracy. We conclude that including more genomes at the proper evolutionary distances can influence the overall quality of orthology detection. The curated gene families, called Reference Orthologous Groups, are publicly available at http://eggnog.embl.de/orthobench2.
format Online
Article
Text
id pubmed-4219706
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4219706 2014-11-12 PLoS One Research Article
Public Library of Science 2014-11-04 Text en © 2014 Trachana et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Phylogeny-Based Benchmarking Test for Orthology Inference Reveals the Limitations of Function-Based Validation
topic Research Article