Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees

Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the bas...

Bibliographic Details
Main authors: Boeckmann, Brigitte; Robinson-Rechavi, Marc; Xenarios, Ioannis; Dessimoz, Christophe
Format: Online Article Text
Language: English
Published: Oxford University Press 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3178055/
https://www.ncbi.nlm.nih.gov/pubmed/21737420
http://dx.doi.org/10.1093/bib/bbr034
author Boeckmann, Brigitte
Robinson-Rechavi, Marc
Xenarios, Ioannis
Dessimoz, Christophe
collection PubMed
description Phylogenomic databases provide orthology predictions for species with fully sequenced genomes. Although the goal seems well-defined, the content of these databases differs greatly. Seven ortholog databases (Ensembl Compara, eggNOG, HOGENOM, InParanoid, OMA, OrthoDB, Panther) were compared on the basis of reference trees. For three well-conserved protein families, we observed a generally high specificity of orthology assignments for these databases. We show that differences in the completeness of predicted gene relationships and in the phylogenetic information are, for the great majority, not due to the methods used, but to differences in the underlying database concepts. According to our metrics, none of the databases provides a fully correct and comprehensive protein classification. Our results provide a framework for meaningful and systematic comparisons of phylogenomic databases. In the future, a sustainable set of ‘Gold standard’ phylogenetic trees could provide a robust method for phylogenomic databases to assess their current quality status, measure changes following new database releases and diagnose improvements subsequent to an upgrade of the analysis procedure.
format Online
Article
Text
id pubmed-3178055
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3178055 2011-09-22 Brief Bioinform Special Issue Papers Oxford University Press 2011-09 2011-07-07 /pmc/articles/PMC3178055/ /pubmed/21737420 http://dx.doi.org/10.1093/bib/bbr034 Text en © The Author(s) 2011. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Conceptual framework and pilot study to benchmark phylogenomic databases based on reference gene trees
topic Special Issue Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3178055/
https://www.ncbi.nlm.nih.gov/pubmed/21737420
http://dx.doi.org/10.1093/bib/bbr034