
Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits

Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current...

Bibliographic Details
Main Authors: Dessimoz, Christophe, Boeckmann, Brigitte, Roth, Alexander C. J., Gonnet, Gaston H.
Format: Text
Language: English
Published: Oxford University Press 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1500873/
https://www.ncbi.nlm.nih.gov/pubmed/16835308
http://dx.doi.org/10.1093/nar/gkl433
_version_ 1782128378065715200
author Dessimoz, Christophe
Boeckmann, Brigitte
Roth, Alexander C. J.
Gonnet, Gaston H.
author_facet Dessimoz, Christophe
Boeckmann, Brigitte
Roth, Alexander C. J.
Gonnet, Gaston H.
author_sort Dessimoz, Christophe
collection PubMed
description Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings.
format Text
id pubmed-1500873
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-15008732006-07-13 Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits Dessimoz, Christophe Boeckmann, Brigitte Roth, Alexander C. J. Gonnet, Gaston H. Nucleic Acids Res Article Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings. Oxford University Press 2006 2006-07-11 /pmc/articles/PMC1500873/ /pubmed/16835308 http://dx.doi.org/10.1093/nar/gkl433 Text en © 2006 The Author(s)
spellingShingle Article
Dessimoz, Christophe
Boeckmann, Brigitte
Roth, Alexander C. J.
Gonnet, Gaston H.
Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title_full Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title_fullStr Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title_full_unstemmed Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title_short Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits
title_sort detecting non-orthology in the cogs database and other approaches grouping orthologs using genome-specific best hits
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1500873/
https://www.ncbi.nlm.nih.gov/pubmed/16835308
http://dx.doi.org/10.1093/nar/gkl433
work_keys_str_mv AT dessimozchristophe detectingnonorthologyinthecogsdatabaseandotherapproachesgroupingorthologsusinggenomespecificbesthits
AT boeckmannbrigitte detectingnonorthologyinthecogsdatabaseandotherapproachesgroupingorthologsusinggenomespecificbesthits
AT rothalexandercj detectingnonorthologyinthecogsdatabaseandotherapproachesgroupingorthologsusinggenomespecificbesthits
AT gonnetgastonh detectingnonorthologyinthecogsdatabaseandotherapproachesgroupingorthologsusinggenomespecificbesthits