QuartetS: a fast and accurate algorithm for large-scale orthology detection
The unparalleled growth in the availability of genomic data offers both a challenge to develop orthology detection methods that are simultaneously accurate and high throughput and an opportunity to improve orthology detection by leveraging evolutionary evidence in the accumulated sequenced genomes....
Main authors: | Yu, Chenggang; Zavaljevski, Nela; Desai, Valmik; Reifman, Jaques |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2011 |
Subjects: | Methods Online |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3141274/ https://www.ncbi.nlm.nih.gov/pubmed/21572104 http://dx.doi.org/10.1093/nar/gkr308 |
_version_ | 1782208652050956288 |
---|---|
author | Yu, Chenggang Zavaljevski, Nela Desai, Valmik Reifman, Jaques |
author_facet | Yu, Chenggang Zavaljevski, Nela Desai, Valmik Reifman, Jaques |
author_sort | Yu, Chenggang |
collection | PubMed |
description | The unparalleled growth in the availability of genomic data offers both a challenge to develop orthology detection methods that are simultaneously accurate and high throughput and an opportunity to improve orthology detection by leveraging evolutionary evidence in the accumulated sequenced genomes. Here, we report a novel orthology detection method, termed QuartetS, that exploits evolutionary evidence in a computationally efficient manner. Based on the well-established evolutionary concept that gene duplication events can be used to discriminate homologous genes, QuartetS uses an approximate phylogenetic analysis of quartet gene trees to infer the occurrence of duplication events and discriminate paralogous from orthologous genes. We used function- and phylogeny-based metrics to perform a large-scale, systematic comparison of the orthology predictions of QuartetS with those of four other methods [bi-directional best hit (BBH), outgroup, OMA and QuartetS-C (QuartetS followed by clustering)], involving 624 bacterial genomes and >2 million genes. We found that QuartetS slightly, but consistently, outperformed the highly specific OMA method and that, while consuming only 0.5% additional computational time, QuartetS predicted 50% more orthologs with a 50% lower false positive rate than the widely used BBH method. We conclude that, for large-scale phylogenetic and functional analysis, QuartetS and QuartetS-C should be preferred, respectively, in applications where high accuracy and high throughput are required. |
format | Online Article Text |
id | pubmed-3141274 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3141274 2011-07-22 QuartetS: a fast and accurate algorithm for large-scale orthology detection Yu, Chenggang Zavaljevski, Nela Desai, Valmik Reifman, Jaques Nucleic Acids Res Methods Online Oxford University Press 2011-07 2011-05-13 /pmc/articles/PMC3141274/ /pubmed/21572104 http://dx.doi.org/10.1093/nar/gkr308 Text en © The Author(s) 2011. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Online Yu, Chenggang Zavaljevski, Nela Desai, Valmik Reifman, Jaques QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title | QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title_full | QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title_fullStr | QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title_full_unstemmed | QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title_short | QuartetS: a fast and accurate algorithm for large-scale orthology detection |
title_sort | quartets: a fast and accurate algorithm for large-scale orthology detection |
topic | Methods Online |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3141274/ https://www.ncbi.nlm.nih.gov/pubmed/21572104 http://dx.doi.org/10.1093/nar/gkr308 |
work_keys_str_mv | AT yuchenggang quartetsafastandaccuratealgorithmforlargescaleorthologydetection AT zavaljevskinela quartetsafastandaccuratealgorithmforlargescaleorthologydetection AT desaivalmik quartetsafastandaccuratealgorithmforlargescaleorthologydetection AT reifmanjaques quartetsafastandaccuratealgorithmforlargescaleorthologydetection |