Alignathon: a competitive assessment of whole-genome alignment methods
Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general probl...
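Main authors: | Earl, Dent; Nguyen, Ngan; Hickey, Glenn; Harris, Robert S.; Fitzgerald, Stephen; Beal, Kathryn; Seledtsov, Igor; Molodtsov, Vladimir; Raney, Brian J.; Clawson, Hiram; Kim, Jaebum; Kemena, Carsten; Chang, Jia-Ming; Erb, Ionas; Poliakov, Alexander; Hou, Minmei; Herrero, Javier; Kent, William James; Solovyev, Victor; Darling, Aaron E.; Ma, Jian; Notredame, Cedric; Brudno, Michael; Dubchak, Inna; Haussler, David; Paten, Benedict |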
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory Press 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4248324/ https://www.ncbi.nlm.nih.gov/pubmed/25273068 http://dx.doi.org/10.1101/gr.174920.114 |
_version_ | 1782346779451195392 |
---|---|
author | Earl, Dent Nguyen, Ngan Hickey, Glenn Harris, Robert S. Fitzgerald, Stephen Beal, Kathryn Seledtsov, Igor Molodtsov, Vladimir Raney, Brian J. Clawson, Hiram Kim, Jaebum Kemena, Carsten Chang, Jia-Ming Erb, Ionas Poliakov, Alexander Hou, Minmei Herrero, Javier Kent, William James Solovyev, Victor Darling, Aaron E. Ma, Jian Notredame, Cedric Brudno, Michael Dubchak, Inna Haussler, David Paten, Benedict |
author_sort | Earl, Dent |
collection | PubMed |
description | Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments. |
format | Online Article Text |
id | pubmed-4248324 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Cold Spring Harbor Laboratory Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-4248324 2014-12-01 Alignathon: a competitive assessment of whole-genome alignment methods Genome Res Resource Cold Spring Harbor Laboratory Press 2014-12 /pmc/articles/PMC4248324/ /pubmed/25273068 http://dx.doi.org/10.1101/gr.174920.114 Text en © 2014 Earl et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0. |
title | Alignathon: a competitive assessment of whole-genome alignment methods |
topic | Resource |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4248324/ https://www.ncbi.nlm.nih.gov/pubmed/25273068 http://dx.doi.org/10.1101/gr.174920.114 |