Alignathon: a competitive assessment of whole-genome alignment methods

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general probl...

Bibliographic Details
Main Authors: Earl, Dent, Nguyen, Ngan, Hickey, Glenn, Harris, Robert S., Fitzgerald, Stephen, Beal, Kathryn, Seledtsov, Igor, Molodtsov, Vladimir, Raney, Brian J., Clawson, Hiram, Kim, Jaebum, Kemena, Carsten, Chang, Jia-Ming, Erb, Ionas, Poliakov, Alexander, Hou, Minmei, Herrero, Javier, Kent, William James, Solovyev, Victor, Darling, Aaron E., Ma, Jian, Notredame, Cedric, Brudno, Michael, Dubchak, Inna, Haussler, David, Paten, Benedict
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory Press 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4248324/
https://www.ncbi.nlm.nih.gov/pubmed/25273068
http://dx.doi.org/10.1101/gr.174920.114
author Earl, Dent
Nguyen, Ngan
Hickey, Glenn
Harris, Robert S.
Fitzgerald, Stephen
Beal, Kathryn
Seledtsov, Igor
Molodtsov, Vladimir
Raney, Brian J.
Clawson, Hiram
Kim, Jaebum
Kemena, Carsten
Chang, Jia-Ming
Erb, Ionas
Poliakov, Alexander
Hou, Minmei
Herrero, Javier
Kent, William James
Solovyev, Victor
Darling, Aaron E.
Ma, Jian
Notredame, Cedric
Brudno, Michael
Dubchak, Inna
Haussler, David
Paten, Benedict
collection PubMed
description Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.
format Online
Article
Text
id pubmed-4248324
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Cold Spring Harbor Laboratory Press
record_format MEDLINE/PubMed
spelling pubmed-4248324 2014-12-01 Alignathon: a competitive assessment of whole-genome alignment methods Genome Res Resource Cold Spring Harbor Laboratory Press 2014-12 /pmc/articles/PMC4248324/ /pubmed/25273068 http://dx.doi.org/10.1101/gr.174920.114 Text en © 2014 Earl et al.; Published by Cold Spring Harbor Laboratory Press http://creativecommons.org/licenses/by/4.0/ This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0.
title Alignathon: a competitive assessment of whole-genome alignment methods
topic Resource
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4248324/
https://www.ncbi.nlm.nih.gov/pubmed/25273068
http://dx.doi.org/10.1101/gr.174920.114