QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families

The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are...

Bibliographic Details
Main authors: Gudyś, Adam, Deorowicz, Sebastian
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5282490/
https://www.ncbi.nlm.nih.gov/pubmed/28139687
http://dx.doi.org/10.1038/srep41553
_version_ 1782503331702243328
author Gudyś, Adam
Deorowicz, Sebastian
author_facet Gudyś, Adam
Deorowicz, Sebastian
author_sort Gudyś, Adam
collection PubMed
description The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All this makes QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins.
format Online
Article
Text
id pubmed-5282490
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-52824902017-02-03 QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families Gudyś, Adam Deorowicz, Sebastian Sci Rep Article The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models, equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has similar execution times to ClustalΩ, and is orders of magnitude faster than full consistency approaches, like MSAProbs or PicXAA. All this makes QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins. Nature Publishing Group 2017-01-31 /pmc/articles/PMC5282490/ /pubmed/28139687 http://dx.doi.org/10.1038/srep41553 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
spellingShingle Article
Gudyś, Adam
Deorowicz, Sebastian
QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title_full QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title_fullStr QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title_full_unstemmed QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title_short QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
title_sort quickprobs 2: towards rapid construction of high-quality alignments of large protein families
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5282490/
https://www.ncbi.nlm.nih.gov/pubmed/28139687
http://dx.doi.org/10.1038/srep41553
work_keys_str_mv AT gudysadam quickprobs2towardsrapidconstructionofhighqualityalignmentsoflargeproteinfamilies
AT deorowiczsebastian quickprobs2towardsrapidconstructionofhighqualityalignmentsoflargeproteinfamilies