QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families
The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are...
Main Authors: | Gudyś, Adam, Deorowicz, Sebastian |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5282490/ https://www.ncbi.nlm.nih.gov/pubmed/28139687 http://dx.doi.org/10.1038/srep41553 |
_version_ | 1782503331702243328 |
---|---|
author | Gudyś, Adam Deorowicz, Sebastian |
author_facet | Gudyś, Adam Deorowicz, Sebastian |
author_sort | Gudyś, Adam |
collection | PubMed |
description | The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models and equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has execution times similar to those of ClustalΩ and is orders of magnitude faster than full consistency approaches such as MSAProbs or PicXAA. All this makes QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins. |
format | Online Article Text |
id | pubmed-5282490 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-52824902017-02-03 QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families Gudyś, Adam Deorowicz, Sebastian Sci Rep Article The ever-increasing size of sequence databases, caused by the development of high-throughput sequencing, poses one of the greatest challenges yet to multiple alignment algorithms. As we show, well-established techniques employed for increasing alignment quality, i.e., refinement and consistency, are ineffective when large protein families are investigated. We present QuickProbs 2, an algorithm for multiple sequence alignment. Based on probabilistic models and equipped with novel column-oriented refinement and selective consistency, it offers outstanding accuracy. When analysing hundreds of sequences, QuickProbs 2 is noticeably better than ClustalΩ and MAFFT, the previous leaders for processing numerous protein families. In the case of smaller sets, for which consistency-based methods are the best performing, QuickProbs 2 is also superior to the competitors. Due to the low computational requirements of selective consistency and the utilization of massively parallel architectures, the presented algorithm has execution times similar to those of ClustalΩ and is orders of magnitude faster than full consistency approaches such as MSAProbs or PicXAA. All this makes QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins. Nature Publishing Group 2017-01-31 /pmc/articles/PMC5282490/ /pubmed/28139687 http://dx.doi.org/10.1038/srep41553 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License.
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Gudyś, Adam Deorowicz, Sebastian QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title | QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title_full | QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title_fullStr | QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title_full_unstemmed | QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title_short | QuickProbs 2: Towards rapid construction of high-quality alignments of large protein families |
title_sort | quickprobs 2: towards rapid construction of high-quality alignments of large protein families |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5282490/ https://www.ncbi.nlm.nih.gov/pubmed/28139687 http://dx.doi.org/10.1038/srep41553 |
work_keys_str_mv | AT gudysadam quickprobs2towardsrapidconstructionofhighqualityalignmentsoflargeproteinfamilies AT deorowiczsebastian quickprobs2towardsrapidconstructionofhighqualityalignmentsoflargeproteinfamilies |