Sequence alignment by passing messages
BACKGROUND: Sequence alignment has become an indispensable tool in modern molecular biology research, and probabilistic sequence alignment models have been shown to provide an effective framework for building accurate sequence alignment tools. One such example is the pair hidden Markov model (pair-H...
Main author: | Yoon, Byung-Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | Proceedings |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4046711/ https://www.ncbi.nlm.nih.gov/pubmed/24564436 http://dx.doi.org/10.1186/1471-2164-15-S1-S14 |
_version_ | 1782480302280540160 |
---|---|
author | Yoon, Byung-Jun |
author_facet | Yoon, Byung-Jun |
author_sort | Yoon, Byung-Jun |
collection | PubMed |
description | BACKGROUND: Sequence alignment has become an indispensable tool in modern molecular biology research, and probabilistic sequence alignment models have been shown to provide an effective framework for building accurate sequence alignment tools. One such example is the pair hidden Markov model (pair-HMM), which has been especially popular in comparative sequence analysis for several reasons, including its effectiveness in modeling and detecting sequence homology, model simplicity, and the existence of efficient algorithms for applying the model to sequence alignment problems. However, despite these advantages, pair-HMMs also have a number of practical limitations that may degrade their alignment performance or render them unsuitable for certain alignment tasks. RESULTS: In this work, we propose a novel scheme for comparing and aligning biological sequences that can effectively address the shortcomings of traditional pair-HMMs. The proposed scheme is based on a simple message-passing approach, where messages are exchanged between neighboring symbol pairs that may potentially be aligned in the optimal sequence alignment. The message-passing process yields probabilistic symbol alignment confidence scores, which may be used for predicting the optimal alignment that maximizes the expected number of correctly aligned symbol pairs. CONCLUSIONS: Extensive performance evaluation on protein alignment benchmark datasets shows that the proposed message-passing scheme clearly outperforms the traditional pair-HMM-based approach, in terms of both alignment accuracy and computational efficiency. Furthermore, the proposed scheme is numerically robust and amenable to massive parallelization. |
format | Online Article Text |
id | pubmed-4046711 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-40467112014-06-06 Sequence alignment by passing messages Yoon, Byung-Jun BMC Genomics Proceedings BACKGROUND: Sequence alignment has become an indispensable tool in modern molecular biology research, and probabilistic sequence alignment models have been shown to provide an effective framework for building accurate sequence alignment tools. One such example is the pair hidden Markov model (pair-HMM), which has been especially popular in comparative sequence analysis for several reasons, including its effectiveness in modeling and detecting sequence homology, model simplicity, and the existence of efficient algorithms for applying the model to sequence alignment problems. However, despite these advantages, pair-HMMs also have a number of practical limitations that may degrade their alignment performance or render them unsuitable for certain alignment tasks. RESULTS: In this work, we propose a novel scheme for comparing and aligning biological sequences that can effectively address the shortcomings of traditional pair-HMMs. The proposed scheme is based on a simple message-passing approach, where messages are exchanged between neighboring symbol pairs that may potentially be aligned in the optimal sequence alignment. The message-passing process yields probabilistic symbol alignment confidence scores, which may be used for predicting the optimal alignment that maximizes the expected number of correctly aligned symbol pairs. CONCLUSIONS: Extensive performance evaluation on protein alignment benchmark datasets shows that the proposed message-passing scheme clearly outperforms the traditional pair-HMM-based approach, in terms of both alignment accuracy and computational efficiency. Furthermore, the proposed scheme is numerically robust and amenable to massive parallelization. BioMed Central 2014-01-24 /pmc/articles/PMC4046711/ /pubmed/24564436 http://dx.doi.org/10.1186/1471-2164-15-S1-S14 Text en © Yoon; licensee BioMed Central Ltd. 2014 This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Yoon, Byung-Jun Sequence alignment by passing messages |
title | Sequence alignment by passing messages |
title_full | Sequence alignment by passing messages |
title_fullStr | Sequence alignment by passing messages |
title_full_unstemmed | Sequence alignment by passing messages |
title_short | Sequence alignment by passing messages |
title_sort | sequence alignment by passing messages |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4046711/ https://www.ncbi.nlm.nih.gov/pubmed/24564436 http://dx.doi.org/10.1186/1471-2164-15-S1-S14 |
work_keys_str_mv | AT yoonbyungjun sequencealignmentbypassingmessages |
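
The description in this record says that the symbol alignment confidence scores may be used to predict the alignment that maximizes the expected number of correctly aligned symbol pairs. The record does not give the message-passing update rules, so the sketch below covers only that final step, and only under stated assumptions: the scores are taken as an already computed matrix `conf`, where `conf[i][j]` is the confidence that symbol `i` of the first sequence aligns with symbol `j` of the second. The function name `mea_alignment` and the matrix layout are hypothetical and chosen for illustration. The dynamic program is the standard one for this kind of objective and is not necessarily the author's exact procedure.

```python
from typing import List, Tuple


def mea_alignment(conf: List[List[float]]) -> Tuple[float, List[Tuple[int, int]]]:
    """Pick order-preserving aligned pairs that maximize the sum of conf[i][j].

    conf is an assumed, precomputed confidence matrix (hypothetical input);
    how it is produced by message passing is outside this sketch.
    """
    n = len(conf)
    m = len(conf[0]) if n else 0

    # best[i][j] = best achievable total for the first i symbols of sequence 1
    # and the first j symbols of sequence 2.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + conf[i - 1][j - 1],  # align the two symbols
                best[i - 1][j],  # leave symbol i-1 of sequence 1 unaligned
                best[i][j - 1],  # leave symbol j-1 of sequence 2 unaligned
            )

    # Trace back from the bottom-right corner to recover the chosen pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + conf[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


if __name__ == "__main__":
    # Toy scores for a 2-symbol and a 3-symbol sequence (made-up values).
    toy = [[0.9, 0.1, 0.0],
           [0.2, 0.1, 0.8]]
    total, aligned = mea_alignment(toy)
    print(total, aligned)
```

The table fill takes time proportional to the product of the two sequence lengths. The pairs are returned as zero-based index tuples, and any symbol that appears in no pair is treated as aligned to a gap.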