
Multiple sequence alignment based on deep reinforcement learning with self-attention and positional encoding

MOTIVATION: Multiple sequence alignment (MSA) is an active research topic and is widely used in sequence analysis. However, no definitive solution for MSA exists because it is an NP-complete problem, and the existing methods still have room to impr...


Bibliographic Details
Main authors:	Liu, Yuhang; Yuan, Hao; Zhang, Qiang; Wang, Zixuan; Xiong, Shuwen; Wen, Naifeng; Zhang, Yongqing
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2023
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628385/ https://www.ncbi.nlm.nih.gov/pubmed/37856335 http://dx.doi.org/10.1093/bioinformatics/btad636

author	Liu, Yuhang; Yuan, Hao; Zhang, Qiang; Wang, Zixuan; Xiong, Shuwen; Wen, Naifeng; Zhang, Yongqing
author_sort	Liu, Yuhang
collection	PubMed
description	MOTIVATION: Multiple sequence alignment (MSA) is an active research topic and is widely used in sequence analysis. However, no definitive solution for MSA exists because it is an NP-complete problem, and the existing methods still have room to improve in accuracy. RESULTS: We propose Deep reinforcement learning with Positional encoding and self-Attention for MSA, a method based on deep reinforcement learning, to enhance the accuracy of the alignment. Specifically, inspired by translation techniques in natural language processing, we introduce self-attention and positional encoding to improve accuracy and reliability. First, positional encoding encodes each position in the sequence to prevent the loss of nucleotide position information. Second, a self-attention model extracts the key features of the sequence. These features are then fed into a multi-layer perceptron, which calculates the insertion position of the gap from them. In addition, a novel reinforcement learning environment is designed to convert the classic progressive alignment into progressive column alignment, gradually generating each column's sub-alignment. Finally, the sub-alignments are merged into the complete alignment. Extensive experiments on several datasets validate our method's effectiveness for MSA: it outperforms some state-of-the-art methods in terms of the Sum-of-pairs and Column scores. AVAILABILITY AND IMPLEMENTATION: The method is implemented in Python and available as open-source software from https://github.com/ZhangLab312/DPAMSA.
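
The abstract evaluates alignments by their Sum-of-pairs and Column scores. As an illustration only, the sketch below computes both for a small hand-written alignment. It is not taken from the DPAMSA repository: the function names, the scoring values (`match=1`, `mismatch=-1`, `gap=-1`), the choice to ignore gap-gap pairs, and the reference-free definition of a matched column (all residues identical, no gaps) are assumptions made for this example, and the paper may use different conventions.

```python
from itertools import combinations

GAP = "-"


def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-1):
    """Sum pairwise scores over every column of the alignment.

    The scoring values are hypothetical defaults, not the paper's.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # assumption: gap-gap pairs contribute nothing
            if a == GAP or b == GAP:
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def column_score(alignment):
    """Fraction of columns that are gap-free with identical residues.

    This reference-free definition is an assumption for illustration.
    """
    columns = list(zip(*alignment))
    if not columns:
        return 0.0
    matched = sum(
        1 for col in columns if GAP not in col and len(set(col)) == 1
    )
    return matched / len(columns)


# Toy alignment of three nucleotide sequences (made up for this example).
alignment = ["ACG-T", "ACGAT", "A-GAT"]
sp = sum_of_pairs(alignment)
cs = column_score(alignment)
print(f"SP = {sp}, CS = {cs:.2f}")
```

With these assumed parameters, three of the five columns are fully matched and the two gapped columns each contribute one match and two gap penalties, so the toy alignment scores SP = 7 and CS = 0.60.
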
format	Online Article Text
id	pubmed-10628385
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-10628385 2023-11-08 Bioinformatics Original Paper Oxford University Press 2023-10-19 /pmc/articles/PMC10628385/ /pubmed/37856335 http://dx.doi.org/10.1093/bioinformatics/btad636 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Multiple sequence alignment based on deep reinforcement learning with self-attention and positional encoding
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10628385/ https://www.ncbi.nlm.nih.gov/pubmed/37856335 http://dx.doi.org/10.1093/bioinformatics/btad636
