NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information

The study of protein self-interactions (SIPs) can not only reveal the function of proteins at the molecular level, but is also crucial to understand activities such as growth, development, differentiation, and apoptosis, providing an important theoretical basis for exploring the mechanism of major d...

Bibliographic Details
Main authors:	Jia, Li-Na, Yan, Xin, You, Zhu-Hong, Zhou, Xi, Li, Li-Ping, Wang, Lei, Song, Ke-Jian
Format:	Online Article Text
Language:	English
Published:	SAGE Publications 2020
Subjects:	Original Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7768313/ https://www.ncbi.nlm.nih.gov/pubmed/33488064 http://dx.doi.org/10.1177/1176934320984171

_version_	1783629129754607616
author	Jia, Li-Na Yan, Xin You, Zhu-Hong Zhou, Xi Li, Li-Ping Wang, Lei Song, Ke-Jian
author_facet	Jia, Li-Na Yan, Xin You, Zhu-Hong Zhou, Xi Li, Li-Ping Wang, Lei Song, Ke-Jian
author_sort	Jia, Li-Na
collection	PubMed
description	The study of self-interacting proteins (SIPs) not only reveals protein function at the molecular level but is also crucial to understanding processes such as growth, development, differentiation, and apoptosis, and it provides an important theoretical basis for exploring the mechanisms of major diseases. With rapid advances in biotechnology, a large number of SIPs have been discovered. However, because biological experiments are slow and costly, the gap between the identification of SIPs and the accumulation of data is growing. Fast and accurate computational methods are therefore needed to predict SIPs effectively. In this study, we designed a new method, NLPEI, for predicting SIPs based on natural language understanding theory and evolutionary information. Specifically, we first treat the protein sequence as natural language and use natural language processing algorithms to extract its features. We then use the Position-Specific Scoring Matrix (PSSM) to represent the evolutionary information of the protein and extract its features with the Stacked Auto-Encoder (SAE), a deep learning algorithm. Finally, we fuse the natural language features of the proteins with the evolutionary features and make accurate predictions with an Extreme Learning Machine (ELM) classifier. On the human and yeast SIP gold-standard data sets, NLPEI achieved prediction accuracies of 94.19% and 91.29%, respectively. Compared with different classifier models, different feature models, and other existing methods, NLPEI obtained the best results. These experimental results indicate that NLPEI is an effective tool for predicting SIPs and can provide reliable candidates for biological experiments.
format	Online Article Text
id	pubmed-7768313
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	SAGE Publications
record_format	MEDLINE/PubMed
spelling	pubmed-7768313 2021-01-21 Evol Bioinform Online Original Research SAGE Publications 2020-12-26 /pmc/articles/PMC7768313/ /pubmed/33488064 http://dx.doi.org/10.1177/1176934320984171 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle	Original Research Jia, Li-Na Yan, Xin You, Zhu-Hong Zhou, Xi Li, Li-Ping Wang, Lei Song, Ke-Jian NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title	NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title_full	NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title_fullStr	NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title_full_unstemmed	NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title_short	NLPEI: A Novel Self-Interacting Protein Prediction Model Based on Natural Language Processing and Evolutionary Information
title_sort	nlpei: a novel self-interacting protein prediction model based on natural language processing and evolutionary information
topic	Original Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7768313/ https://www.ncbi.nlm.nih.gov/pubmed/33488064 http://dx.doi.org/10.1177/1176934320984171
work_keys_str_mv	AT jialina nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT yanxin nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT youzhuhong nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT zhouxi nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT liliping nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT wanglei nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation AT songkejian nlpeianovelselfinteractingproteinpredictionmodelbasedonnaturallanguageprocessingandevolutionaryinformation
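
The description field above outlines a pipeline in which a protein sequence is read as natural language, its features are extracted, and those features are fused with evolutionary features before classification. The record does not say which natural language processing algorithm or which fusion scheme the authors used, so the following is only a minimal sketch under stated assumptions: overlapping k-mers stand in for "words", a normalized bag-of-words vector stands in for the language features, and plain concatenation stands in for fusion. The names `kmer_words`, `bag_of_words`, and `fuse` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_words(sequence, k=3):
    """Split a protein sequence into overlapping k-residue 'words'.

    Assumption: k-mers are one common way to tokenize a sequence;
    the catalog record does not state the authors' tokenization.
    """
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def bag_of_words(sequence, k=2):
    """Return a normalized k-mer frequency vector of length 20**k."""
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(kmer_words(sequence, k))
    # Only k-mers made of standard residues are counted.
    total = sum(counts[w] for w in vocab) or 1
    return [counts[w] / total for w in vocab]


def fuse(nlp_features, evolutionary_features):
    """Fuse two feature vectors by concatenation (an assumed scheme)."""
    return list(nlp_features) + list(evolutionary_features)
```

In use, `bag_of_words` would be applied to each sequence, and its output passed to `fuse` together with whatever vector encodes the evolutionary information (in the abstract, features extracted from the PSSM by the SAE); the fused vector would then go to the classifier.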
