
Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features

The fast, reliable, and accurate identification of IDPRs is essential, as it has been increasingly recognized in recent years that IDPRs have a wide impact on many important physiological processes, such as molecular recognition and molecular assembly, the regulation of transcription and tran...


Bibliographic Details
Main Authors: Zhao, Jiaxiang; Wang, Zengke
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8950681/
https://www.ncbi.nlm.nih.gov/pubmed/35330096
http://dx.doi.org/10.3390/life12030345
_version_ 1784675200998047744
author Zhao, Jiaxiang
Wang, Zengke
collection PubMed
description The fast, reliable, and accurate identification of intrinsically disordered protein regions (IDPRs) is essential, as it has been increasingly recognized in recent years that IDPRs have a wide impact on many important physiological processes, such as molecular recognition and assembly, the regulation of transcription and translation, protein phosphorylation, and cellular signal transduction. For the sake of cost-effectiveness, it is imperative to develop computational approaches for identifying IDPRs. In this study, a deep neural structure in which a variant of VGG19 is situated between two MLP networks is developed for identifying IDPRs. Furthermore, three novel sequence features (persistent entropy and the probabilities associated with two and three consecutive amino acids of the protein sequence) are introduced for the first time for identifying IDPRs. The simulation results show that our neural structure either performs considerably better than other known methods or, when relying on a much smaller training set, attains similar performance. Our deep neural structure, which exploits the VGG19 structure, is effective for identifying IDPRs. Moreover, the three novel sequence features could serve as valuable inputs in the further development of methods for identifying IDPRs.
format Online
Article
Text
id pubmed-8950681
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8950681 2022-03-26 Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features Zhao, Jiaxiang Wang, Zengke Life (Basel) Article MDPI 2022-02-26 /pmc/articles/PMC8950681/ /pubmed/35330096 http://dx.doi.org/10.3390/life12030345 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Identifying Intrinsically Disordered Protein Regions through a Deep Neural Network with Three Novel Sequence Features
topic Article