
Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction

Bibliographic Details
Main Authors: Liu, Chuan-Ming; Ta, Van-Dai; Le, Nguyen Quoc Khanh; Tadesse, Direselign Addis; Shi, Chongyang
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9410500/
https://www.ncbi.nlm.nih.gov/pubmed/36013392
http://dx.doi.org/10.3390/life12081213
_version_ 1784775108380852224
author Liu, Chuan-Ming
Ta, Van-Dai
Le, Nguyen Quoc Khanh
Tadesse, Direselign Addis
Shi, Chongyang
collection PubMed
description In recent years, much research has found that dysregulation of glutarylation is associated with many human diseases, such as diabetes, cancer, and glutaric aciduria type I. Identifying and characterizing glutarylation is therefore an essential task in modification-specific proteomics. This study proposes a novel deep neural network framework based on word embedding techniques for glutarylation site prediction. Multiple deep neural network models are implemented and their prediction performance is evaluated. Furthermore, an extensive experimental comparison of word embedding techniques is conducted to identify the most efficient method for representing protein sequence data. The results suggest that the proposed deep neural networks not only improve protein sequence representation but also predict glutarylation sites effectively, achieving higher accuracy and confidence than previous work. Moreover, embedding techniques proved more productive than pre-trained word embedding techniques for glutarylation sequence representation. Compared with the advanced integrated vector support, the proposed method performs significantly better on all traditional performance metrics, with accuracy, specificity, sensitivity, and correlation coefficient of 0.79, 0.89, 0.59, and 0.51, respectively. It shows the potential to detect new glutarylation sites and uncover the relationships between glutarylation and well-known lysine modifications.
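The four metrics reported in the description can be illustrated with a minimal sketch, assuming they follow the standard confusion-matrix definitions for binary classifiers and that "correlation coefficient" refers to the Matthews correlation coefficient (the record does not say so explicitly). The function name `binary_metrics` and the counts below are hypothetical and are not taken from the article.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive (e.g. glutarylated) sites predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    # Matthews correlation coefficient; defined as 0.0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas (not the paper's data).
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Read this way, a specificity well above the sensitivity, as in the reported 0.89 versus 0.59, would indicate that a model identifies negative sites more reliably than positive ones.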
format Online
Article
Text
id pubmed-9410500
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9410500 2022-08-26 Life (Basel) Article MDPI 2022-08-10 /pmc/articles/PMC9410500/ /pubmed/36013392 http://dx.doi.org/10.3390/life12081213 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Deep Neural Network Framework Based on Word Embedding for Protein Glutarylation Sites Prediction
topic Article