BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens
As one of the most important posttranslational modifications (PTMs), protein lysine glycation changes the characteristics of the proteins and leads to the dysfunction of the proteins, which may cause diseases. Accurately detecting the glycation sites is of great benefit for understanding the biologi...
Main Authors: | Liu, Yinbo; Liu, Yufeng; Wang, Gang-Ao; Cheng, Yinchu; Bi, Shoudong; Zhu, Xiaolei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Bioinformatics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580886/ https://www.ncbi.nlm.nih.gov/pubmed/36304324 http://dx.doi.org/10.3389/fbinf.2022.834153 |
_version_ | 1784812493289291776 |
---|---|
author | Liu, Yinbo Liu, Yufeng Wang, Gang-Ao Cheng, Yinchu Bi, Shoudong Zhu, Xiaolei |
author_sort | Liu, Yinbo |
collection | PubMed |
description | As one of the most important posttranslational modifications (PTMs), protein lysine glycation changes the characteristics of the proteins and leads to the dysfunction of the proteins, which may cause diseases. Accurately detecting the glycation sites is of great benefit for understanding the biological function and potential mechanism of glycation in the treatment of diseases. However, experimental methods are expensive and time-consuming for lysine glycation site identification. Instead, computational methods, with their higher efficiency and lower cost, could be an important supplement to the experimental methods. In this study, we proposed a novel predictor, BERT-Kgly, for protein lysine glycation site prediction, which was developed by extracting embedding features of protein segments from pretrained Bidirectional Encoder Representations from Transformers (BERT) models. Three pretrained BERT models were explored to get the embeddings with optimal representability, and three downstream deep networks were employed to build our models. Our results showed that the model based on embeddings extracted from the BERT model pretrained on 556,603 protein sequences of UniProt outperforms other models. In addition, an independent test set was used to evaluate and compare our model with other existing methods, which indicated that our model was superior to other existing models. |
format | Online Article Text |
id | pubmed-9580886 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9580886 2022-10-26 BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens Liu, Yinbo Liu, Yufeng Wang, Gang-Ao Cheng, Yinchu Bi, Shoudong Zhu, Xiaolei Front Bioinform Bioinformatics As one of the most important posttranslational modifications (PTMs), protein lysine glycation changes the characteristics of the proteins and leads to the dysfunction of the proteins, which may cause diseases. Accurately detecting the glycation sites is of great benefit for understanding the biological function and potential mechanism of glycation in the treatment of diseases. However, experimental methods are expensive and time-consuming for lysine glycation site identification. Instead, computational methods, with their higher efficiency and lower cost, could be an important supplement to the experimental methods. In this study, we proposed a novel predictor, BERT-Kgly, for protein lysine glycation site prediction, which was developed by extracting embedding features of protein segments from pretrained Bidirectional Encoder Representations from Transformers (BERT) models. Three pretrained BERT models were explored to get the embeddings with optimal representability, and three downstream deep networks were employed to build our models. Our results showed that the model based on embeddings extracted from the BERT model pretrained on 556,603 protein sequences of UniProt outperforms other models. In addition, an independent test set was used to evaluate and compare our model with other existing methods, which indicated that our model was superior to other existing models. Frontiers Media S.A. 2022-02-18 /pmc/articles/PMC9580886/ /pubmed/36304324 http://dx.doi.org/10.3389/fbinf.2022.834153 Text en Copyright © 2022 Liu, Liu, Wang, Cheng, Bi and Zhu.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | BERT-Kgly: A Bidirectional Encoder Representations From Transformers (BERT)-Based Model for Predicting Lysine Glycation Site for Homo sapiens |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9580886/ https://www.ncbi.nlm.nih.gov/pubmed/36304324 http://dx.doi.org/10.3389/fbinf.2022.834153 |